MyArxiv
Robotics
Cross-Category Functional Grasp Transfer
Generating grasps for a dexterous hand often requires numerous grasping annotations. However, annotating high-DoF dexterous hand poses is quite challenging, especially for functional grasps, which require the hand to grasp the object in a specific pose to facilitate subsequent manipulations. This prompts us to explore how people achieve manipulations on new objects based on past grasp experiences. We find that when grasping new items, people are adept at discovering and leveraging various similarities between objects, including shape, layout, and grasp type. Considering this, we analyze and collect grasp-related similarity relationships among 51 common tool-like object categories and annotate semantic grasp representations for 1768 objects. These objects are connected through similarities to form a knowledge graph, which aids inference for our proposed cross-category functional grasp synthesis. Through extensive experiments, we demonstrate that the grasp-related knowledge indeed contributes to achieving functional grasp transfer across unknown or entirely new categories of objects.
Guiding Reinforcement Learning with Incomplete System Dynamics IROS 2024
Model-free reinforcement learning (RL) is inherently a reactive method, operating under the assumption that it starts with no prior knowledge of the system and entirely depends on trial-and-error for learning. This approach faces several challenges, such as poor sample efficiency, generalization, and the need for well-designed reward functions to guide learning effectively. On the other hand, controllers based on complete system dynamics do not require data. This paper addresses the intermediate situation where there is not enough model information for complete controller design, but there is enough to suggest that a model-free approach is not the best approach either. By carefully decoupling known and unknown information about the system dynamics, we obtain an embedded controller guided by our partial model and thus improve the learning efficiency of an RL-enhanced approach. A modular design allows us to deploy mainstream RL algorithms to refine the policy. Simulation results show that our method significantly improves sample efficiency compared with standard RL methods on continuous control tasks, and also offers enhanced performance over traditional control approaches. Experiments on a real ground vehicle also validate the performance of our method, including generalization and robustness.
comment: Accepted to IROS 2024
Systems and Control (CS)
Effective Finite Time Stability Control for Human-Machine Shared Vehicle Following System
With the development of intelligent connected vehicle technology, human-machine shared control has gained popularity in vehicle following due to its effectiveness in driver assistance. However, traditional vehicle following systems struggle to maintain stability when driver reaction time fluctuates, as these variations require different levels of system intervention. To address this issue, the proposed human-machine shared vehicle following assistance system (HM-VFAS) integrates driver outputs under various states with the assistance system. The system employs an intelligent driver model that accounts for reaction time delays, simulating time-varying driver outputs. A control authority allocation strategy is designed to dynamically adjust the level of intervention based on real-time driver state assessment. To handle instability from driver authority switching, the proposed solution includes a two-layer adaptive finite time sliding mode controller (A-FTSMC). The first layer is an integral sliding mode adaptive controller that ensures robustness by compensating for uncertainties in the driver output. The second layer is a fast non-singular terminal sliding mode controller designed to accelerate convergence for rapid stabilization. Using real driver videos as inputs, the performance of the HM-VFAS was evaluated. Results show that the proposed control strategy maintains a safe distance under time-varying driver states, with the actual acceleration error relative to the target acceleration maintained within 0.5 m/s^2 and the maximum acceleration error reduced by 1.2 m/s^2. Compared to traditional controllers, the A-FTSMC controller offers faster convergence and less vibration, reducing the stabilization time by 27.3%.
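The finite-time property that terminal sliding mode designs rely on can be illustrated with a generic reaching law, ds/dt = -k |s|^alpha sign(s) with 0 < alpha < 1, whose reaching time has the closed form |s0|^(1-alpha) / (k (1-alpha)). The sketch below is only an illustration of that textbook law, not the A-FTSMC controller from the abstract; the function name `reaching_time` and all parameter values are assumptions chosen for the example.

```python
def reaching_time(s0, k, alpha, dt=1e-4, tol=1e-8, t_max=50.0):
    """Euler-integrate ds/dt = -k * |s|**alpha * sign(s).

    Returns the simulated time at which |s| first drops to tol or below
    (or t_max if that never happens).
    """
    s, t = float(s0), 0.0
    while abs(s) > tol and t < t_max:
        step = dt * k * abs(s) ** alpha
        # Clamp at zero so the explicit Euler step cannot overshoot the surface.
        s = max(s - step, 0.0) if s > 0 else min(s + step, 0.0)
        t += dt
    return t


# Illustrative values only (assumptions, not taken from the paper).
s0, k, alpha = 1.0, 2.0, 0.5

# Closed-form finite reaching time for 0 < alpha < 1.
t_analytic = abs(s0) ** (1 - alpha) / (k * (1 - alpha))

# Simulated reaching time for the terminal-type law.
t_simulated = reaching_time(s0, k, alpha)

# Same gain with alpha = 1 gives the linear law ds/dt = -k*s, which only
# decays exponentially and needs much longer to get within the tolerance.
t_linear = reaching_time(s0, k, 1.0)

print(t_analytic, t_simulated, t_linear)
```

With the fractional exponent the simulated time should sit close to the closed-form value, while the linear law with the same gain takes several times longer to come within the same tolerance.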
Exploiting Data Centres and Local Energy Communities Synergies for Market Participation
The evolving energy landscape has propelled energy communities to the forefront of modern energy management. However, existing research has yet to explore the potential synergies between data centres and energy communities, necessitating an assessment on their collective capabilities for cost efficiency, waste heat optimisation, and market participation. This paper presents a mixed integer linear programming model to assess the collaborative performance of energy communities, data centres and energy markets. The evaluation focuses on the efficient use of waste heat and the flexibility of job scheduling while minimising system energy costs and maintaining quality of service requirements for data centres. Our results, based on realistic profiles of an energy community and a data centre, showcase significant benefits of these synergies, with a 38% reduction in operating costs and an 87% decrease in heat demand.
comment: Accepted at IEEE PES ISGT Europe 2024
Robotics
DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes
LiDAR scene generation has advanced rapidly in recent years. However, existing methods primarily focus on generating static and single-frame scenes, overlooking the inherently dynamic nature of real-world driving environments. In this work, we introduce DynamicCity, a novel 4D LiDAR generation framework capable of generating large-scale, high-quality LiDAR scenes that capture the temporal evolution of dynamic environments. DynamicCity mainly consists of two key models. 1) A VAE model for learning HexPlane as the compact 4D representation. Instead of using naive averaging operations, DynamicCity employs a novel Projection Module to effectively compress 4D LiDAR features into six 2D feature maps for HexPlane construction, which significantly enhances HexPlane fitting quality (up to 12.56 mIoU gain). Furthermore, we utilize an Expansion & Squeeze Strategy to reconstruct 3D feature volumes in parallel, which improves both network training efficiency and reconstruction accuracy compared with naively querying each 3D point (up to 7.05 mIoU gain, 2.06x training speedup, and 70.84% memory reduction). 2) A DiT-based diffusion model for HexPlane generation. To make HexPlane feasible for DiT generation, a Padded Rollout Operation is proposed to reorganize all six feature planes of the HexPlane as a square 2D feature map. In particular, various conditions can be introduced in the diffusion or sampling process, supporting versatile 4D generation applications, such as trajectory- and command-driven generation, inpainting, and layout-conditioned generation. Extensive experiments on the CarlaSC and Waymo datasets demonstrate that DynamicCity significantly outperforms existing state-of-the-art 4D LiDAR generation methods across multiple metrics. The code will be released to facilitate future research.
comment: Preprint; 29 pages, 15 figures, 7 tables; Project Page at https://dynamic-city.github.io/
SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation
Robot learning has proven to be a general and effective technique for programming manipulators. Imitation learning is able to teach robots solely from human demonstrations but is bottlenecked by the capabilities of the demonstrations. Reinforcement learning uses exploration to discover better behaviors; however, the space of possible improvements can be too large to start from scratch. For both techniques, the learning difficulty increases in proportion to the length of the manipulation task. Accounting for this, we propose SPIRE, a system that first uses Task and Motion Planning (TAMP) to decompose tasks into smaller learning subproblems and second combines imitation and reinforcement learning to maximize their strengths. We develop novel strategies to train learning agents when deployed in the context of a planning system. We evaluate SPIRE on a suite of long-horizon and contact-rich robot manipulation problems. We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance, is 6 times more data efficient in the number of human demonstrations needed to train proficient agents, and learns to complete tasks nearly twice as efficiently. See https://sites.google.com/view/spire-corl-2024 for more details.
comment: Conference on Robot Learning (CoRL) 2024
A Pipeline for Segmenting and Structuring RGB-D Data for Robotics Applications
We introduce a novel pipeline for segmenting and structuring color and depth (RGB-D) data. Existing processing pipelines for RGB-D data have focused on extracting geometric information alone. This approach precludes the development of more advanced robotic navigation and manipulation algorithms, which benefit from a semantic understanding of their environment. Our pipeline can segment RGB-D data into accurate semantic masks. These masks are then used to fuse raw captured point clouds into semantically separated point clouds. We store this information using the Universal Scene Description (USD) file format, a format suitable for easy querying by downstream robotics algorithms, human-friendly visualization, and robotics simulation.
Robust Two-View Geometry Estimation with Implicit Differentiation IROS 2024
We present a novel two-view geometry estimation framework based on differentiable robust loss function fitting. We propose to treat robust fundamental matrix estimation as an implicit layer, which allows us to avoid backpropagation through time and significantly improves numerical stability. To take full advantage of the information from the feature matching stage, we incorporate learnable weights that depend on the matching confidences. In this way, our solution brings together feature extraction, matching and two-view geometry estimation in a unified end-to-end trainable pipeline. We evaluate our approach on the camera pose estimation task in both outdoor and indoor scenarios. The experiments on several datasets show that the proposed method outperforms both classic and learning-based state-of-the-art methods by a large margin. The project webpage is available at: https://github.com/VladPyatov/ihls
comment: IROS 2024 Accepted
Reconfigurable Hydrostatics: Toward Multifunctional and Powerful Wearable Robotics
Wearable and locomotive robot designers face multiple challenges when choosing actuation. Traditional fully actuated designs using electric motors are multifunctional but oversized and inefficient for bearing conservative loads and for being backdrivable. Alternatively, quasi-passive and underactuated designs reduce the size of motorization and energy storage, but are often designed for specific tasks. Designers of versatile and stronger wearable robots will face these challenges unless future actuators become very torque-dense, backdrivable and efficient. This paper explores a design paradigm for addressing this issue: reconfigurable hydrostatics. We show that a hydrostatic actuator can integrate a passive force mechanism and a sharing mechanism in the fluid domain and still be multifunctional. First, an analytical study compares how these two mechanisms can relax the motorization requirements in the context of a load-bearing exoskeleton. Then, the hydrostatic concept integrating these two mechanisms using hydraulic components is presented. A case study analysis shows the mass/efficiency/inertia benefits of the concept over a fully actuated one. Then, the feasibility of the concept is partially validated with a proof-of-concept that actuates the knees of an exoskeleton. The experiments show that it can track the vertical ground reaction force (GRF) profiles of walking, running, squatting, and jumping, and that the energy consumption is 6x lower. The transient force behaviors due to switching from one leg to the other are also analyzed along with some mitigation to improve them.
Gaussian Process Distance Fields Obstacle and Ground Constraints for Safe Navigation
Navigating cluttered environments is a challenging task for any mobile system. Existing approaches for ground-based mobile systems primarily focus on small wheeled robots, which face minimal constraints with overhanging obstacles and cannot manage steps or stairs, making the problem effectively 2D. However, navigation for legged robots (or even humans) has to consider an extra dimension. This paper proposes a tailored scene representation coupled with an advanced trajectory optimisation algorithm to enable safe navigation. Our 3D navigation approach is suitable for any ground-based mobile robot, whether wheeled or legged, as well as for human assistance. Given a 3D point cloud of the scene and the segmentation of the ground and non-ground points, we formulate two Gaussian Process distance fields to ensure a collision-free path and maintain distance to the ground constraints. Our method adeptly handles uneven terrain, steps, and overhanging objects through an innovative use of a quadtree structure, constructing a multi-resolution map of the free space and its connectivity graph based on a 2D projection of the relevant scene. Evaluations with both synthetic and real-world datasets demonstrate that this approach provides safe and smooth paths, accommodating a wide range of ground-based mobile systems.
Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models
A central challenge towards developing robots that can relate human language to their perception and actions is the scarcity of natural language annotations in diverse robot datasets. Moreover, robot policies that follow natural language instructions are typically trained on either templated language or expensive human-labeled instructions, hindering their scalability. To this end, we introduce NILS: Natural language Instruction Labeling for Scalability. NILS automatically labels uncurated, long-horizon robot data at scale in a zero-shot manner without any human intervention. NILS combines pretrained vision-language foundation models in order to detect objects in a scene, detect object-centric changes, segment tasks from large datasets of unlabelled interaction data and ultimately label behavior datasets. Evaluations on BridgeV2, Fractal, and a kitchen play dataset show that NILS can autonomously annotate diverse robot demonstrations of unlabeled and unstructured datasets while alleviating several shortcomings of crowdsourced human annotations, such as low data quality and diversity. We use NILS to label over 115k trajectories obtained from over 430 hours of robot data. We open-source our auto-labeling code and generated annotations on our website: http://robottasklabeling.github.io.
comment: Project Website at https://robottasklabeling.github.io/
Multi-Layered Safety of Redundant Robot Manipulators via Task-Oriented Planning and Control
Ensuring safety is crucial to promote the application of robot manipulators in open workspaces. Factors such as sensor errors or unpredictable collisions make the environment full of uncertainties. In this work, we investigate these potential safety challenges on redundant robot manipulators, and propose a task-oriented planning and control framework to achieve multi-layered safety while maintaining efficient task execution. Our approach consists of two main parts: a task-oriented trajectory planner based on a multiple-shooting model predictive control method, and a torque controller that allows safe and efficient collision reaction using only proprioceptive data. Through extensive simulations and real-hardware experiments, we demonstrate that the proposed framework can effectively handle uncertain static or dynamic obstacles, and resist disturbances in manipulation tasks when unforeseen contacts occur. All code will be open-sourced to benefit the community.
comment: 7 pages, 8 figures. This work has been submitted to the IEEE for possible publication
Towards Safer Planetary Exploration: A Hybrid Architecture for Terrain Traversability Analysis in Mars Rovers
The field of autonomous navigation for unmanned ground vehicles (UGVs) is in continuous growth and increasing levels of autonomy have been reached in the last few years. However, the task becomes more challenging when the focus is on the exploration of planet surfaces such as Mars. In those situations, UGVs are forced to navigate through unstable and rugged terrains which, inevitably, expose the vehicle to more hazards, accidents, and, in extreme cases, complete mission failure. The paper addresses these challenges by introducing a hybrid architecture for terrain traversability analysis that combines two approaches: appearance-based and geometry-based. The appearance-based method uses semantic segmentation via deep neural networks to classify different terrain types. This is further refined by pixel-level terrain roughness classification obtained from the same RGB image, assigning different costs based on the physical properties of the soil. The geometry-based method complements the appearance-based approach by evaluating the terrain's geometrical features, identifying hazards that may not be detectable by the appearance-based side. The outputs of both methods are combined into a comprehensive hybrid cost map. The proposed architecture was trained on synthetic datasets and developed as a ROS2 application to integrate into broader autonomous navigation systems for harsh environments. Simulations have been performed in Unity, showing the ability of the method to perform traversability analysis online.
Markov Potential Game with Final-time Reach-Avoid Objectives
We formulate a Markov potential game with final-time reach-avoid objectives by integrating potential game theory with stochastic reach-avoid control. Our focus is on multi-player trajectory planning where players maximize the same multi-player reach-avoid objective: the probability of all participants reaching their designated target states by a specified time, while avoiding collisions with one another. Existing approaches require centralized computation of actions via a global policy, which may have prohibitively expensive communication costs. Instead, we focus on approximations of the global policy via local state feedback policies. First, we adapt the recursive single player reach-avoid value iteration to the multi-player framework with local policies, and show that the same recursion holds on the joint state space. To find each player's optimal local policy, the multi-player reach-avoid value function is projected from the joint state to the local state using the other players' occupancy measures. Then, we propose an iterative best response scheme for the multi-player value iteration to converge to a pure Nash equilibrium. We demonstrate the utility of our approach in finding collision-free policies for multi-player motion planning in simulation.
comment: 8 pages, 2 figures
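For readers unfamiliar with the single-player recursion that the abstract above builds on, here is a minimal sketch of final-time reach-avoid value iteration on a toy Markov chain: V_T(x) = 1 if x is in the target set, and V_t(x) = 1_safe(x) * max_u sum_y P(y | x, u) V_{t+1}(y). The five-state chain, the slip probability, and the names `reach_avoid_values` and `chain_transition` are assumptions made for illustration; the multi-player projection and best-response scheme of the paper are not reproduced here.

```python
def chain_transition(x, u, n_states=5, slip=0.2):
    """Toy dynamics: move by u with prob. 1 - slip, otherwise stay in place."""
    intended = min(max(x + u, 0), n_states - 1)
    probs = {}
    probs[intended] = probs.get(intended, 0.0) + (1.0 - slip)
    probs[x] = probs.get(x, 0.0) + slip
    return probs


def reach_avoid_values(n_states, actions, transition, safe, target, horizon):
    """Backward recursion for the final-time reach-avoid probability.

    Returns the value function at time 0 and a time-indexed greedy policy.
    """
    # Terminal condition: success only if the state is in the target at time T.
    value = [1.0 if x in target else 0.0 for x in range(n_states)]
    policy = []
    for _ in range(horizon):
        new_value, step_policy = [], []
        for x in range(n_states):
            if x not in safe:
                # Leaving the safe set at any time means failure.
                new_value.append(0.0)
                step_policy.append(None)
                continue
            best_u, best_q = None, -1.0
            for u in actions:
                q = sum(p * value[y] for y, p in transition(x, u).items())
                if q > best_q:
                    best_u, best_q = u, q
            new_value.append(best_q)
            step_policy.append(best_u)
        value = new_value
        policy.insert(0, step_policy)
    return value, policy


# Hypothetical setup: state 0 is unsafe, state 4 is the target, horizon 3.
values, policy = reach_avoid_values(
    n_states=5,
    actions=(-1, 0, 1),
    transition=chain_transition,
    safe={1, 2, 3, 4},
    target={4},
    horizon=3,
)
print(values[1], policy[0][1])
```

Starting from state 1, the only way to succeed within three steps is three consecutive successful moves to the right, so the value there equals 0.8 cubed.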
Human-Robot Collaboration System Setup for Weed Harvesting Scenarios in Aquatic Lakes IROS 2024
Artificial Water Bodies (AWBs) are human-made and require continuous monitoring due to their artificial biological processes. These systems necessitate regular maintenance to manage their ecosystems effectively. Unmanned Surface Vehicles (USVs) offer a collaborative approach for monitoring these environments, working alongside human operators such as boat skippers to identify specific locations. This paper discusses a weed harvesting scenario, demonstrating how human-robot collaboration can be achieved, supported by preliminary results. The USV mainly utilises multibeam SOund NAvigation and Ranging (SONAR) for underwater weed monitoring, showing promising outcomes in these scenarios.
comment: 3 pages, 5 figures. This paper was accepted for poster presentation at IROS 2024 Workshop on Maritime Heterogeneous Unmanned Robotic Systems (MHURS)
Incremental Learning of Affordances using Markov Logic Networks
Affordances enable robots to have a semantic understanding of their surroundings. This allows them to have more acting flexibility when completing a given task. Capturing object affordances in a machine learning model is a difficult task, because of their dependence on contextual information. Markov Logic Networks (MLN) combine probabilistic reasoning with logic that is able to capture such context. Mobile robots operate in partially known environments wherein unseen object affordances can be observed. This new information must be incorporated into the existing knowledge, without having to retrain the MLN from scratch. We introduce the MLN Cumulative Learning Algorithm (MLN-CLA). MLN-CLA learns new relations in various knowledge domains by retaining knowledge and only updating the changed knowledge, for which the MLN is retrained. We show that MLN-CLA is effective for accumulative learning and zero-shot affordance inference, outperforming strong baselines.
comment: accepted at IEEE IRC 2024
ImDy: Human Inverse Dynamics from Imitated Observations
Inverse dynamics (ID), which aims at reproducing the driving torques from human kinematic observations, has been a critical tool for gait analysis. However, it is hindered from wider application to general motion due to its limited scalability. Conventional optimization-based ID requires expensive laboratory setups, restricting its availability. To alleviate this problem, we propose to exploit recent progress in human motion imitation algorithms to learn human inverse dynamics in a data-driven manner. The key insight is that the human ID knowledge is implicitly possessed by motion imitators, though not directly applicable. In light of this, we devise an efficient data collection pipeline with state-of-the-art motion imitation algorithms and physics simulators, resulting in a large-scale human inverse dynamics benchmark, Imitated Dynamics (ImDy). ImDy contains over 150 hours of motion with joint torque and full-body ground reaction force data. With ImDy, we train a data-driven human inverse dynamics solver ImDyS(olver) in a fully supervised manner, which conducts ID and ground reaction force estimation simultaneously. Experiments on ImDy and real-world data demonstrate the impressive competency of ImDyS in human inverse dynamics and ground reaction force estimation. Moreover, the potential of ImDy(-S) as a fundamental motion analysis tool is exhibited with downstream applications. The project page is https://foruck.github.io/ImDy/.
comment: Yong-Lu Li and Cewu Lu are the corresponding authors
Integrating Large Language Models for UAV Control in Simulated Environments: A Modular Interaction Approach
The intersection of LLMs (Large Language Models) and UAV (Unoccupied Aerial Vehicles) technology represents a promising field of research with the potential to enhance UAV capabilities significantly. This study explores the application of LLMs in UAV control, focusing on the opportunities for integrating advanced natural language processing into autonomous aerial systems. By enabling UAVs to interpret and respond to natural language commands, LLMs simplify the UAV control and usage, making them accessible to a broader user base and facilitating more intuitive human-machine interactions. The paper discusses several key areas where LLMs can impact UAV technology, including autonomous decision-making, dynamic mission planning, enhanced situational awareness, and improved safety protocols. Through a comprehensive review of current developments and potential future directions, this study aims to highlight how LLMs can transform UAV operations, making them more adaptable, responsive, and efficient in complex environments. A template development framework for integrating LLMs in UAV control is also described. Proof of Concept results that integrate existing LLM models and popular robotic simulation platforms are demonstrated. The findings suggest that while there are substantial technical and ethical challenges to address, integrating LLMs into UAV control holds promising implications for advancing autonomous aerial systems.
Energy-Optimal Planning of Waypoint-Based UAV Missions -- Does Minimum Distance Mean Minimum Energy? IROS
The multirotor unmanned aerial vehicle is a prevalent type of aerial robot with wide real-world applications. The energy efficiency of the robot is a critical aspect of its performance, determining the range and duration of the missions that can be performed. This paper studies the energy-optimal planning of the multirotor, which aims at finding the optimal ordering of waypoints with the minimum energy consumption for missions in 3D space. The study is performed based on a previously developed model capturing first-principle energy dynamics of the multirotor. We found that in the majority of cases (up to 95%) the solutions of the energy-optimal planning are different from those of the traditional traveling salesman problem, which minimizes the total distance. The difference can be as high as 14.9%, with the average at 1.6%-3.3% and the 90th percentile at 3.7%-6.5%, depending on the range and number of waypoints in the mission. We then identified and explained the key features of the minimum-energy order by correlating them to the underlying flight energy dynamics. It is shown that instead of minimizing the distance, coordination of vertical and horizontal motion to promote aerodynamic efficiency is the key to optimizing energy consumption.
comment: This paper has been accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024
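The question in the title above can be made concrete with a small brute-force comparison of waypoint orderings under two costs. The sketch below uses a deliberately simple toy energy model (a fixed cost per metre of horizontal travel, climb, and descent) that is an assumption for illustration only and is not the first-principle multirotor model used in the paper; the coefficients, waypoints, and function names are likewise hypothetical.

```python
import itertools
import math


def path_length(points):
    """Total 3D Euclidean length of the polyline through the points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def toy_energy(points, per_m_horizontal=1.0, per_m_climb=8.0, per_m_descent=4.0):
    """Toy energy: separate per-metre costs for horizontal, up, and down motion.

    Assumed model for illustration; vertical metres cost more than
    horizontal ones, and climbing costs more than descending.
    """
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        dz = z1 - z0
        total += per_m_horizontal * math.hypot(x1 - x0, y1 - y0)
        total += per_m_climb * max(dz, 0.0) + per_m_descent * max(-dz, 0.0)
    return total


def best_order(start, waypoints, cost):
    """Exhaustively search visiting orders (open path, no return to start)."""
    return min(
        itertools.permutations(waypoints),
        key=lambda order: cost([start, *order]),
    )


# Hypothetical mission: one elevated waypoint and one at ground level.
start = (0.0, 0.0, 0.0)
waypoints = [(1.0, 0.0, 2.0), (3.0, 0.0, 0.0)]

shortest_order = best_order(start, waypoints, path_length)
cheapest_order = best_order(start, waypoints, toy_energy)

print("min-distance order:", shortest_order)
print("min-energy order:  ", cheapest_order)
print("energy of each:", toy_energy([start, *shortest_order]),
      toy_energy([start, *cheapest_order]))
```

In this toy mission the shortest route visits the elevated waypoint first and then descends, whereas the toy energy cost favours finishing at altitude so that no descent is paid for. Exhaustive search is only practical for a handful of waypoints.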
Real-time Vehicle-to-Vehicle Communication Based Network Cooperative Control System through Distributed Database and Multimodal Perception: Demonstrated in Crossroads
The autonomous driving industry is rapidly advancing, with Vehicle-to-Vehicle (V2V) communication systems emerging as a key component of enhanced road safety and traffic efficiency. This paper introduces a novel Real-time Vehicle-to-Vehicle Communication Based Network Cooperative Control System (VVCCS), designed to revolutionize macro-scope traffic planning and collision avoidance in autonomous driving. Implemented on the Quanser Car (Qcar) hardware platform, our system integrates distributed databases into individual autonomous vehicles and an optional central server. We also developed a comprehensive multi-modal perception system with multi-objective tracking and radar sensing. Through a demonstration within a physical crossroad environment, our system showcases its potential to be applied in congested and complex urban environments.
comment: ICICT 2024, 18 pages
Multimodal Information Bottleneck for Deep Reinforcement Learning with Multiple Sensors
Reinforcement learning has achieved promising results on robotic control tasks but struggles to leverage information effectively from multiple sensory modalities that differ in many characteristics. Recent works construct auxiliary losses based on reconstruction or mutual information to extract joint representations from multiple sensory inputs to improve the sample efficiency and performance of reinforcement learning algorithms. However, the representations learned by these methods could capture information irrelevant to learning a policy and may degrade the performance. We argue that compressing information in the learned joint representations about raw multimodal observations is helpful, and propose a multimodal information bottleneck model to learn task-relevant joint representations from egocentric images and proprioception. Our model compresses and retains the predictive information in multimodal observations for learning a compressed joint representation, which fuses complementary information from visual and proprioceptive feedback and meanwhile filters out task-irrelevant information in raw multimodal observations. We propose to minimize the upper bound of our multimodal information bottleneck objective for computationally tractable optimization. Experimental evaluations on several challenging locomotion tasks with egocentric images and proprioception show that our method achieves better sample efficiency and zero-shot robustness to unseen white noise than leading baselines. We also empirically demonstrate that leveraging information from egocentric images and proprioception is more helpful for learning policies on locomotion tasks than solely using one single modality.
comment: 31 pages
Generalizable Motion Planning via Operator Learning
In this work, we introduce a planning neural operator (PNO) for predicting the value function of a motion planning problem. We recast value function approximation as learning a single operator from the cost function space to the value function space, which is defined by an Eikonal partial differential equation (PDE). Specifically, we recast computing value functions as learning a single operator across continuous function spaces, which we prove is equivalent to solving an Eikonal PDE. Through this reformulation, our learned PNO is able to generalize to new motion planning problems without retraining. Therefore, our PNO model, despite being trained with a finite number of samples at coarse resolution, inherits the zero-shot super-resolution property of neural operators. We demonstrate accurate value function approximation at 16 times the training resolution on the MovingAI lab's 2D city dataset and compare with state-of-the-art neural value function predictors on 3D scenes from the iGibson building dataset. Lastly, we investigate employing the value function output of PNO as a heuristic function to accelerate motion planning. We show theoretically that the PNO heuristic is $\epsilon$-consistent by introducing an inductive bias layer that guarantees our value functions satisfy the triangle inequality. With our heuristic, we achieve a 30% decrease in nodes visited while obtaining near optimal path lengths on the MovingAI lab 2D city dataset, compared to classical planning methods (A*, RRT*).
Mechanisms and Computational Design of Multi-Modal End-Effector with Force Sensing using Gated Networks
In limbed robotics, end-effectors must serve dual functions, such as both feet for locomotion and grippers for grasping, which presents design challenges. This paper introduces a multi-modal end-effector capable of transitioning between flat and line foot configurations while providing grasping capabilities. MAGPIE integrates 8-axis force sensing using proposed mechanisms with Hall effect sensors, enabling both contact and tactile force measurements. We present a computational design framework for our sensing mechanism that accounts for noise and interference, allowing for desired sensitivity and force ranges and generating ideal inverse models. The hardware implementation of MAGPIE is validated through experiments, demonstrating its capability as a foot and verifying the performance of the sensing mechanisms, ideal models, and gated network-based models.
X-MOBILITY: End-To-End Generalizable Navigation via World Modeling
General-purpose navigation in challenging environments remains a significant problem in robotics, with current state-of-the-art approaches facing myriad limitations. Classical approaches struggle with cluttered settings and require extensive tuning, while learning-based methods face difficulties generalizing to out-of-distribution environments. This paper introduces X-Mobility, an end-to-end generalizable navigation model that overcomes existing challenges by leveraging three key ideas. First, X-Mobility employs an auto-regressive world modeling architecture with a latent state space to capture world dynamics. Second, a diverse set of multi-head decoders enables the model to learn a rich state representation that correlates strongly with effective navigation skills. Third, by decoupling world modeling from action policy, our architecture can train effectively on a variety of data sources, both with and without expert policies: off-policy data allows the model to learn world dynamics, while on-policy data with supervisory control enables optimal action policy learning. Through extensive experiments, we demonstrate that X-Mobility not only generalizes effectively but also surpasses current state-of-the-art navigation approaches. Additionally, X-Mobility also achieves zero-shot Sim2Real transferability and shows strong potential for cross-embodiment generalization.
GenDP: 3D Semantic Fields for Category-Level Generalizable Diffusion Policy
Diffusion-based policies have shown remarkable capability in executing complex robotic manipulation tasks but lack explicit characterization of geometry and semantics, which often limits their ability to generalize to unseen objects and layouts. To enhance the generalization capabilities of Diffusion Policy, we introduce a novel framework that incorporates explicit spatial and semantic information via 3D semantic fields. We generate 3D descriptor fields from multi-view RGBD observations with large foundational vision models, then compare these descriptor fields against reference descriptors to obtain semantic fields. The proposed method explicitly considers geometry and semantics, enabling strong generalization capabilities in tasks requiring category-level generalization, resolving geometric ambiguities, and attention to subtle geometric details. We evaluate our method across eight tasks involving articulated objects and instances with varying shapes and textures from multiple object categories. Our method demonstrates its effectiveness by increasing Diffusion Policy's average success rate on unseen instances from 20% to 93%. Additionally, we provide a detailed analysis and visualization to interpret the sources of performance gain and explain how our method can generalize to novel instances.
comment: Accepted to Conference on Robot Learning (CoRL 2024). Project Page: https://robopil.github.io/GenDP/
Diffusion-Reward Adversarial Imitation Learning
Imitation learning aims to learn a policy from observing expert demonstrations without access to reward signals from environments. Generative adversarial imitation learning (GAIL) formulates imitation learning as adversarial learning, employing a generator policy learning to imitate expert behaviors and discriminator learning to distinguish the expert demonstrations from agent trajectories. Despite its encouraging results, GAIL training is often brittle and unstable. Inspired by the recent dominance of diffusion models in generative modeling, we propose Diffusion-Reward Adversarial Imitation Learning (DRAIL), which integrates a diffusion model into GAIL, aiming to yield more robust and smoother rewards for policy learning. Specifically, we propose a diffusion discriminative classifier to construct an enhanced discriminator, and design diffusion rewards based on the classifier's output for policy learning. Extensive experiments are conducted in navigation, manipulation, and locomotion, verifying DRAIL's effectiveness compared to prior imitation learning methods. Moreover, additional experimental results demonstrate the generalizability and data efficiency of DRAIL. Visualized learned reward functions of GAIL and DRAIL suggest that DRAIL can produce more robust and smoother rewards. Project page: https://nturobotlearninglab.github.io/DRAIL/
Flying through Moving Gates without Full State Estimation
Autonomous drone racing requires powerful perception, planning, and control and has become a benchmark and test field for autonomous, agile flight. Existing work usually assumes static race tracks with known maps, which enables offline planning of time-optimal trajectories, performing localization to the gates to reduce the drift in visual-inertial odometry (VIO) for state estimation or training learning-based methods for the particular race track and operating environment. In contrast, many real-world tasks like disaster response or delivery need to be performed in unknown and dynamic environments. To close this gap and make drone racing more robust against unseen environments and moving gates, we propose a control algorithm that does not require a race track map or VIO and uses only monocular measurements of the line of sight (LOS) to the gates. For this purpose, we adopt the law of proportional navigation (PN) to accurately fly through the gates despite gate motions or wind. We formulate the PN-informed vision-based control problem for drone racing as a constrained optimization problem and derive a closed-form optimal solution. We demonstrate through extensive simulations and real-world experiments that our method can navigate through moving gates at high speeds while being robust to different gate movements, model errors, wind, and delays.
comment: 7 pages, 6 figures
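For readers unfamiliar with proportional navigation, the classical textbook form of the law can be sketched in a few lines. This is a minimal 2D illustration of standard PN (commanded lateral acceleration proportional to closing speed times line-of-sight rate), not the constrained-optimization controller derived in the paper. The function name and the navigation gain of 3 are assumptions chosen for illustration.

```python
import math


def pn_lateral_accel(rel_pos, rel_vel, nav_gain=3.0):
    """Classical 2D proportional navigation command.

    rel_pos: target position minus vehicle position, (x, y)
    rel_vel: target velocity minus vehicle velocity, (x, y)
    Returns the commanded acceleration perpendicular to the line of sight;
    positive values correspond to a counter-clockwise LOS rotation.
    """
    rx, ry = rel_pos
    vx, vy = rel_vel
    range_sq = rx * rx + ry * ry
    if range_sq == 0.0:
        return 0.0  # already at the target; no command is defined
    # Angular rate of the line of sight (2D cross product over range squared).
    los_rate = (rx * vy - ry * vx) / range_sq
    # Closing speed is positive when the range is shrinking.
    closing_speed = -(rx * vx + ry * vy) / math.sqrt(range_sq)
    return nav_gain * closing_speed * los_rate


# On a collision course the LOS does not rotate, so no correction is needed.
on_course = pn_lateral_accel((10.0, 0.0), (-5.0, 0.0))
# A gate drifting sideways rotates the LOS and produces a lateral command.
drifting = pn_lateral_accel((10.0, 0.0), (-5.0, 1.0))
```

The appeal of this law in the setting above is that it needs only the LOS direction and its rate, which can be estimated from monocular measurements, rather than a full state estimate.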
The Art of Imitation: Learning Long-Horizon Manipulation Tasks from Few Demonstrations
Task Parametrized Gaussian Mixture Models (TP-GMM) are a sample-efficient method for learning object-centric robot manipulation tasks. However, there are several open challenges to applying TP-GMMs in the wild. In this work, we tackle three crucial challenges synergistically. First, end-effector velocities are non-Euclidean and thus hard to model using standard GMMs. We thus propose to factorize the robot's end-effector velocity into its direction and magnitude, and model them using Riemannian GMMs. Second, we leverage the factorized velocities to segment and sequence skills from complex demonstration trajectories. Through the segmentation, we further align skill trajectories and hence leverage time as a powerful inductive bias. Third, we present a method to automatically detect relevant task parameters per skill from visual observations. Our approach enables learning complex manipulation tasks from just five demonstrations while using only RGB-D observations. Extensive experimental evaluations on RLBench demonstrate that our approach achieves state-of-the-art performance with 20-fold improved sample efficiency. Our policies generalize across different environments, object instances, and object positions, while the learned skills are reusable.
JointMotion: Joint Self-Supervision for Joint Motion Prediction
We present JointMotion, a self-supervised pre-training method for joint motion prediction in self-driving vehicles. Our method jointly optimizes a scene-level objective connecting motion and environments, and an instance-level objective to refine learned representations. Scene-level representations are learned via non-contrastive similarity learning of past motion sequences and environment context. At the instance level, we use masked autoencoding to refine multimodal polyline representations. We complement this with an adaptive pre-training decoder that enables JointMotion to generalize across different environment representations, fusion mechanisms, and dataset characteristics. Notably, our method reduces the joint final displacement error of Wayformer, HPTR, and Scene Transformer models by 3%, 8%, and 12%, respectively; and enables transfer learning between the Waymo Open Motion and the Argoverse 2 Motion Forecasting datasets. Code: https://github.com/kit-mrt/future-motion
comment: CoRL'24 camera-ready
UniSaT: Unified-Objective Belief Model and Planner to Search for and Track Multiple Objects SC
Path planning for autonomous search and tracking of multiple objects is a critical problem in applications such as reconnaissance, surveillance, and data gathering. Due to the inherent competing objectives of searching for new objects while maintaining tracks for found objects, most current approaches rely on multi-objective planning methods, leaving it up to the user to tune parameters to balance between the two objectives, usually based on heuristics or trial and error. In this paper, we introduce UniSaT (Unified Search and Track), a novel unified-objective formulation for the search and track problem based on Random Finite Sets (RFS). Our approach models unknown and known objects using a combined generalized labeled multi-Bernoulli (GLMB) filter. For unseen objects, UniSaT leverages both cardinality and spatial prior distributions, allowing it to operate without prior knowledge of the exact number of objects in the search space. The planner maximizes the mutual information of this unified belief model, creating balanced search and tracking behaviors. We demonstrate our work in a simulated environment, presenting both qualitative results and quantitative improvements over a multi-objective method.
comment: 13 pages, AIAA SCITECH 2025 Forum
DexGrasp-Diffusion: Diffusion-based Unified Functional Grasp Synthesis Method for Multi-Dexterous Robotic Hands
The versatility and adaptability of human grasping catalyze advancing dexterous robotic manipulation. While significant strides have been made in dexterous grasp generation, current research endeavors pivot towards optimizing object manipulation while ensuring functional integrity, emphasizing the synthesis of functional grasps following desired affordance instructions. This paper addresses the challenge of synthesizing functional grasps tailored to diverse dexterous robotic hands by proposing DexGrasp-Diffusion, an end-to-end modularized diffusion-based method. DexGrasp-Diffusion integrates MultiHandDiffuser, a novel unified data-driven diffusion model for multi-dexterous hands grasp estimation, with DexDiscriminator, which employs a Physics Discriminator and a Functional Discriminator with open-vocabulary setting to filter physically plausible functional grasps based on object affordances. The experimental evaluation conducted on the MultiDex dataset provides substantiating evidence supporting the superior performance of MultiHandDiffuser over the baseline model in terms of success rate, grasp diversity, and collision depth. Moreover, we demonstrate the capacity of DexGrasp-Diffusion to reliably generate functional grasps for household objects aligned with specific affordance instructions.
comment: 15 pages, 5 figures
ODTFormer: Efficient Obstacle Detection and Tracking with Stereo Cameras Based on Transformer IROS 2024
Obstacle detection and tracking represent a critical component in robot autonomous navigation. In this paper, we propose ODTFormer, a Transformer-based model to address both obstacle detection and tracking problems. For the detection task, our approach leverages deformable attention to construct a 3D cost volume, which is decoded progressively in the form of voxel occupancy grids. We further track the obstacles by matching the voxels between consecutive frames. The entire model can be optimized in an end-to-end manner. Through extensive experiments on DrivingStereo and KITTI benchmarks, our model achieves state-of-the-art performance in the obstacle detection task. We also report comparable accuracy to state-of-the-art obstacle tracking models while requiring only a fraction of their computation cost, typically ten-fold to twenty-fold less. The code and model weights will be publicly released.
comment: 8 pages. Accepted by IROS 2024
Gaussian-Informed Continuum for Physical Property Identification and Simulation NeurIPS 2024
This paper studies the problem of estimating physical properties (system identification) through visual observations. To facilitate geometry-aware guidance in physical property estimation, we introduce a novel hybrid framework that leverages 3D Gaussian representation to not only capture explicit shapes but also enable the simulated continuum to render object masks as 2D shape surrogates during training. We propose a new dynamic 3D Gaussian framework based on motion factorization to recover the object as 3D Gaussian point sets across different time states. Furthermore, we develop a coarse-to-fine filling strategy to generate the density fields of the object from the Gaussian reconstruction, allowing for the extraction of object continuums along with their surfaces and the integration of Gaussian attributes into these continuums. In addition to the extracted object surfaces, the Gaussian-informed continuum also enables the rendering of object masks during simulations, serving as 2D-shape guidance for physical property estimation. Extensive experimental evaluations demonstrate that our pipeline achieves state-of-the-art performance across multiple benchmarks and metrics. Additionally, we illustrate the effectiveness of the proposed method through real-world demonstrations, showcasing its practical utility. Our project page is at https://jukgei.github.io/project/gic.
comment: 21 pages, 8 figures, NeurIPS 2024
Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning
Can we endow visuomotor robots with generalization capabilities to operate in diverse open-world scenarios? In this paper, we propose Maniwhere, a generalizable framework tailored for visual reinforcement learning, enabling the trained robot policies to generalize across a combination of multiple visual disturbance types. Specifically, we introduce a multi-view representation learning approach fused with a Spatial Transformer Network (STN) module to capture shared semantic information and correspondences among different viewpoints. In addition, we employ a curriculum-based randomization and augmentation approach to stabilize the RL training process and strengthen the visual generalization ability. To exhibit the effectiveness of Maniwhere, we meticulously design 8 tasks encompassing articulated objects, bi-manual, and dexterous hand manipulation tasks, demonstrating Maniwhere's strong visual generalization and sim2real transfer abilities across 3 hardware platforms. Our experiments show that Maniwhere significantly outperforms existing state-of-the-art methods. Videos are provided at https://gemcollector.github.io/maniwhere/.
comment: Webpage: https://gemcollector.github.io/maniwhere/
Exploring Self-Supervised Skeleton-Based Human Action Recognition under Occlusions
To integrate self-supervised skeleton-based action recognition methods into autonomous robotic systems, it is crucial to consider adverse situations involving target occlusions. Such a scenario, despite its practical relevance, is rarely addressed in existing self-supervised skeleton-based action recognition methods. To empower models with the capacity to address occlusion, we propose a simple and effective method. We first pre-train using occluded skeleton sequences, then use k-means clustering (KMeans) on sequence embeddings to group semantically similar samples. Next, we propose KNN-Imputation to fill in missing skeleton data based on the closest sample neighbors. Imputing incomplete skeleton sequences to create relatively complete sequences as input provides significant benefits to existing skeleton-based self-supervised methods. Meanwhile, building on the state-of-the-art Partial Spatio-Temporal Learning (PSTL), we introduce an Occluded Partial Spatio-Temporal Learning (OPSTL) framework. This enhancement utilizes Adaptive Spatial Masking (ASM) for better use of high-quality, intact skeletons. The new proposed method is verified on the challenging occluded versions of the NTURGB+D 60 and NTURGB+D 120. The source code is publicly available at https://github.com/cyfml/OPSTL.
comment: The source code is publicly available at https://github.com/cyfml/OPSTL
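The imputation step described in the abstract above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the OPSTL implementation: cluster labels are assumed to come from a prior k-means run on the sequence embeddings, each skeleton sequence is flattened to a list of coordinates with `None` marking occluded values, and the function name `knn_impute` is hypothetical.

```python
import math


def knn_impute(sequences, embeddings, labels, k=2):
    """Fill missing values (None) from the k nearest same-cluster neighbours.

    sequences:  list of equal-length lists of floats, None where occluded
    embeddings: one embedding vector per sequence (e.g. from pre-training)
    labels:     one cluster id per sequence (assumed to come from k-means)
    """
    filled = [list(seq) for seq in sequences]
    for i, seq in enumerate(sequences):
        # Other samples in the same cluster, nearest in embedding space first.
        neighbours = sorted(
            (j for j in range(len(sequences))
             if j != i and labels[j] == labels[i]),
            key=lambda j: math.dist(embeddings[i], embeddings[j]),
        )
        for t, value in enumerate(seq):
            if value is not None:
                continue
            # Take the k closest neighbours that actually observed this value.
            donors = [sequences[j][t] for j in neighbours
                      if sequences[j][t] is not None][:k]
            if donors:
                filled[i][t] = sum(donors) / len(donors)
    return filled


sequences = [
    [1.0, None, 3.0],   # one coordinate occluded
    [1.0, 2.0, 3.0],
    [1.0, 4.0, 3.0],
    [9.0, 9.0, 9.0],    # belongs to a different cluster
]
embeddings = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
labels = [0, 0, 0, 1]
completed = knn_impute(sequences, embeddings, labels, k=2)
```

Restricting the neighbour search to the sample's own cluster keeps donors semantically similar and reduces the number of distance computations. A value that no same-cluster neighbour observed is left as `None`.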
Interactive Distance Field Mapping and Planning to Enable Human-Robot Collaboration
Human-robot collaborative applications require scene representations that are kept up-to-date and facilitate safe motions in dynamic scenes. In this letter, we present an interactive distance field mapping and planning (IDMP) framework that handles dynamic objects and collision avoidance through an efficient representation. We define interactive mapping and planning as the process of creating and updating the representation of the scene online while simultaneously planning and adapting the robot's actions based on that representation. The key aspect of this work is an efficient Gaussian Process field that performs incremental updates and handles dynamic objects reliably by identifying moving points via a simple and elegant formulation based on queries from a temporary latent model. In terms of mapping, IDMP is able to fuse point cloud data from single and multiple sensors, query the free space at any spatial resolution, and deal with moving objects without semantics. In terms of planning, IDMP allows seamless integration with gradient-based reactive planners, facilitating dynamic obstacle avoidance for safe human-robot interactions. Our mapping performance is evaluated on both real and synthetic datasets. A comparison with similar state-of-the-art frameworks shows superior performance when handling dynamic objects and comparable or better performance in the accuracy of the computed distance and gradient field. Finally, we show how the framework can be used for fast motion planning in the presence of moving objects both in simulated and real-world scenes. An accompanying video, code, and datasets are made publicly available at https://uts-ri.github.io/IDMP.
Real-World Robot Applications of Foundation Models: A Review
Recent developments in foundation models, like Large Language Models (LLMs) and Vision-Language Models (VLMs), trained on extensive data, facilitate flexible application across different tasks and modalities. Their impact spans various fields, including healthcare, education, and robotics. This paper provides an overview of the practical application of foundation models in real-world robotics, with a primary emphasis on the replacement of specific components within existing robot systems. The summary encompasses the perspective of input-output relationships in foundation models, as well as their role in perception, motion planning, and control within the field of robotics. This paper concludes with a discussion of future challenges and implications for practical robot applications.
Log-GPIS-MOP: A Unified Representation for Mapping, Odometry and Planning
Whereas dedicated scene representations are required for each different task in conventional robotic systems, this paper demonstrates that a unified representation can be used directly for multiple key tasks. We propose the Log-Gaussian Process Implicit Surface for Mapping, Odometry and Planning (Log-GPIS-MOP): a probabilistic framework for surface reconstruction, localisation and navigation based on a unified representation. Our framework applies a logarithmic transformation to a Gaussian Process Implicit Surface (GPIS) formulation to recover a global representation that accurately captures the Euclidean distance field with gradients and, at the same time, the implicit surface. By directly estimating the distance field and its gradient through Log-GPIS inference, the proposed incremental odometry technique computes the optimal alignment of an incoming frame and fuses it globally to produce a map. Concurrently, an optimisation-based planner computes a safe collision-free path using the same Log-GPIS surface representation. We validate the proposed framework on simulated and real datasets in 2D and 3D and benchmark against the state-of-the-art approaches. Our experiments show that Log-GPIS-MOP produces competitive results in sequential odometry, surface mapping and obstacle avoidance.
The Dark Side of Rich Rewards: Understanding and Mitigating Noise in VLM Rewards
While Vision-Language Models (VLMs) are increasingly used to generate reward signals for training embodied agents to follow instructions, our research reveals that agents guided by VLM rewards often underperform compared to those employing only intrinsic (exploration-driven) rewards, contradicting expectations set by recent work. We hypothesize that false positive rewards -- instances where unintended trajectories are incorrectly rewarded -- are more detrimental than false negatives. Our analysis confirms this hypothesis, revealing that the widely used cosine similarity metric is prone to false positive reward estimates. To address this, we introduce BiMI (Binary Mutual Information), a novel reward function designed to mitigate noise. BiMI significantly enhances learning efficiency across diverse and challenging embodied navigation environments. Our findings offer a nuanced understanding of how different types of reward noise impact agent learning and highlight the importance of addressing multimodal reward signal noise when training embodied agents.
comment: 10 main body pages, 11 appendix pages
OA-MPC: Occlusion-Aware MPC for Guaranteed Safe Robot Navigation with Unseen Dynamic Obstacles
For safe navigation in dynamic uncertain environments, robotic systems rely on the perception and prediction of other agents. Particularly, in occluded areas where cameras and LiDAR give no data, the robot must be able to reason about potential movements of invisible dynamic agents. This work presents a provably safe motion planning scheme for real-time navigation in an a priori unmapped environment, where occluded dynamic agents are present. Safety guarantees are provided based on reachability analysis. Forward reachable sets associated with potential occluded agents, such as pedestrians, are computed and incorporated into planning. An iterative optimization-based planner is presented that alternates between two optimizations: nonlinear Model Predictive Control (NMPC) and collision avoidance. Recursive feasibility of the MPC is guaranteed by introducing a terminal stopping constraint. The effectiveness of the proposed algorithm is demonstrated through simulation studies and hardware experiments with a TurtleBot robot. A video of experimental results is available at https://youtu.be/OUnkB5Feyuk.
OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs
Enabling robots to autonomously navigate unknown, complex, dynamic environments and perform diverse tasks remains a fundamental challenge in developing robust autonomous physical agents. These agents must effectively perceive their surroundings while leveraging world knowledge for decision-making. Although recent approaches utilize vision-language and large language models for scene understanding and planning, they often rely on offline processing and offboard compute and make simplifying assumptions about the environment and perception, limiting real-world applicability. We present a novel framework for real-time onboard autonomous navigation in unknown environments that change over time by integrating multi-level abstraction in both perception and planning pipelines. Our system fuses data from multiple onboard sensors for localization and mapping and integrates it with open-vocabulary semantics to generate hierarchical scene graphs from a continuously updated semantic object map. The LLM-based planner uses these graphs to create multi-step plans that guide low-level controllers in executing navigation tasks specified in natural language. The system's real-time operation enables the LLM to adjust its plans based on updates to the scene graph and task execution status, ensuring continuous adaptation to new situations or when the current plan cannot accomplish the task, a key advantage over static or rule-based systems. We demonstrate our system's efficacy on a quadruped navigating dynamic environments, showcasing its adaptability and robustness in diverse scenarios.
Piecewise Stochastic Barrier Functions
This paper presents a novel stochastic barrier function (SBF) framework for safety analysis of stochastic systems based on piecewise (PW) functions. We first outline a general formulation of PW-SBFs. Then, we focus on PW-Constant (PWC) SBFs and show how their simplicity yields computational advantages for general stochastic systems. Specifically, we prove that synthesis of PWC-SBFs reduces to a minimax optimization problem. Then, we introduce three efficient algorithms to solve this problem, each offering distinct advantages and disadvantages. The first algorithm is based on dual linear programming (LP), which provides an exact solution to the minimax optimization problem. The second is a more scalable algorithm based on iterative counter-example guided synthesis, which involves solving two smaller LPs. The third algorithm solves the minimax problem using gradient descent, which admits even better scalability. We provide an extensive evaluation of these methods on various case studies, including neural network dynamic models, nonlinear switched systems, and high-dimensional linear systems. Our benchmarks demonstrate that PWC-SBFs outperform state-of-the-art methods, namely sum-of-squares and neural barrier functions, and can scale to eight-dimensional systems.
Multiagent Systems
GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration
Graphs are widely used for modeling relational data in real-world scenarios, such as social networks and urban computing. Existing LLM-based graph analysis approaches either integrate graph neural networks (GNNs) for specific machine learning tasks, limiting their transferability, or rely solely on LLMs' internal reasoning ability, resulting in suboptimal performance. To address these limitations, we take advantage of recent advances in LLM-based agents, which have shown capabilities of utilizing external knowledge or tools for problem solving. By simulating human problem-solving strategies such as analogy and collaboration, we propose a multi-agent system based on LLMs named GraphTeam, for graph analysis. GraphTeam consists of five LLM-based agents from three modules, and the agents with different specialities can collaborate with each other to address complex problems. Specifically, (1) input-output normalization module: the question agent extracts and refines four key arguments from the original question, facilitating problem understanding, and the answer agent organizes the results to meet the output requirement; (2) external knowledge retrieval module: we first build a knowledge base consisting of relevant documentation and experience information, and then the search agent retrieves the most relevant entries for each question; (3) problem-solving module: given the retrieved information from the search agent, the coding agent uses established algorithms via programming to generate solutions, and in case the coding agent does not work, the reasoning agent will directly compute the results without programming. Extensive experiments on six graph analysis benchmarks demonstrate that GraphTeam achieves state-of-the-art performance with an average 25.85% improvement over the best baseline in terms of accuracy. The code and data are available at https://github.com/BUPT-GAMMA/GraphTeam.
Scalable Offline Reinforcement Learning for Mean Field Games AAMAS
Reinforcement learning algorithms for mean-field games offer a scalable framework for optimizing policies in large populations of interacting agents. Existing methods often depend on online interactions or access to system dynamics, limiting their practicality in real-world scenarios where such interactions are infeasible or difficult to model. In this paper, we present Offline Munchausen Mirror Descent (Off-MMD), a novel mean-field RL algorithm that approximates equilibrium policies in mean-field games using purely offline data. By leveraging iterative mirror descent and importance sampling techniques, Off-MMD estimates the mean-field distribution from static datasets without relying on simulation or environment dynamics. Additionally, we incorporate techniques from offline reinforcement learning to address common issues like Q-value overestimation, ensuring robust policy learning even with limited data coverage. Our algorithm scales to complex environments and demonstrates strong performance on benchmark tasks like crowd exploration or navigation, highlighting its applicability to real-world multi-agent systems where online experimentation is infeasible. We empirically demonstrate the robustness of Off-MMD to low-quality datasets and conduct experiments to investigate its sensitivity to hyperparameter choices.
comment: Submitted to AAMAS
TranSPORTmer: A Holistic Approach to Trajectory Understanding in Multi-Agent Sports ACCV 2024
Understanding trajectories in multi-agent scenarios requires addressing various tasks, including predicting future movements, imputing missing observations, inferring the status of unseen agents, and classifying different global states. Traditional data-driven approaches often handle these tasks separately with specialized models. We introduce TranSPORTmer, a unified transformer-based framework capable of addressing all these tasks, showcasing its application to the intricate dynamics of multi-agent sports scenarios like soccer and basketball. Using Set Attention Blocks, TranSPORTmer effectively captures temporal dynamics and social interactions in an equivariant manner. The model's tasks are guided by an input mask that conceals missing or yet-to-be-predicted observations. Additionally, we introduce a CLS extra agent to classify states along soccer trajectories, including passes, possessions, uncontrolled states, and out-of-play intervals, contributing to an enhancement in modeling trajectories. Evaluations on soccer and basketball datasets show that TranSPORTmer outperforms state-of-the-art task-specific models in player forecasting, player forecasting-imputation, ball inference, and ball imputation. https://youtu.be/8VtSRm8oGoE
comment: Accepted to ACCV 2024
Markov Potential Game with Final-time Reach-Avoid Objectives
We formulate a Markov potential game with final-time reach-avoid objectives by integrating potential game theory with stochastic reach-avoid control. Our focus is on multi-player trajectory planning where players maximize the same multi-player reach-avoid objective: the probability of all participants reaching their designated target states by a specified time, while avoiding collisions with one another. Existing approaches require centralized computation of actions via a global policy, which may have prohibitively expensive communication costs. Instead, we focus on approximations of the global policy via local state feedback policies. First, we adapt the recursive single player reach-avoid value iteration to the multi-player framework with local policies, and show that the same recursion holds on the joint state space. To find each player's optimal local policy, the multi-player reach-avoid value function is projected from the joint state to the local state using the other players' occupancy measures. Then, we propose an iterative best response scheme for the multi-player value iteration to converge to a pure Nash equilibrium. We demonstrate the utility of our approach in finding collision-free policies for multi-player motion planning in simulation.
comment: 8 pages, 2 figures
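The abstract above builds on the recursive single-player reach-avoid value iteration. A minimal sketch of that single-player recursion on a finite Markov decision process is given here, assuming the usual final-time formulation: the value at the final time is 1 on the target set, and each backward step maximizes the expected next value while assigning 0 to states in the avoid set. The function name, the dictionary layout of `P`, and the toy three-state example are illustrative assumptions and are not taken from the paper.

```python
def reach_avoid_value_iteration(states, actions, P, target, avoid, horizon):
    """Backward recursion for a final-time reach-avoid objective (illustrative).

    P[s][a] is a dict {next_state: probability}. V[t][s] is the maximal
    probability of being in `target` at time `horizon` while never
    entering `avoid`, starting from state s at time t.
    """
    V = [dict() for _ in range(horizon + 1)]
    policy = [dict() for _ in range(horizon)]
    for s in states:
        V[horizon][s] = 1.0 if s in target else 0.0
    for t in range(horizon - 1, -1, -1):
        for s in states:
            if s in avoid:
                # Entering the avoid set forfeits the objective.
                V[t][s], policy[t][s] = 0.0, None
                continue
            best_a, best_v = None, -1.0
            for a in actions:
                q = sum(p * V[t + 1][s2] for s2, p in P[s][a].items())
                if q > best_v:
                    best_a, best_v = a, q
            V[t][s], policy[t][s] = best_v, best_a
    return V, policy


# Hypothetical toy problem: state 0 = start, 1 = unsafe, 2 = target.
states = [0, 1, 2]
actions = ["risky", "safe"]
P = {
    0: {"risky": {2: 0.5, 1: 0.5}, "safe": {2: 0.3, 0: 0.7}},
    1: {"risky": {1: 1.0}, "safe": {1: 1.0}},
    2: {"risky": {2: 1.0}, "safe": {2: 1.0}},
}
V, policy = reach_avoid_value_iteration(states, actions, P, {2}, {1}, horizon=2)
```

In this toy problem the optimal first move from state 0 is the cautious action, because a failed cautious attempt still leaves one step for a riskier attempt. The multi-player extension with local policies and occupancy-measure projection described in the abstract is not reproduced here.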
Bridging Swarm Intelligence and Reinforcement Learning
Swarm intelligence (SI) explores how large groups of simple individuals (e.g., insects, fish, birds) collaborate to produce complex behaviors, exemplifying that the whole is greater than the sum of its parts. A fundamental task in SI is Collective Decision-Making (CDM), where a group selects the best option among several alternatives, such as choosing an optimal foraging site. In this work, we demonstrate a theoretical and empirical equivalence between CDM and single-agent reinforcement learning (RL) in multi-armed bandit problems, utilizing concepts from opinion dynamics, evolutionary game theory, and RL. This equivalence bridges the gap between SI and RL and leads us to introduce a novel abstract RL update rule called Maynard-Cross Learning. Additionally, it provides a new population-based perspective on common RL practices like learning rate adjustment and batching. Our findings enable cross-disciplinary fertilization between RL and SI, allowing techniques from one field to enhance the understanding and methodologies of the other.
IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems
As large language model (LLM) agents increasingly integrate into our infrastructure, their robust coordination and message synchronization become vital. The Byzantine Generals Problem (BGP) is a critical model for constructing resilient multi-agent systems (MAS) under adversarial attacks. It describes a scenario where malicious agents with unknown identities exist in the system, situations that, in our context, could result from LLM agents' hallucinations or external attacks. In BGP, the objective of the entire system is to reach a consensus on the action to be taken. Traditional BGP requires global consensus among all agents; however, in practical scenarios, global consensus is not always necessary and can even be inefficient. Therefore, there is a pressing need to explore a refined version of BGP that aligns with the local coordination patterns observed in MAS. We refer to this refined version as Imperfect BGP (IBGP) in our research, aiming to address this discrepancy. To tackle this issue, we propose a framework that leverages consensus protocols within general MAS settings, providing provable resilience against communication attacks and adaptability to changing environments, as validated by empirical results. Additionally, we present a case study in a sensor network environment to illustrate the practical application of our protocol.
Improve Value Estimation of Q Function and Reshape Reward with Monte Carlo Tree Search
Reinforcement learning has achieved remarkable success in perfect information games such as Go and Atari, enabling agents to compete at the highest levels against human players. However, research in reinforcement learning for imperfect information games has been relatively limited due to the more complex game structures and randomness. Traditional methods face challenges in training and improving performance in imperfect information games due to issues like inaccurate Q value estimation and reward sparsity. In this paper, we focus on Uno, an imperfect information game, and aim to address these problems by reducing Q value overestimation and reshaping the reward function. We propose a novel algorithm that utilizes Monte Carlo Tree Search to average the value estimations in the Q function. Even though we choose Double Deep Q Learning as the foundational framework in this paper, our method can be generalized and used in any algorithm that needs Q value estimation, such as Actor-Critic. Additionally, we employ Monte Carlo Tree Search to reshape the reward structure in the game environment. We compare our algorithm with several traditional methods applied to games, such as Double Deep Q Learning, Deep Monte Carlo, and Neural Fictitious Self Play, and the experiments demonstrate that our algorithm consistently outperforms these approaches, especially as the number of players in Uno increases, indicating a higher level of difficulty.
Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy
Diplomacy is one of the most sophisticated activities in human society, involving complex interactions among multiple parties that require skills in social reasoning, negotiation, and long-term strategic planning. Previous AI agents have demonstrated their ability to handle multi-step games and large action spaces in multi-agent tasks. However, diplomacy involves a staggering magnitude of decision spaces, especially considering the negotiation stage required. While recent agents based on large language models (LLMs) have shown potential in various applications, they still struggle with extended planning periods in complex multi-agent settings. Leveraging recent technologies for LLM-based agents, we aim to explore AI's potential to create a human-like agent capable of executing comprehensive multi-agent missions by integrating three fundamental capabilities: 1) strategic planning with memory and reflection; 2) goal-oriented negotiation with social reasoning; and 3) augmenting memory through self-play games for self-evolution without human in the loop.
Bearing-Distance Based Flocking with Zone-Based Interactions
This paper presents a novel zone-based flocking control approach suitable for dynamic multi-agent systems (MAS). Inspired by Reynolds' behavioral rules for boids, flocking behavioral rules with the zones of repulsion, conflict, attraction, and surveillance are introduced. For each agent, using only bearing and distance measurements, behavioral deviation vectors quantify the deviations from the local separation, local and global flock velocity alignment, local cohesion, obstacle avoidance and boundary conditions, and strategic separation for avoiding alien agents. The control strategy uses the local perception-based behavioral deviation vectors to guide each agent's motion. Additionally, the control strategy incorporates a directionally-aware obstacle avoidance mechanism that prioritizes obstacles in the agent's forward path. Simulation results validate the effectiveness of this approach in creating flexible, adaptable, and scalable flocking behavior.
On the limits of agency in agent-based models
Agent-based modeling (ABM) seeks to understand the behavior of complex systems by simulating a collection of agents that act and interact within an environment. Their practical utility requires capturing realistic environment dynamics and adaptive agent behavior while efficiently simulating million-size populations. Recent advancements in large language models (LLMs) present an opportunity to enhance ABMs by using LLMs as agents with further potential to capture adaptive behavior. However, the computational infeasibility of using LLMs for large populations has hindered their widespread adoption. In this paper, we introduce AgentTorch -- a framework that scales ABMs to millions of agents while capturing high-resolution agent behavior using LLMs. We benchmark the utility of LLMs as ABM agents, exploring the trade-off between simulation scale and individual agency. Using the COVID-19 pandemic as a case study, we demonstrate how AgentTorch can simulate 8.4 million agents representing New York City, capturing the impact of isolation and employment behavior on health and economic outcomes. We compare the performance of different agent architectures based on heuristic and LLM agents in predicting disease waves and unemployment rates. Furthermore, we showcase AgentTorch's capabilities for retrospective, counterfactual, and prospective analyses, highlighting how adaptive agent behavior can help overcome the limitations of historical data in policy design. AgentTorch is an open-source project actively being used for policy-making and scientific discovery around the world. The framework is available here: github.com/AgentTorch/AgentTorch.
comment: 19 pages, 5 appendices, 5 figures
Systems and Control (CS)
Effective Finite Time Stability Control for Human-Machine Shared Vehicle Following System
With the development of intelligent connected vehicle technology, human-machine shared control has gained popularity in vehicle following due to its effectiveness in driver assistance. However, traditional vehicle following systems struggle to maintain stability when driver reaction time fluctuates, as these variations require different levels of system intervention. To address this issue, the proposed human-machine shared vehicle following assistance system (HM-VFAS) integrates driver outputs under various states with the assistance system. The system employs an intelligent driver model that accounts for reaction time delays, simulating time-varying driver outputs. A control authority allocation strategy is designed to dynamically adjust the level of intervention based on real-time driver state assessment. To handle instability from driver authority switching, the proposed solution includes a two-layer adaptive finite time sliding mode controller (A-FTSMC). The first layer is an integral sliding mode adaptive controller that ensures robustness by compensating for uncertainties in the driver output. The second layer is a fast non-singular terminal sliding mode controller designed to accelerate convergence for rapid stabilization. Using real driver videos as inputs, the performance of the HM-VFAS was evaluated. Results show that the proposed control strategy maintains a safe distance under time-varying driver states, with the actual acceleration error relative to the target acceleration maintained within 0.5 m/s^2 and the maximum acceleration error reduced by 1.2 m/s^2. Compared to traditional controllers, the A-FTSMC controller offers faster convergence and less vibration, reducing the stabilization time by 27.3%.
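The abstract above mentions an intelligent driver model that accounts for reaction time delays. As a rough illustration, the sketch below implements the standard Intelligent Driver Model (IDM) acceleration law and feeds it observations that are a fixed number of steps old. The parameter values, the fixed-delay buffer, and the function names are illustrative assumptions; the paper's actual driver model and its time-varying delay are not specified in the abstract.

```python
import math
from collections import deque


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4.0):
    """Standard IDM acceleration.

    v: own speed (m/s), gap: bumper-to-bumper distance to the leader (m),
    dv: approach rate v - v_leader (m/s). Parameter defaults are assumed.
    """
    # Desired dynamic gap: standstill distance, time headway, braking term.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


def simulate_follower(v_lead, v_init, gap_init, steps, dt=0.1, delay_steps=5):
    """Follow a constant-speed leader while reacting to delayed observations."""
    v, gap = v_init, gap_init
    # Fixed-length buffer: index 0 always holds the observation delay_steps old.
    buffer = deque([(v, gap, v - v_lead)] * (delay_steps + 1), maxlen=delay_steps + 1)
    for _ in range(steps):
        v_obs, gap_obs, dv_obs = buffer[0]
        acc = idm_acceleration(v_obs, gap_obs, dv_obs)
        v = max(0.0, v + acc * dt)               # no reversing
        gap = max(0.1, gap + (v_lead - v) * dt)  # clamp to keep the sketch well defined
        buffer.append((v, gap, v - v_lead))
    return v, gap
```

Varying `delay_steps` over the course of a run would be one simple way to mimic a fluctuating reaction time, which is the situation the shared controller in the abstract is designed to handle.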
Role of hydrogen in decarbonizing China's electricity and hard-to-abate sectors
Green hydrogen has the potential to address two pressing problems in a zero-carbon energy system: balancing seasonal variability of solar and wind in the electricity sector, and replacing fossil fuels in hard-to-abate sectors. However, previous research has modeled the electricity and hard-to-abate sectors only separately and therefore cannot capture how the interaction between the two sectors influences the energy system cost. In this study, focusing on China, we deploy an electricity system planning model to examine the cost implications of green hydrogen to fully decarbonize the electricity system and hard-to-abate sectors. Our results reveal that green hydrogen enables a 17% reduction in the levelized cost of a zero-carbon electricity system relative to that without hydrogen. However, cost savings hinge on the availability of underground hydrogen storage capacities and electric transmission expansion. More importantly, coupling hydrogen infrastructure in the electricity and hard-to-abate sectors not only reduces energy costs compared to a decoupled energy system but also makes green hydrogen cost-competitive compared to fossil fuel-based gray and blue hydrogen in China.
comment: 4 figures, 33 pages
Reconfigurable Hydrostatics: Toward Multifunctional and Powerful Wearable Robotics
Wearable and locomotive robot designers face multiple challenges when choosing actuation. Traditional fully actuated designs using electric motors are multifunctional but oversized and inefficient for bearing conservative loads and for being backdrivable. Alternatively, quasi-passive and underactuated designs reduce the size of motorization and energy storage, but are often designed for specific tasks. Designers of versatile and stronger wearable robots will face these challenges unless future actuators become very torque-dense, backdrivable and efficient. This paper explores a design paradigm for addressing this issue: reconfigurable hydrostatics. We show that a hydrostatic actuator can integrate a passive force mechanism and a sharing mechanism in the fluid domain and still be multifunctional. First, an analytical study compares how these two mechanisms can relax the motorization requirements in the context of a load-bearing exoskeleton. Then, the hydrostatic concept integrating these two mechanisms using hydraulic components is presented. A case study analysis shows the mass/efficiency/inertia benefits of the concept over a fully actuated one. Then, the feasibility of the concept is partially validated with a proof-of-concept that actuates the knees of an exoskeleton. The experiments show that it can track the vertical ground reaction force (GRF) profiles of walking, running, squatting, and jumping, and that the energy consumption is 6x lower. The transient force behaviors due to switching from one leg to the other are also analyzed along with some mitigation to improve them.
Identifiable Representation and Model Learning for Latent Dynamic Systems
Learning identifiable representations and models from low-level observations is useful for an intelligent spacecraft to reliably finish downstream tasks. For temporal observations, to ensure that the data generating process is provably inverted, most existing works either assume the noise variables in the dynamic mechanisms are (conditionally) independent, or require interventions which can directly affect each latent variable. However, in practice, the relationship between the exogenous inputs/interventions and the latent variables may follow some complex deterministic mechanisms. In this work, we study the problem of identifiable representation and model learning for latent dynamic systems. The key idea is that we use an inductive bias inspired by controllable canonical forms, which is invariant, sparse, and input dependent by definition. We prove that, for linear or affine nonlinear latent dynamic systems, it is possible to identify the representations up to scaling and determine the models up to some simple transformations. The results have the potential to provide some theoretical guarantees for developing more trustworthy decision-making and control methods for intelligent spacecraft.
An iteration-free approach to excitation harmonization
Sinusoidal excitation is particularly popular for testing structures in the nonlinear regime. Due to the nonlinear behavior and the inevitable feedback of the structure on the exciter, higher harmonics in the applied excitation are generated. This is undesired, because the acquired response may deviate substantially from that of the structure under purely sinusoidal excitation, in particular if one of the higher harmonics engages in resonance. We present a new approach to suppress those higher excitation harmonics and thus the unwanted exciter-structure interaction: higher harmonics are added to the shaker's voltage input, and their Fourier coefficients are adjusted via feedback control until the excitation is purely sinusoidal. The stability of this method is analyzed for a simplified model; the resulting closed-form expressions are useful, among other things, for selecting an appropriate exciter configuration, including the drive point. A practical procedure for the control design is suggested. The proposed method is validated in virtual and real experiments of internally resonant structures, in the two common configurations of force excitation via a stinger and base excitation. Excellent performance is achieved already when using the same control gains for all harmonics, throughout the tested range of amplitudes and frequencies, even in the strongly nonlinear regime. Compared to the iterative state of the art, the proposed method is simpler to implement, enables faster testing, and more easily achieves lower harmonic distortion.
e-Values for Real-Time Residential Electricity Demand Forecast Model Selection
With the growing number of forecasting techniques and the increasing significance of forecast-based operation - particularly in the rapidly evolving energy sector - selecting the most effective forecasting model has become a critical task. Given the dynamic nature of energy forecasting, it is highly advantageous to assess the superiority of forecasting models not only retrospectively but continuously in real-time as new data and evidence become available, while simultaneously providing strong probabilistic guarantees for these decisions. In this work, we show that this can be achieved through the mathematical concept of e-values, which has recently gained massive attention in the field of statistics. It allows for unified construction principles for powerful tests and accurate statistical decisions, which can be evaluated at any chosen time points while maintaining an overall probabilistic error control. We extend the use of e-values by developing a simple persistence approach that dynamically combines input forecasts to generate new fused predictions. To demonstrate the performance of our method, we apply it to electricity demand forecasts based on different artificial intelligence based models. Our results indicate that e-values are able to improve the accuracy and reliability of forecasts in a dynamic environment, offering a valuable tool for real-time decision-making in the energy sector.
comment: 25 pages, 9 figures
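The anytime-valid property described in the abstract above can be illustrated with a generic betting-style e-process. The sketch below assumes loss differences `d_t` (loss of forecast B minus loss of forecast A, rescaled to [-1, 1]) and the null hypothesis that A is on average no better than B, so the conditional mean of `d_t` is at most 0. Under that null, each factor `1 + lam * d_t` with `lam` in [0, 1] is a valid e-value, and by Ville's inequality the running product exceeds `1 / alpha` with probability at most `alpha` at any stopping time. This is a textbook construction chosen for illustration; the function names are hypothetical and the paper's own e-value construction and fusion rule may differ.

```python
def e_process(loss_diffs, lam=0.5):
    """Running product of betting e-values e_t = 1 + lam * d_t."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] so every e-value is nonnegative")
    wealth = 1.0
    path = []
    for d in loss_diffs:
        if not -1.0 <= d <= 1.0:
            raise ValueError("loss differences must be rescaled to [-1, 1]")
        wealth *= 1.0 + lam * d
        path.append(wealth)
    return path


def first_rejection(path, alpha=0.05):
    """Index of the first time the e-process reaches 1/alpha, or None."""
    threshold = 1.0 / alpha
    for t, wealth in enumerate(path):
        if wealth >= threshold:
            return t
    return None
```

Because the error guarantee holds at every time point simultaneously, the process can be monitored continuously as new demand data arrives and stopped as soon as the threshold is crossed, without any correction for repeated looks.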
Fiber Activation by Bipolar Stimulation in Deep Brain Stimulation: A Patient Case Study
Deep Brain Stimulation (DBS) is a therapy widely used for treating the symptoms of neurological disorders. Electrical pulses are chronically delivered in DBS to a disease-specific brain target via a surgically implanted electrode. The stimulating contact configuration, stimulation polarity, as well as amplitude, frequency, and pulse width of the DBS pulse sequence are utilized to optimize the therapeutic effect. In this paper, the utility of therapy individualization by means of patient-specific mathematical modeling is investigated with respect to a specific case of a patient diagnosed with Essential Tremor (ET). Two computational models are compared in their ability to elucidate the impact of DBS stimulation on the dentato-rubrothalamic tract: (i) a conventional model of Volume of Tissue Activated (VTA) and (ii) a well-established neural fiber activation modeling framework known as OSS-DBS. The simulation results are compared with tremor measured in the patient under different DBS settings using a smartphone application. The findings of the study highlight that temporally static VTA models do not adequately describe the differences in the outcomes of bipolar stimulation settings with switched polarity, whereas neural fiber activation models hold potential in this regard. However, it is noted that neither of the investigated models fully accounts for the measured symptom pattern, particularly regarding a bilateral effect produced by unilateral stimulation.
comment: 7 pages, 5 figures, 5 tables
Time-to-Lie: Identifying Industrial Control System Honeypots Using the Internet Control Message Protocol
The convergence of information and operational technology networks has created previously unforeseen security issues. To address these issues, both researchers and practitioners have integrated threat intelligence methods into the security operations of converged networks, with some of the most valuable tools being honeypots that imitate industrial control systems (ICS). However, the development and deployment of such honeypots is a process rich with pitfalls, which can lead to undiagnosed weaknesses in the threat intelligence being gathered. This paper presents a side-channel method of covertly identifying ICS honeypots using the time-to-live (TTL) values of target devices. We show that many ICS honeypots can be readily identified, via minimal interactions, using only basic networking tools. In a study of over 8,000 devices presenting as ICS systems, we detail how our method compares to an existing honeypot detection approach, and outline what our methodology reveals about the current population of live ICS honeypots. In demonstrating our method, this study aims to raise awareness of the viability of the TTL heuristic and the prevalence of its misconfiguration despite its presence in literature.
comment: 11 pages, 2 listings, 5 tables, 6 figures
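A common way to reason about time-to-live values, as in the abstract above, is to infer a device's initial TTL from the value observed at the scanner. The sketch below assumes the widely used defaults of 64, 128, and 255, rounds an observed TTL up to the nearest of these, and flags a host whose inferred initial TTL differs from the value expected for the device it claims to be. The function names and the `expected_initial_ttl` input are hypothetical, and the paper's actual detection criteria are not given in the abstract.

```python
# Widely used default initial TTLs (assumed candidate set for this sketch).
COMMON_INITIAL_TTLS = (64, 128, 255)


def infer_initial_ttl(observed_ttl):
    """Round an observed TTL up to the nearest common initial value."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial


def hop_estimate(observed_ttl):
    """Estimated number of router hops between the scanner and the host."""
    return infer_initial_ttl(observed_ttl) - observed_ttl


def looks_inconsistent(observed_ttl, expected_initial_ttl):
    """True if the inferred initial TTL differs from what the claimed device would use."""
    return infer_initial_ttl(observed_ttl) != expected_initial_ttl
```

For example, a host that presents as a device expected to start at 255 but answers with an observed TTL of 52 would be inferred to start at 64, which is the kind of mismatch such a heuristic is meant to surface. The rounding step is only a heuristic, since a long path can push the observed value below the next lower default.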
Deoxys: A Causal Inference Engine for Unhealthy Node Mitigation in Large-scale Cloud Infrastructure
The presence of unhealthy nodes in cloud infrastructure signals the potential failure of machines, which can significantly impact the availability and reliability of cloud services, resulting in negative customer experiences. Effectively addressing unhealthy node mitigation is therefore vital for sustaining cloud system performance. This paper introduces Deoxys, a causal inference engine tailored to recommending mitigation actions for unhealthy nodes in cloud systems to minimize virtual machine downtime and interruptions during unhealthy events. It employs double machine learning combined with causal forest to produce precise and reliable mitigation recommendations based solely on limited observational data collected from historical unhealthy events. To enhance the causal inference model, Deoxys further incorporates a policy fallback mechanism based on model uncertainty and action overriding mechanisms to (i) improve the reliability of the system, and (ii) strike a good tradeoff between downtime reduction and resource utilization, thereby enhancing the overall system performance. After deploying Deoxys in a large-scale cloud infrastructure at Microsoft, our observations demonstrate that Deoxys significantly reduces average VM downtime by 53% compared to a legacy policy, while leading to a 49.5% lower VM interruption rate. This substantial improvement enhances the reliability and stability of cloud platforms, resulting in a seamless customer experience.
Markov Potential Game with Final-time Reach-Avoid Objectives
We formulate a Markov potential game with final-time reach-avoid objectives by integrating potential game theory with stochastic reach-avoid control. Our focus is on multi-player trajectory planning where players maximize the same multi-player reach-avoid objective: the probability of all participants reaching their designated target states by a specified time, while avoiding collisions with one another. Existing approaches require centralized computation of actions via a global policy, which may have prohibitively expensive communication costs. Instead, we focus on approximations of the global policy via local state feedback policies. First, we adapt the recursive single player reach-avoid value iteration to the multi-player framework with local policies, and show that the same recursion holds on the joint state space. To find each player's optimal local policy, the multi-player reach-avoid value function is projected from the joint state to the local state using the other players' occupancy measures. Then, we propose an iterative best response scheme for the multi-player value iteration to converge to a pure Nash equilibrium. We demonstrate the utility of our approach in finding collision-free policies for multi-player motion planning in simulation.
comment: 8 pages, 2 figures
Accelerating soft-constrained MPC for linear systems through online constraint removal
Optimization-based controllers, such as Model Predictive Control (MPC), have attracted significant research interest due to their intuitive concept, constraint handling capabilities, and natural application to multi-input multi-output systems. However, the computational complexity of solving a receding horizon problem at each time step remains a challenge for the deployment of MPC. This is particularly the case for systems constrained by many inequalities. Recently, we introduced the concept of constraint-adaptive MPC (ca-MPC) to address this challenge for linear systems with hard constraints. In ca-MPC, at each time step, a subset of the constraints is removed from the optimization problem, thereby accelerating the optimization procedure, while resulting in identical closed-loop behavior. The present paper extends this framework to soft-constrained MPC by detecting and removing constraints based on sub-optimal predicted input sequences, which is rather easy for soft-constrained MPC due to the receding horizon principle and the inclusion of slack variables. We will translate these new ideas explicitly to an offset-free output tracking problem. The effectiveness of these ideas is demonstrated on a two-dimensional thermal transport model, showing a three-orders-of-magnitude improvement in online computational time of the MPC scheme.
comment: 6 pages, 5 figures, CDC 2023 conference
Approximate Kalman filtering for large-scale systems with an application to hyperthermia cancer treatments
Accurate state estimates are required for increasingly complex systems, to enable, for example, feedback control. However, available state estimation schemes are not necessarily real-time feasible for certain large-scale systems. Therefore, in this paper we develop a real-time feasible state-estimation scheme for a class of large-scale systems that approximates the steady-state Kalman filter. In particular, we focus on systems where the state vector is the result of discretizing the spatial domain, as typically seen in Partial Differential Equations. In such cases, the correlations between states in the state vector often have an intuitive interpretation on the spatial domain, which can be exploited to obtain a significant reduction in computational complexity, while still providing accurate state estimates. We illustrate these strengths of our method through a hyperthermia cancer treatment case study. The results of the case study show significant improvements in the computation time, while simultaneously obtaining good state estimates, when compared to Ensemble Kalman filters and Kalman filters using reduced-order models.
comment: 6 pages, 6 figures, CDC 2022 conference
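For readers unfamiliar with the steady-state Kalman filter that the abstract above approximates, the following is a minimal scalar sketch. It is not the paper's large-scale scheme: the function name `steady_state_kalman`, the scalar model x[k+1] = a*x[k] + w, y[k] = c*x[k] + v, and the numerical values are assumptions chosen only for illustration.

```python
def steady_state_kalman(a, c, q, r, iterations=500):
    """Iterate the scalar Riccati recursion to its fixed point.

    Hypothetical model: x[k+1] = a*x[k] + w, y[k] = c*x[k] + v,
    with process noise variance q and measurement noise variance r.
    Returns the steady-state prediction variance P and filter gain K.
    """
    p = q  # any non-negative starting variance works for this example
    for _ in range(iterations):
        # Prediction-variance update: P <- a^2 P - (a P c)^2 / (c^2 P + r) + q
        p = a * a * p - (a * p * c) ** 2 / (c * c * p + r) + q
    gain = p * c / (c * c * p + r)
    return p, gain


p, k = steady_state_kalman(a=1.0, c=1.0, q=1.0, r=1.0)
print(p, k)
```

With a = c = q = r = 1 the fixed point satisfies P^2 - P - 1 = 0, so P is the golden ratio. For a spatially discretized system, P is a large matrix, which is where the computational burden mentioned in the abstract comes from.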
Constraint Removal for MPC with Performance Preservation and a Hyperthermia Cancer Treatment Case Study
Model predictive control (MPC) is an optimization-based control strategy with broad industrial adoption. Unfortunately, the required computation time to solve the receding-horizon MPC optimization problem can become prohibitively large for many applications with a large number of state constraints. This large number of state constraints can, for instance, originate from spatially discretizing a partial differential equation of which the solution has to satisfy constraints over the full spatial domain. This is particularly the case in MPC for RF-based hyperthermia cancer treatments, which forms a strong motivation for this study. To address this problem, we propose a novel constraint-adaptive MPC framework for linear discrete-time systems. In this framework, we select at each time-step a subset of the state constraints that are included in the optimization problem, thereby reducing the online computational burden. Critically, our framework guarantees the same closed-loop performance, recursive feasibility, and constraint satisfaction properties as the original (non-reduced) MPC scheme. We achieve this result by efficiently exploiting reachable set computations and the MPC cost function. We will demonstrate our novel method using a hyperthermia cancer treatment case study showing a two-orders-of-magnitude improvement in computation time, with identical closed-loop performance as the original (non-reduced) MPC scheme.
comment: 6 pages, 3 figures, CDC 2021 conference
Exploiting Data Centres and Local Energy Communities Synergies for Market Participation
The evolving energy landscape has propelled energy communities to the forefront of modern energy management. However, existing research has yet to explore the potential synergies between data centres and energy communities, necessitating an assessment on their collective capabilities for cost efficiency, waste heat optimisation, and market participation. This paper presents a mixed integer linear programming model to assess the collaborative performance of energy communities, data centres and energy markets. The evaluation focuses on the efficient use of waste heat and the flexibility of job scheduling while minimising system energy costs and maintaining quality of service requirements for data centres. Our results, based on realistic profiles of an energy community and a data centre, showcase significant benefits of these synergies, with a 38% reduction in operating costs and an 87% decrease in heat demand.
comment: Submitted to IEEE PES ISGT Europe 2024
Risk-sensitive Affine Control Synthesis for Stationary LTI Systems
To address deviations from expected performance in stochastic systems, we propose a risk-sensitive control synthesis method to minimize certain risk measures over the limiting stationary distribution. Specifically, we extend Worst-case Conditional Value-at-Risk (W-CVaR) optimization for Linear Time-invariant (LTI) systems to handle nonzero-mean noise and affine controllers, using only the first and second moments of noise, which enhances robustness against model uncertainty. Highlighting the strong coupling between the linear and bias terms of the controller, we reformulate the synthesis problem as a Bilinear Matrix Inequality (BMI), and propose an alternating optimization algorithm with guaranteed convergence. Finally, we demonstrate the numerical performance of our approach in two representative settings, which shows that the proposed algorithm successfully synthesizes risk-sensitive controllers that outperform the naïve LQR baseline.
comment: 8 pages, 4 figures, 2 illustrations
Real-time Vehicle-to-Vehicle Communication Based Network Cooperative Control System through Distributed Database and Multimodal Perception: Demonstrated in Crossroads
The autonomous driving industry is rapidly advancing, with Vehicle-to-Vehicle (V2V) communication systems emerging as a key component of enhanced road safety and traffic efficiency. This paper introduces a novel Real-time Vehicle-to-Vehicle Communication Based Network Cooperative Control System (VVCCS), designed to revolutionize macro-scope traffic planning and collision avoidance in autonomous driving. Implemented on the Quanser Car (Qcar) hardware platform, our system integrates distributed databases into individual autonomous vehicles and an optional central server. We also developed a comprehensive multi-modal perception system with multi-objective tracking and radar sensing. Through a demonstration within a physical crossroad environment, our system showcases its potential to be applied in congested and complex urban environments.
comment: ICICT 2024, 18 pages
LEADS: Lightweight Embedded Assisted Driving System
With the rapid development of electric vehicles, formula races aimed at high school and university students have become more popular than ever as the threshold for design and manufacturing has been lowered. In many cases, we see teams inspired by or directly using toolkits and technologies inherited from standardized commercial vehicles. These architectures are usually overly complicated for amateur applications like the races. In order to improve the efficiency and simplify the development of instrumentation, control, and analysis systems, we propose LEADS (Lightweight Embedded Assisted Driving System), a dedicated solution for such scenarios.
comment: 12 pages
Electric Grid Topology and Admittance Estimation: Quantifying Phasor-based Measurement Requirements
In this paper, we quantify voltage and current phasor-based measurement requirements for the unique identification of the electric grid topology and admittance parameters. Our approach is underpinned by the concept of a rigidity matrix that has been extensively studied in graph rigidity theory. Specifically, we show that the rank of the rigidity matrix is the same as that of a voltage coefficient matrix in a corresponding electric power system. Accordingly, we show that there is a minimum number of measurements required to uniquely identify the admittance matrix and corresponding grid topology. By means of a numerical example on the IEEE 4-node radial network, we demonstrate that our approach is suitable for applications in electric power grids.
SPARC: Prediction-Based Safe Control for Coupled Controllable and Uncontrollable Agents with Conformal Predictions
We investigate the problem of safe control synthesis for systems operating in environments with uncontrollable agents whose dynamics are unknown but coupled with those of the controlled system. This scenario naturally arises in various applications, such as autonomous driving and human-robot collaboration, where the behavior of uncontrollable agents, like pedestrians, cannot be directly controlled but is influenced by the actions of the autonomous vehicle or robot. In this paper, we present SPARC (Safe Prediction-Based Robust Controller for Coupled Agents), a novel framework designed to ensure safe control in the presence of coupled uncontrollable agents. SPARC leverages conformal prediction to quantify uncertainty in data-driven prediction of agent behavior. Particularly, we introduce a joint distribution-based approach to account for the coupled dynamics of the controlled system and uncontrollable agents. By integrating the control barrier function (CBF) technique, SPARC provides provable safety guarantees at a high confidence level. We illustrate our framework with a case study involving an autonomous driving scenario with walking pedestrians.
comment: It's not complete yet
Addressing Trust Issues for Vehicle to Grid in Distributed Power Grids Using Blockchains
While blockchain offers inherent security, trust issues among stakeholders in vehicle-to-grid (V2G) applications remain unresolved due to a lack of regulatory frameworks and standardization. Additionally, a tailored decentralized privacy-preserved coordination scheme for blockchain in V2G networks is needed to ensure user privacy and efficient energy transactions. This paper proposes a V2G trading and coordination scheme tailored to the decentralized nature of blockchain as well as the interests of stakeholders, utilizing smart charging points (SCPs) and a Stackelberg game model. Case studies using real-world data from Southern University of Science and Technology demonstrate the efficacy of the proposed scheme in reducing EV charging costs and the potential for supporting auxiliary grid services.
comment: This paper has been accepted by The 14th International Conference on Power and Energy Systems (ICPES 2024)
Certifiably Robust Policies for Uncertain Parametric Environments
We present a data-driven approach for producing policies that are provably robust across unknown stochastic environments. Existing approaches can learn models of a single environment as an interval Markov decision process (IMDP) and produce a robust policy with a probably approximately correct (PAC) guarantee on its performance. However, these are unable to reason about the impact of environmental parameters underlying the uncertainty. We propose a framework based on parametric Markov decision processes (MDPs) with unknown distributions over parameters. We learn and analyse IMDPs for a set of unknown sample environments induced by parameters. The key challenge is then to produce meaningful performance guarantees that combine the two layers of uncertainty: (1) multiple environments induced by parameters with an unknown distribution; (2) unknown induced environments which are approximated by IMDPs. We present a novel approach based on scenario optimisation that yields a single PAC guarantee quantifying the risk level for which a specified performance level can be assured in unseen environments, plus a means to trade off risk and performance. We implement and evaluate our framework using multiple robust policy generation methods on a range of benchmarks. We show that our approach produces tight bounds on a policy's performance with high confidence.
Posterior Sampling-based Online Learning for Episodic POMDPs
Learning in POMDPs is known to be significantly harder than in MDPs. In this paper, we consider the online learning problem for episodic POMDPs with unknown transition and observation models. We propose a Posterior Sampling-based reinforcement learning algorithm for POMDPs (PS4POMDPs), which is much simpler and more implementable compared to state-of-the-art optimism-based online learning algorithms for POMDPs. We show that the Bayesian regret of the proposed algorithm scales as the square root of the number of episodes and is polynomial in the other parameters. In a general setting, the regret scales exponentially in the horizon length $H$, and we show that this is inevitable by providing a lower bound. However, when the POMDP is undercomplete and weakly revealing (a common assumption in the recent literature), we establish a polynomial Bayesian regret bound. We finally propose a posterior sampling algorithm for multi-agent POMDPs, and show it too has sublinear regret.
comment: 41 pages, 9 figures
Mitigating Information Asymmetry in Two-Stage Contracts with Non-Myopic Agents
We consider a Stackelberg game in which a principal (she) establishes a two-stage contract with a non-myopic agent (he) whose type is unknown. The contract takes the form of an incentive function mapping the agent's first-stage action to his second-stage incentive. While the first-stage action reveals the agent's type under truthful play, a non-myopic agent could benefit from portraying a false type in the first stage to obtain a larger incentive in the second stage. The challenge is thus for the principal to design the incentive function so as to induce truthful play. We show that this is only possible with constant, non-reactive incentive functions when the type space is continuous, whereas it can be achieved with reactive functions for discrete types. Additionally, we show that introducing an adjustment mechanism that penalizes inconsistent behavior across both stages allows the principal to design more flexible incentive functions.
comment: To appear in the Proceedings of the 5th IFAC Workshop on Cyber-Physical Human Systems
Systems and Control (EESS)
Effective Finite Time Stability Control for Human-Machine Shared Vehicle Following System
With the development of intelligent connected vehicle technology, human-machine shared control has gained popularity in vehicle following due to its effectiveness in driver assistance. However, traditional vehicle following systems struggle to maintain stability when driver reaction time fluctuates, as these variations require different levels of system intervention. To address this issue, the proposed human-machine shared vehicle following assistance system (HM-VFAS) integrates driver outputs under various states with the assistance system. The system employs an intelligent driver model that accounts for reaction time delays, simulating time-varying driver outputs. A control authority allocation strategy is designed to dynamically adjust the level of intervention based on real-time driver state assessment. To handle instability from driver authority switching, the proposed solution includes a two-layer adaptive finite time sliding mode controller (A-FTSMC). The first layer is an integral sliding mode adaptive controller that ensures robustness by compensating for uncertainties in the driver output. The second layer is a fast non-singular terminal sliding mode controller designed to accelerate convergence for rapid stabilization. Using real driver videos as inputs, the performance of the HM-VFAS was evaluated. Results show that the proposed control strategy maintains a safe distance under time-varying driver states, with the actual acceleration error relative to the target acceleration maintained within 0.5 m/s^2 and the maximum acceleration error reduced by 1.2 m/s^2. Compared to traditional controllers, the A-FTSMC controller offers faster convergence and less vibration, reducing the stabilization time by 27.3%.
Role of hydrogen in decarbonizing China's electricity and hard-to-abate sectors
Green hydrogen has the potential to address two pressing problems in a zero-carbon energy system: balancing seasonal variability of solar and wind in the electricity sector, and replacing fossil fuels in hard-to-abate sectors. However, previous research has modeled the electricity and hard-to-abate sectors only separately and therefore cannot capture how the interaction between the two sectors influences the energy system cost. In this study, focusing on China, we deploy an electricity system planning model to examine the cost implications of green hydrogen to fully decarbonize the electricity system and hard-to-abate sectors. Our results reveal that green hydrogen enables a 17% reduction in the levelized cost of a zero-carbon electricity system relative to that without hydrogen. However, cost savings hinge on the availability of underground hydrogen storage capacities and electric transmission expansion. More importantly, coupling hydrogen infrastructure in the electricity and hard-to-abate sectors not only reduces energy costs compared to a decoupled energy system but also makes green hydrogen cost-competitive compared to fossil fuel-based gray and blue hydrogen in China.
comment: 4 figures, 33 pages
Reconfigurable Hydrostatics: Toward Multifunctional and Powerful Wearable Robotics
Wearable and locomotive robot designers face multiple challenges when choosing actuation. Traditional fully actuated designs using electric motors are multifunctional but oversized and inefficient for bearing conservative loads and for being backdrivable. Alternatively, quasi-passive and underactuated designs reduce the size of motorization and energy storage, but are often designed for specific tasks. Designers of versatile and stronger wearable robots will face these challenges unless future actuators become very torque-dense, backdrivable and efficient. This paper explores a design paradigm for addressing this issue: reconfigurable hydrostatics. We show that a hydrostatic actuator can integrate a passive force mechanism and a sharing mechanism in the fluid domain and still be multifunctional. First, an analytical study compares how these two mechanisms can relax the motorization requirements in the context of a load-bearing exoskeleton. Then, the hydrostatic concept integrating these two mechanisms using hydraulic components is presented. A case study analysis shows the mass/efficiency/inertia benefits of the concept over a fully actuated one. Then, the feasibility of the concept is partially validated with a proof-of-concept that actuates the knees of an exoskeleton. The experiments show that it can track the vertical ground reaction force (GRF) profiles of walking, running, squatting, and jumping, and that the energy consumption is 6x lower. The transient force behaviors due to switching from one leg to the other are also analyzed along with some mitigation to improve them.
Identifiable Representation and Model Learning for Latent Dynamic Systems
Learning identifiable representations and models from low-level observations is useful for an intelligent spacecraft to reliably finish downstream tasks. For temporal observations, to ensure that the data generating process is provably inverted, most existing works either assume the noise variables in the dynamic mechanisms are (conditionally) independent, or require interventions which can directly affect each latent variable. However, in practice, the relationship between the exogenous inputs/interventions and the latent variables may follow some complex deterministic mechanisms. In this work, we study the problem of identifiable representation and model learning for latent dynamic systems. The key idea is that we use an inductive bias inspired by controllable canonical forms, which is invariant, sparse, and input dependent by definition. We prove that, for linear or affine nonlinear latent dynamic systems, it is possible to identify the representations up to scaling and determine the models up to some simple transformations. The results have the potential to provide some theoretical guarantees for developing more trustworthy decision-making and control methods for intelligent spacecraft.
An iteration-free approach to excitation harmonization
Sinusoidal excitation is particularly popular for testing structures in the nonlinear regime. Due to the nonlinear behavior and the inevitable feedback of the structure on the exciter, higher harmonics in the applied excitation are generated. This is undesired, because the acquired response may deviate substantially from that of the structure under purely sinusoidal excitation, in particular if one of the higher harmonics engages in resonance. We present a new approach to suppress those higher excitation harmonics and thus the unwanted exciter-structure interaction: Higher harmonics are added to the voltage input to the shaker whose Fourier coefficients are adjusted via feedback control until the excitation is purely sinusoidal. The stability of this method is analyzed for a simplified model; the resulting closed-form expressions are useful, among other things, for selecting an appropriate exciter configuration, including the drive point. A practical procedure for the control design is suggested. The proposed method is validated in virtual and real experiments of internally resonant structures, in the two common configurations of force excitation via a stinger and base excitation. Excellent performance is achieved already when using the same control gains for all harmonics, throughout the tested range of amplitudes and frequencies, even in the strongly nonlinear regime. Compared to the iterative state of the art, it is found that the proposed method is simpler to implement, enables faster testing, and more easily achieves a lower harmonic distortion.
e-Values for Real-Time Residential Electricity Demand Forecast Model Selection
With the growing number of forecasting techniques and the increasing significance of forecast-based operation - particularly in the rapidly evolving energy sector - selecting the most effective forecasting model has become a critical task. Given the dynamic nature of energy forecasting, it is highly advantageous to assess the superiority of forecasting models not only retrospectively but continuously in real-time as new data and evidence become available, while simultaneously providing strong probabilistic guarantees for these decisions. In this work, we show that this can be achieved through the mathematical concept of e-values, which has recently gained massive attention in the field of statistics. It allows for unified construction principles for powerful tests and accurate statistical decisions, which can be evaluated at any chosen time points while maintaining an overall probabilistic error control. We extend the use of e-values by developing a simple persistence approach that dynamically combines input forecasts to generate new fused predictions. To demonstrate the performance of our method, we apply it to electricity demand forecasts based on different artificial-intelligence-based models. Our results indicate that e-values are able to improve the accuracy and reliability of forecasts in a dynamic environment, offering a valuable tool for real-time decision-making in the energy sector.
comment: 25 pages, 9 figures
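To illustrate why e-values can be checked at any time without losing error control, here is a generic sketch of an e-process built from likelihood ratios. It is not the construction used in the paper: the setup (a binary indicator that is 1 whenever a hypothetical model A beats model B at a time step), the alternative probability `p_alt`, and the function names are all assumptions made for illustration.

```python
def e_process(wins, p_null=0.5, p_alt=0.6):
    """Running product of Bernoulli likelihood ratios.

    wins: iterable of 0/1 flags (hypothetical: 1 if model A beat model B).
    Under the null "A wins with probability at most p_null", each factor has
    expectation at most 1, so the running product is an e-process.
    """
    e = 1.0
    path = []
    for x in wins:
        if x:
            e *= p_alt / p_null
        else:
            e *= (1 - p_alt) / (1 - p_null)
        path.append(e)
    return path


def first_rejection(path, alpha=0.05):
    """First time step (1-indexed) at which the e-process reaches 1/alpha."""
    threshold = 1.0 / alpha
    for t, e in enumerate(path, start=1):
        if e >= threshold:
            return t
    return None  # evidence never reached the threshold


print(first_rejection(e_process([1] * 20)))     # A always wins
print(first_rejection(e_process([1, 0] * 50)))  # A and B alternate
```

By Ville's inequality, the probability that such a process ever reaches 1/alpha under the null is at most alpha, which is what permits monitoring after every new observation.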
Fiber Activation by Bipolar Stimulation in Deep Brain Stimulation: A Patient Case Study
Deep Brain Stimulation (DBS) is a therapy widely used for treating the symptoms of neurological disorders. Electrical pulses are chronically delivered in DBS to a disease-specific brain target via a surgically implanted electrode. The stimulating contact configuration, stimulation polarity, as well as amplitude, frequency, and pulse width of the DBS pulse sequence are utilized to optimize the therapeutic effect. In this paper, the utility of therapy individualization by means of patient-specific mathematical modeling is investigated with respect to a specific case of a patient diagnosed with Essential Tremor (ET). Two computational models are compared in their ability to elucidate the impact of DBS stimulation on the dentato-rubrothalamic tract: (i) a conventional model of Volume of Tissue Activated (VTA) and (ii) a well-established neural fiber activation modeling framework known as OSS-DBS. The simulation results are compared with tremor measured in the patient under different DBS settings using a smartphone application. The findings of the study highlight that temporally static VTA models do not adequately describe the differences in the outcomes of bipolar stimulation settings with switched polarity, whereas neural fiber activation models hold potential in this regard. However, it is noted that neither of the investigated models fully accounts for the measured symptom pattern, particularly regarding a bilateral effect produced by unilateral stimulation.
comment: 7 pages, 5 figures, 5 tables
Time-to-Lie: Identifying Industrial Control System Honeypots Using the Internet Control Message Protocol
The convergence of information and operational technology networks has created previously unforeseen security issues. To address these issues, both researchers and practitioners have integrated threat intelligence methods into the security operations of converged networks, with some of the most valuable tools being honeypots that imitate industrial control systems (ICS). However, the development and deployment of such honeypots is a process rich with pitfalls, which can lead to undiagnosed weaknesses in the threat intelligence being gathered. This paper presents a side-channel method of covertly identifying ICS honeypots using the time-to-live (TTL) values of target devices. We show that many ICS honeypots can be readily identified, via minimal interactions, using only basic networking tools. In a study of over 8,000 devices presenting as ICS systems, we detail how our method compares to an existing honeypot detection approach, and outline what our methodology reveals about the current population of live ICS honeypots. In demonstrating our method, this study aims to raise awareness of the viability of the TTL heuristic and the prevalence of its misconfiguration despite its presence in literature.
comment: 11 pages, 2 listings, 5 tables, 6 figures
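As background for the TTL heuristic mentioned above, the sketch below shows the widely used trick of rounding an observed TTL up to a common initial value (64, 128, or 255) to guess the sender's default and the hop count. How the paper turns TTL values into a honeypot verdict is not stated in the abstract, so the comparison in `ttl_mismatch` and all function names are assumptions for illustration only.

```python
# Common default initial TTL values used by IP stacks.
COMMON_INITIAL_TTLS = (64, 128, 255)


def infer_initial_ttl(observed_ttl):
    """Round an observed TTL up to the nearest common initial value."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be in the range 1..255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial


def estimated_hops(observed_ttl):
    """Estimate how many routers decremented the TTL on the way."""
    return infer_initial_ttl(observed_ttl) - observed_ttl


def ttl_mismatch(observed_ttl, expected_initial_ttl):
    """Hypothetical check: does the inferred default differ from the one
    expected for the device the host claims to be?"""
    return infer_initial_ttl(observed_ttl) != expected_initial_ttl


print(infer_initial_ttl(52), estimated_hops(52))
print(ttl_mismatch(52, expected_initial_ttl=255))
```

The observed TTL itself can be read from an ordinary ping reply, which is consistent with the abstract's remark that only basic networking tools are needed.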
Deoxys: A Causal Inference Engine for Unhealthy Node Mitigation in Large-scale Cloud Infrastructure
The presence of unhealthy nodes in cloud infrastructure signals the potential failure of machines, which can significantly impact the availability and reliability of cloud services, resulting in negative customer experiences. Effectively addressing unhealthy node mitigation is therefore vital for sustaining cloud system performance. This paper introduces Deoxys, a causal inference engine tailored to recommending mitigation actions for unhealthy nodes in cloud systems to minimize virtual machine downtime and interruptions during unhealthy events. It employs double machine learning combined with causal forest to produce precise and reliable mitigation recommendations based solely on limited observational data collected from historical unhealthy events. To enhance the causal inference model, Deoxys further incorporates a policy fallback mechanism based on model uncertainty and action overriding mechanisms to (i) improve the reliability of the system, and (ii) strike a good tradeoff between downtime reduction and resource utilization, thereby enhancing the overall system performance. After deploying Deoxys in a large-scale cloud infrastructure at Microsoft, our observations demonstrate that Deoxys significantly reduces average VM downtime by 53% compared to a legacy policy, while leading to a 49.5% lower VM interruption rate. This substantial improvement enhances the reliability and stability of cloud platforms, resulting in a seamless customer experience.
Markov Potential Game with Final-time Reach-Avoid Objectives
We formulate a Markov potential game with final-time reach-avoid objectives by integrating potential game theory with stochastic reach-avoid control. Our focus is on multi-player trajectory planning where players maximize the same multi-player reach-avoid objective: the probability of all participants reaching their designated target states by a specified time, while avoiding collisions with one another. Existing approaches require centralized computation of actions via a global policy, which may have prohibitively expensive communication costs. Instead, we focus on approximations of the global policy via local state feedback policies. First, we adapt the recursive single player reach-avoid value iteration to the multi-player framework with local policies, and show that the same recursion holds on the joint state space. To find each player's optimal local policy, the multi-player reach-avoid value function is projected from the joint state to the local state using the other players' occupancy measures. Then, we propose an iterative best response scheme for the multi-player value iteration to converge to a pure Nash equilibrium. We demonstrate the utility of our approach in finding collision-free policies for multi-player motion planning in simulation.
comment: 8 pages, 2 figures
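The recursive single-player reach-avoid value iteration that the abstract builds on can be sketched on a toy problem. This is only the single-player, final-time recursion, not the paper's multi-player scheme: the one-dimensional grid, the slip probability `p_move`, and the function name are assumptions made for illustration.

```python
def reach_avoid_values(n_states, target, unsafe, horizon, p_move=0.8):
    """Final-time reach-avoid value iteration on a 1-D grid (toy example).

    The returned value at state s is the maximal probability of being in
    `target` at the final time while never occupying a state in `unsafe`.
    Actions: move left, stay, or move right; a move succeeds with
    probability p_move, otherwise the agent stays in place.
    """
    actions = (-1, 0, 1)
    # Terminal condition: indicator of the target set.
    value = [1.0 if s in target else 0.0 for s in range(n_states)]
    for _ in range(horizon):
        new_value = []
        for s in range(n_states):
            if s in unsafe:
                new_value.append(0.0)  # already violated the avoid condition
                continue
            best = 0.0
            for a in actions:
                nxt = min(max(s + a, 0), n_states - 1)  # clip at the grid edges
                expected = p_move * value[nxt] + (1 - p_move) * value[s]
                best = max(best, expected)
            new_value.append(best)
        value = new_value
    return value


values = reach_avoid_values(n_states=4, target={3}, unsafe={0}, horizon=2)
print(values)
```

Starting two cells from the target with two steps available, both moves must succeed, so the value at state 1 is 0.8 * 0.8 = 0.64.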
Accelerating soft-constrained MPC for linear systems through online constraint removal
Optimization-based controllers, such as Model Predictive Control (MPC), have attracted significant research interest due to their intuitive concept, constraint handling capabilities, and natural application to multi-input multi-output systems. However, the computational complexity of solving a receding horizon problem at each time step remains a challenge for the deployment of MPC. This is particularly the case for systems constrained by many inequalities. Recently, we introduced the concept of constraint-adaptive MPC (ca-MPC) to address this challenge for linear systems with hard constraints. In ca-MPC, at each time step, a subset of the constraints is removed from the optimization problem, thereby accelerating the optimization procedure, while resulting in identical closed-loop behavior. The present paper extends this framework to soft-constrained MPC by detecting and removing constraints based on sub-optimal predicted input sequences, which is rather easy for soft-constrained MPC due to the receding horizon principle and the inclusion of slack variables. We will translate these new ideas explicitly to an offset-free output tracking problem. The effectiveness of these ideas is demonstrated on a two-dimensional thermal transport model, showing a three-orders-of-magnitude improvement in online computational time of the MPC scheme.
comment: 6 pages, 5 figures, CDC 2023 conference
Approximate Kalman filtering for large-scale systems with an application to hyperthermia cancer treatments
Accurate state estimates are required for increasingly complex systems, to enable, for example, feedback control. However, available state estimation schemes are not necessarily real-time feasible for certain large-scale systems. Therefore, in this paper we develop a real-time feasible state-estimation scheme for a class of large-scale systems that approximates the steady-state Kalman filter. In particular, we focus on systems where the state-vector is the result of discretizing the spatial domain, as typically seen in Partial Differential Equations. In such cases, the correlations between states in the state-vector often have an intuitive interpretation on the spatial domain, which can be exploited to obtain a significant reduction in computational complexity, while still providing accurate state estimates. We illustrate these strengths of our method through a hyperthermia cancer treatment case study. The results of the case study show significant improvements in the computation time, while simultaneously obtaining good state estimates, when compared to Ensemble Kalman filters and Kalman filters using reduced-order models.
comment: 6 pages, 6 figures, CDC 2022 conference
Constraint Removal for MPC with Performance Preservation and a Hyperthermia Cancer Treatment Case Study
Model predictive control (MPC) is an optimization-based control strategy with broad industrial adoption. Unfortunately, the required computation time to solve the receding-horizon MPC optimization problem can become prohibitively large for many applications with a large number of state constraints. This large number of state constraints can, for instance, originate from spatially discretizing a partial differential equation of which the solution has to satisfy constraints over the full spatial domain. This is particularly the case in MPC for RF-based hyperthermia cancer treatments, which forms a strong motivation for this study. To address this problem, we propose a novel constraint-adaptive MPC framework for linear discrete-time systems. In this framework, we select at each time-step a subset of the state constraints that are included in the optimization problem, thereby reducing the online computational burden. Critically, our framework guarantees the same closed-loop performance, recursive feasibility, and constraint satisfaction properties as the original (non-reduced) MPC scheme. We achieve this result by efficiently exploiting reachable set computations and the MPC cost function. We will demonstrate our novel method using a hyperthermia cancer treatment case study showing a two-orders-of-magnitude improvement in computation time, with identical closed-loop performance as the original (non-reduced) MPC scheme.
comment: 6 pages, 3 figures, CDC 2021 conference
Exploiting Data Centres and Local Energy Communities Synergies for Market Participation
The evolving energy landscape has propelled energy communities to the forefront of modern energy management. However, existing research has yet to explore the potential synergies between data centres and energy communities, necessitating an assessment of their collective capabilities for cost efficiency, waste heat optimisation, and market participation. This paper presents a mixed integer linear programming model to assess the collaborative performance of energy communities, data centres and energy markets. The evaluation focuses on the efficient use of waste heat and the flexibility of job scheduling while minimising system energy costs and maintaining quality of service requirements for data centres. Our results, based on realistic profiles of an energy community and a data centre, showcase significant benefits of these synergies, with a 38% reduction in operating costs and an 87% decrease in heat demand.
comment: Submitted to IEEE PES ISGT Europe 2024
Risk-sensitive Affine Control Synthesis for Stationary LTI Systems
To address deviations from expected performance in stochastic systems, we propose a risk-sensitive control synthesis method to minimize certain risk measures over the limiting stationary distribution. Specifically, we extend Worst-case Conditional Value-at-Risk (W-CVaR) optimization for Linear Time-invariant (LTI) systems to handle nonzero-mean noise and affine controllers, using only the first and second moments of noise, which enhances robustness against model uncertainty. Highlighting the strong coupling between the linear and bias terms of the controller, we reformulate the synthesis problem as a Bilinear Matrix Inequality (BMI), and propose an alternating optimization algorithm with guaranteed convergence. Finally, we demonstrate the numerical performance of our approach in two representative settings, which shows that the proposed algorithm successfully synthesizes risk-sensitive controllers that outperform the naïve LQR baseline.
comment: 8 pages, 4 figures, 2 illustrations
Real-time Vehicle-to-Vehicle Communication Based Network Cooperative Control System through Distributed Database and Multimodal Perception: Demonstrated in Crossroads
The autonomous driving industry is rapidly advancing, with Vehicle-to-Vehicle (V2V) communication systems emerging as a key component of enhanced road safety and traffic efficiency. This paper introduces a novel Real-time Vehicle-to-Vehicle Communication Based Network Cooperative Control System (VVCCS), designed to revolutionize macro-scope traffic planning and collision avoidance in autonomous driving. Implemented on the Quanser Car (Qcar) hardware platform, our system integrates distributed databases into individual autonomous vehicles and an optional central server. We also developed a comprehensive multi-modal perception system with multi-objective tracking and radar sensing. Through a demonstration within a physical crossroad environment, our system showcases its potential to be applied in congested and complex urban environments.
comment: ICICT 2024, 18 pages
LEADS: Lightweight Embedded Assisted Driving System
With the rapid development of electric vehicles, formula races aimed at high school and university students have become more popular than ever as the threshold for design and manufacturing has been lowered. In many cases, we see teams inspired by or directly using toolkits and technologies inherited from standardized commercial vehicles. These architectures are usually overly complicated for amateur applications like the races. In order to improve the efficiency and simplify the development of instrumentation, control, and analysis systems, we propose LEADS (Lightweight Embedded Assisted Driving System), a dedicated solution for such scenarios.
comment: 12 pages
Electric Grid Topology and Admittance Estimation: Quantifying Phasor-based Measurement Requirements
In this paper, we quantify voltage and current phasor-based measurement requirements for the unique identification of the electric grid topology and admittance parameters. Our approach is underpinned by the concept of a rigidity matrix that has been extensively studied in graph rigidity theory. Specifically, we show that the rank of the rigidity matrix is the same as that of a voltage coefficient matrix in a corresponding electric power system. Accordingly, we show that there is a minimum number of measurements required to uniquely identify the admittance matrix and corresponding grid topology. By means of a numerical example on the IEEE 4-node radial network, we demonstrate that our approach is suitable for applications in electric power grids.
SPARC: Prediction-Based Safe Control for Coupled Controllable and Uncontrollable Agents with Conformal Predictions
We investigate the problem of safe control synthesis for systems operating in environments with uncontrollable agents whose dynamics are unknown but coupled with those of the controlled system. This scenario naturally arises in various applications, such as autonomous driving and human-robot collaboration, where the behavior of uncontrollable agents, like pedestrians, cannot be directly controlled but is influenced by the actions of the autonomous vehicle or robot. In this paper, we present SPARC (Safe Prediction-Based Robust Controller for Coupled Agents), a novel framework designed to ensure safe control in the presence of coupled uncontrollable agents. SPARC leverages conformal prediction to quantify uncertainty in data-driven prediction of agent behavior. Particularly, we introduce a joint distribution-based approach to account for the coupled dynamics of the controlled system and uncontrollable agents. By integrating the control barrier function (CBF) technique, SPARC provides provable safety guarantees at a high confidence level. We illustrate our framework with a case study involving an autonomous driving scenario with walking pedestrians.
comment: It's not complete yet
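As background for the conformal prediction step mentioned in the SPARC abstract above, here is a minimal sketch of the standard split conformal quantile: given nonconformity scores from a calibration set, it returns the radius that covers a new score with probability at least 1 - alpha under exchangeability. This is the textbook construction, not SPARC's joint-distribution approach; the function name and the example scores are illustrative assumptions.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Hypothetical helper for illustration. If that rank exceeds n, no finite
    radius gives the requested coverage, so infinity is returned.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


# Assumed calibration scores, e.g. prediction errors of a pedestrian model.
calibration_scores = [float(s) for s in range(19, 0, -1)]
radius = conformal_quantile(calibration_scores, alpha=0.25)
```

A safety filter could then treat every position within `radius` of the predicted agent position as potentially occupied, which is one common way such a bound is combined with a control barrier function.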
Addressing Trust Issues for Vehicle to Grid in Distributed Power Grids Using Blockchains
While blockchain offers inherent security, trust issues among stakeholders in vehicle-to-grid (V2G) applications remain unresolved due to a lack of regulatory frameworks and standardization. Additionally, a tailored decentralized privacy-preserving coordination scheme for blockchain in V2G networks is needed to ensure user privacy and efficient energy transactions. This paper proposes a V2G trading and coordination scheme tailored to the decentralized nature of blockchain as well as the interests of stakeholders, utilizing smart charging points (SCPs) and a Stackelberg game model. Case studies using real-world data from Southern University of Science and Technology demonstrate the efficacy of the proposed scheme in reducing EV charging costs and its potential for supporting auxiliary grid services.
comment: This paper has been accepted by The 14th International Conference on Power and Energy Systems (ICPES 2024)
Certifiably Robust Policies for Uncertain Parametric Environments
We present a data-driven approach for producing policies that are provably robust across unknown stochastic environments. Existing approaches can learn models of a single environment as an interval Markov decision process (IMDP) and produce a robust policy with a probably approximately correct (PAC) guarantee on its performance. However, these are unable to reason about the impact of environmental parameters underlying the uncertainty. We propose a framework based on parametric Markov decision processes (MDPs) with unknown distributions over parameters. We learn and analyse IMDPs for a set of unknown sample environments induced by parameters. The key challenge is then to produce meaningful performance guarantees that combine the two layers of uncertainty: (1) multiple environments induced by parameters with an unknown distribution; (2) unknown induced environments which are approximated by IMDPs. We present a novel approach based on scenario optimisation that yields a single PAC guarantee quantifying the risk level for which a specified performance level can be assured in unseen environments, plus a means to trade off risk and performance. We implement and evaluate our framework using multiple robust policy generation methods on a range of benchmarks. We show that our approach produces tight bounds on a policy's performance with high confidence.
Posterior Sampling-based Online Learning for Episodic POMDPs
Learning in POMDPs is known to be significantly harder than in MDPs. In this paper, we consider the online learning problem for episodic POMDPs with unknown transition and observation models. We propose a Posterior Sampling-based reinforcement learning algorithm for POMDPs (PS4POMDPs), which is much simpler and more implementable compared to state-of-the-art optimism-based online learning algorithms for POMDPs. We show that the Bayesian regret of the proposed algorithm scales as the square root of the number of episodes and is polynomial in the other parameters. In a general setting, the regret scales exponentially in the horizon length $H$, and we show that this is inevitable by providing a lower bound. However, when the POMDP is undercomplete and weakly revealing (a common assumption in the recent literature), we establish a polynomial Bayesian regret bound. We finally propose a posterior sampling algorithm for multi-agent POMDPs, and show it too has sublinear regret.
comment: 41 pages, 9 figures
Mitigating Information Asymmetry in Two-Stage Contracts with Non-Myopic Agents
We consider a Stackelberg game in which a principal (she) establishes a two-stage contract with a non-myopic agent (he) whose type is unknown. The contract takes the form of an incentive function mapping the agent's first-stage action to his second-stage incentive. While the first-stage action reveals the agent's type under truthful play, a non-myopic agent could benefit from portraying a false type in the first stage to obtain a larger incentive in the second stage. The challenge is thus for the principal to design the incentive function so as to induce truthful play. We show that this is only possible with constant, non-reactive incentive functions when the type space is continuous, whereas it can be achieved with reactive functions for discrete types. Additionally, we show that introducing an adjustment mechanism that penalizes inconsistent behavior across both stages allows the principal to design more flexible incentive functions.
comment: To appear in the Proceedings of the 5th IFAC Workshop on Cyber-Physical Human Systems
Robotics
Learning Precise, Contact-Rich Manipulation through Uncalibrated Tactile Skins
While visuomotor policy learning has advanced robotic manipulation, precisely executing contact-rich tasks remains challenging due to the limitations of vision in reasoning about physical interactions. To address this, recent work has sought to integrate tactile sensing into policy learning. However, many existing approaches rely on optical tactile sensors that are either restricted to recognition tasks or require complex dimensionality reduction steps for policy learning. In this work, we explore learning policies with magnetic skin sensors, which are inherently low-dimensional, highly sensitive, and inexpensive to integrate with robotic platforms. To leverage these sensors effectively, we present the Visuo-Skin (ViSk) framework, a simple approach that uses a transformer-based policy and treats skin sensor data as additional tokens alongside visual information. Evaluated on four complex real-world tasks involving credit card swiping, plug insertion, USB insertion, and bookshelf retrieval, ViSk significantly outperforms both vision-only and optical tactile sensing based policies. Further analysis reveals that combining tactile and visual modalities enhances policy performance and spatial generalization, achieving an average improvement of 27.5% across tasks. https://visuoskin.github.io/
Minimum-Violation Temporal Logic Planning for Heterogeneous Robots under Robot Skill Failures
In this paper, we consider teams of robots with heterogeneous skills (e.g., sensing and manipulation) tasked with collaborative missions described by Linear Temporal Logic (LTL) formulas. These LTL-encoded tasks require robots to apply their skills to specific regions and objects in a temporal and logical order. While existing temporal logic planning algorithms can synthesize correct-by-construction paths, they typically lack reactivity to unexpected failures of robot skills, which can compromise mission performance. This paper addresses this challenge by proposing a reactive LTL planning algorithm that adapts to unexpected failures during deployment. Specifically, the proposed algorithm reassigns sub-tasks to robots based on their functioning skills and locally revises team plans to accommodate these new assignments and ensure mission completion. The main novelty of the proposed algorithm is its ability to handle cases where mission completion becomes impossible due to limited functioning robots. Instead of reporting mission failure, the algorithm strategically prioritizes the most crucial sub-tasks and locally revises the team's plans, as per user-specified priorities, to minimize mission violations. We provide theoretical conditions under which the proposed framework computes the minimum violation task reassignments and team plans. We provide numerical and hardware experiments to demonstrate the efficiency of the proposed method.
DyPNIPP: Predicting Environment Dynamics for RL-based Robust Informative Path Planning
Informative path planning (IPP) is an important planning paradigm for various real-world robotic applications such as environment monitoring. IPP involves planning a path that can learn an accurate belief of the quantity of interest, while adhering to planning constraints. Traditional IPP methods typically require high computation time during execution, giving rise to reinforcement learning (RL) based IPP methods. However, the existing RL-based methods do not consider spatio-temporal environments which involve their own challenges due to variations in environment characteristics. In this paper, we propose DyPNIPP, a robust RL-based IPP framework, designed to operate effectively across spatio-temporal environments with varying dynamics. To achieve this, DyPNIPP incorporates domain randomization to train the agent across diverse environments and introduces a dynamics prediction model to capture and adapt the agent actions to specific environment dynamics. Our extensive experiments in a wildfire environment demonstrate that DyPNIPP outperforms existing RL-based IPP algorithms by significantly improving robustness and performing across diverse environment conditions.
comment: 8 pages, 4 figures, submitted to IEEE RA-L
Risk-Averse Model Predictive Control for Racing in Adverse Conditions
Model predictive control (MPC) algorithms can be sensitive to model mismatch when used in challenging nonlinear control tasks. In particular, the performance of MPC for vehicle control at the limits of handling suffers when the underlying model overestimates the vehicle's capabilities. In this work, we propose a risk-averse MPC framework that explicitly accounts for uncertainty over friction limits and tire parameters. Our approach leverages a sample-based approximation of an optimal control problem with a conditional value at risk (CVaR) constraint. This sample-based formulation enables planning with a set of expressive vehicle dynamics models using different tire parameters. Moreover, this formulation enables efficient numerical resolution via sequential quadratic programming and GPU parallelization. Experiments on a Lexus LC 500 show that risk-averse MPC unlocks reliable performance, while a deterministic baseline that plans using a single dynamics model may lose control of the vehicle in adverse road conditions.
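To make the conditional value at risk (CVaR) quantity in the abstract above concrete, the following is a minimal sketch of a sample-based CVaR estimate: the mean of the worst `alpha` fraction of sampled losses. It illustrates the risk measure only, not the paper's constrained optimal control formulation; the function name, the tail-fraction convention for `alpha`, and the sample values are assumptions (some texts parameterize CVaR by the confidence level 1 - alpha instead).

```python
import math

import numpy as np


def empirical_cvar(losses, alpha):
    """Average the worst alpha-fraction of sampled losses (higher is worse).

    Hypothetical helper for illustration; alpha is the tail fraction in (0, 1].
    """
    losses = np.sort(np.asarray(losses, dtype=float))
    k = max(1, math.ceil(alpha * losses.size))  # number of tail samples
    return float(losses[-k:].mean())


# Assumed losses, e.g. constraint violation under ten sampled tire models.
sampled_losses = [3.0, 1.0, 10.0, 7.0, 2.0, 9.0, 4.0, 8.0, 6.0, 5.0]
tail_risk = empirical_cvar(sampled_losses, alpha=0.5)
mean_loss = empirical_cvar(sampled_losses, alpha=1.0)
```

With `alpha=1.0` the estimate reduces to the plain mean, so shrinking `alpha` moves the planner from risk-neutral toward worst-case behavior over the sampled models.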
Impact of 3D LiDAR Resolution in Graph-based SLAM Approaches: A Comparative Study
Simultaneous Localization and Mapping (SLAM) is a key component of autonomous systems operating in environments that require a consistent map for reliable localization. SLAM has been a widely studied topic for decades, with most of the solutions being camera or LiDAR based. Early LiDAR-based approaches primarily relied on 2D data, whereas more recent frameworks use 3D data. In this work, we survey recent 3D LiDAR-based Graph-SLAM methods in urban environments, aiming to compare their strengths, weaknesses, and limitations. Additionally, we evaluate their robustness with respect to the LiDAR resolution, namely 64 vs. 128 channels. Regarding SLAM methods, we evaluate SC-LeGO-LOAM, SC-LIO-SAM, Cartographer, and HDL-Graph on real-world urban environments using the KITTI odometry dataset (a 64-channel LiDAR only) and a new dataset (AUTONOMOS-LABS). The latter dataset, collected using instrumented vehicles driving in a suburban area of Berlin, comprises both 64- and 128-channel LiDARs. The experimental results are reported in terms of quantitative metrics and complemented by qualitative maps.
comment: This work has been accepted for publication in ROBOT24
Towards Map-Agnostic Policies for Adaptive Informative Path Planning
Robots are frequently tasked with gathering relevant sensor data in unknown terrains. A key challenge for classical path planning algorithms used for autonomous information gathering is adaptively replanning paths online as the terrain is explored, given limited onboard compute resources. Recently, learning-based approaches have emerged that train planning policies offline and enable computationally efficient online replanning by performing policy inference. These approaches are designed and trained for terrain monitoring missions assuming a single specific map representation, which limits their applicability to different terrains. To address these issues, we propose a novel formulation of the adaptive informative path planning problem unified across different map representations, enabling training and deploying planning policies in a larger variety of monitoring missions. Experimental results validate that our novel formulation easily integrates with classical non-learning-based planning approaches while maintaining their performance. Our trained planning policy performs similarly to state-of-the-art map-specifically trained policies. We validate our learned policy on unseen real-world terrain datasets.
comment: 8 pages, 4 figures
Layered LA-MAPF: a decomposition of large agent MAPF instance to accelerate solving without compromising solvability
Multi-Agent Path Finding (MAPF) has been widely studied in recent years. However, most existing MAPF algorithms assume that an agent occupies only a single grid cell in a grid-based map. This assumption limits their applicability in many real-world domains where agents have geometric shapes, rather than being point-like. Such agents, which can occupy multiple cells simultaneously, are referred to as "large" agents. When considering the shape and size of agents in MAPF, the computational complexity increases significantly as the number of agents grows, primarily due to the increased overhead in conflict detection between geometric agents. In this paper, we propose two types of subproblems for the LA-MAPF (Large-Agent MAPF) problem: cluster (which has no constraints on the order of solution) and level (which imposes constraints on the solution order). We introduce Layered LA-MAPF, a method that decomposes a MAPF instance involving geometric agents into clusters, and then further decomposes each cluster into levels. This approach aims to reduce time complexity when solving LA-MAPF problems. Our results demonstrate the performance of our method as the number of agents increases across various maps, and how it accelerates LA-MAPF methods, such as LA-CBS and LA-LaCAM. Experiments show that our LA-MAPF method with instance decomposition halves the time cost (reducing from an average of 40s to 20s) and triples the success rate (from an average of 0.27 to 0.80) in finding a solution within 60 seconds. To facilitate further research, we have made the source code for Layered LA-MAPF publicly available at https://github.com/JoeYao-bit/LayeredMAPF/algorithm/LA-MAPF.
Miniature magneto-oscillatory wireless sensor for magnetic field and gradient measurements
Magneto-oscillatory devices have been recently developed as very potent wireless miniature position trackers and sensors with exceptional accuracy and sensing distance for surgical and robotic applications. However, it is still unclear to what extent a mechanically resonating sub-millimeter magnet interacts with external magnetic fields or gradients, which induce frequency shifts of sub-mHz to several Hz and therefore affect the sensing accuracy. Here, we investigate this effect experimentally on a cantilever-based magneto-oscillatory wireless sensor (MOWS) and build an analytical model concerning magnetic and mechanical interactions. The millimeter-scale MOWS is capable of detecting magnetic fields with sub-uT resolution to at least +/- 5 mT, and simultaneously detects magnetic field gradients with a resolution of 65 uT/m to at least +/- 50 mT/m. The magnetic field sensitivity allows direct calculation of mechanical device properties, and by rotation, individual contributions of the magnetic field and gradient can be analyzed. The derived model is general and can be applied to other magneto-oscillatory systems interacting with magnetic environments.
comment: Main text: 7 pages with figures; Supplementary materials 6 pages with figures
Magneto-oscillatory localization for small-scale robots
Magnetism is widely used for the wireless localization and actuation of robots and devices for medical procedures. However, current static magnetic localization methods require large magnets and are limited to only five degrees of freedom due to a fundamental constraint of the rotational symmetry around the magnetic axis. We present the small-scale magneto-oscillatory localization (SMOL) method, which is capable of wirelessly localizing a millimeter-scale tracker with full six degrees of freedom in deep biological tissues. The SMOL device uses the temporal oscillation of a mechanically resonant cantilever with a magnetic dipole to break the rotational symmetry, and exploits the frequency response to achieve a high signal-to-noise ratio with sub-millimeter accuracy over a large distance of up to 12 centimeters and quasi-continuous refresh rates up to 200 Hz. Integration into real-time closed-loop controlled robots and minimally invasive surgical tools is demonstrated to reveal the vast potential of the SMOL method.
comment: Pages 1-35 main text (incl. 4 figures), pages 36-57 supplementary materials
E-3DGS: Gaussian Splatting with Exposure and Motion Events
Estimating Neural Radiance Fields (NeRFs) from images captured under optimal conditions has been extensively explored in the vision community. However, robotic applications often face challenges such as motion blur, insufficient illumination, and high computational overhead, which adversely affect downstream tasks like navigation, inspection, and scene visualization. To address these challenges, we propose E-3DGS, a novel event-based approach that partitions events into motion (from camera or object movement) and exposure (from camera exposure), using the former to handle fast-motion scenes and using the latter to reconstruct grayscale images for high-quality training and optimization of event-based 3D Gaussian Splatting (3DGS). We introduce a novel integration of 3DGS with exposure events for high-quality reconstruction of explicit scene representations. Our versatile framework can operate on motion events alone for 3D reconstruction, enhance quality using exposure events, or adopt a hybrid mode that balances quality and effectiveness by optimizing with initial exposure events followed by high-speed motion events. We also introduce EME-3D, a real-world 3D dataset with exposure events, motion events, camera calibration parameters, and sparse point clouds. Our method is faster and delivers better reconstruction quality than event-based NeRF while being more cost-effective than NeRF methods that combine event and RGB data by using a single event sensor. By combining motion and exposure events, E-3DGS sets a new benchmark for event-based 3D reconstruction with robust performance in challenging conditions and lower hardware demands. The source code and dataset will be available at https://github.com/MasterHow/E-3DGS.
comment: The source code and dataset will be available at https://github.com/MasterHow/E-3DGS
Proleptic Temporal Ensemble for Improving the Speed of Robot Tasks Generated by Imitation Learning
Imitation learning, which enables robots to learn behaviors from demonstrations by non-experts, has emerged as a promising solution for generating robot motions in such environments. The imitation learning based robot motion generation method, however, has the drawback of being limited by the demonstrator's task execution speed. This paper presents a novel temporal ensemble approach applied to imitation learning algorithms, allowing for execution of future actions. The proposed method leverages existing demonstration data and pretrained policies, offering the advantages of requiring no additional computation and being easy to implement. The algorithm's performance was validated through real-world experiments involving robotic block color sorting, demonstrating up to a 3x increase in task execution speed while maintaining a high success rate compared to the action chunking with transformer method. This study highlights the potential for significantly improving the performance of imitation learning-based policies, which were previously limited by the demonstrator's speed. It is expected to contribute substantially to future advancements in autonomous object manipulation technologies aimed at enhancing productivity.
comment: This paper has been submitted to the Journal of Korea Robotics Society and is currently under review
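For context on the baseline named in the abstract above, here is a minimal sketch of the conventional temporal ensemble used with action chunking: several overlapping chunks each predict an action for the current timestep, and these predictions are averaged with exponentially decaying weights, with the oldest prediction weighted most. This is the baseline idea only, not the proleptic variant the paper proposes; the function name, the decay parameter `m`, and the sample predictions are illustrative assumptions.

```python
import numpy as np


def temporal_ensemble(predictions, m=0.1):
    """Blend predictions made for one timestep, ordered oldest first.

    Hypothetical helper for illustration. Uses weights w_i = exp(-m * i),
    normalized to sum to one, so index 0 (oldest) gets the largest weight.
    """
    preds = np.asarray(predictions, dtype=float)
    weights = np.exp(-m * np.arange(len(preds)))
    weights /= weights.sum()
    return weights @ preds


# Two assumed 2-D action predictions for the same timestep.
overlapping = [[0.0, 0.0], [2.0, 4.0]]
uniform_blend = temporal_ensemble(overlapping, m=0.0)
decayed_blend = temporal_ensemble(overlapping, m=1.0)
```

Setting `m=0.0` gives a plain average, while larger values of `m` pull the blended action toward the earliest prediction.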
FlightAR: AR Flight Assistance Interface with Multiple Video Streams and Object Detection Aimed at Immersive Drone Control
The swift advancement of unmanned aerial vehicle (UAV) technologies necessitates new standards for developing human-drone interaction (HDI) interfaces. Most interfaces for HDI, especially first-person view (FPV) goggles, limit the operator's ability to obtain information from the environment. This paper presents a novel interface, FlightAR, that integrates augmented reality (AR) overlays of UAV first-person view (FPV) and bottom camera feeds with head-mounted display (HMD) to enhance the pilot's situational awareness. Using FlightAR, the system provides pilots not only with a video stream from several UAV cameras simultaneously, but also the ability to observe their surroundings in real time. User evaluation with NASA-TLX and UEQ surveys showed low physical demand ($\mu=1.8$, $SD = 0.8$) and good performance ($\mu=3.4$, $SD = 0.8$), proving better user assessments in comparison with baseline FPV goggles. Participants also rated the system highly for stimulation ($\mu=2.35$, $SD = 0.9$), novelty ($\mu=2.1$, $SD = 0.9$) and attractiveness ($\mu=1.97$, $SD = 1$), indicating positive user experiences. These results demonstrate the potential of the system to improve UAV piloting experience through enhanced situational awareness and intuitive control. The code is available here: https://github.com/Sautenich/FlightAR
comment: Manuscript accepted in IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024)
Direction-Constrained Control for Efficient Physical Human-Robot Interaction under Hierarchical Tasks
This paper proposes a control method to address the physical Human-Robot Interaction (pHRI) challenge in the context of hierarchical tasks. A common approach to managing hierarchical tasks is Hierarchical Quadratic Programming (HQP), which, however, cannot be directly applied to human interaction due to its allowance of arbitrary velocity direction adjustments. To resolve this limitation, we introduce the concept of directional constraints and develop a direction-constrained optimization algorithm to handle the nonlinearities induced by these constraints. The algorithm solves two sub-problems, minimizing the error and minimizing the deviation angle, in parallel, and combines the results of the two sub-problems to produce a final optimal outcome. The mutual influence between these two sub-problems is analyzed to determine the best parameter for combination. Additionally, the velocity objective in our control framework is computed using a variable admittance controller. Traditional admittance control does not account for constraints. To address this issue, we propose a variable admittance control method to adjust control objectives dynamically. The method helps reduce the deviation between robot velocity and human intention at the constraint boundaries, thereby enhancing interaction efficiency. We evaluate the proposed method in scenarios where a human operator physically interacts with a 7-degree-of-freedom robotic arm. The results highlight the importance of incorporating directional constraints in pHRI for hierarchical tasks. Compared to existing methods, our approach generates smoother robotic trajectories during interaction while avoiding interaction delays at the constraint boundaries.
EnvBridge: Bridging Diverse Environments with Cross-Environment Knowledge Transfer for Embodied AI
In recent years, Large Language Models (LLMs) have demonstrated high reasoning capabilities, drawing attention for their applications as agents in various decision-making processes. One notably promising application of LLM agents is robotic manipulation. Recent research has shown that LLMs can generate text planning or control code for robots, providing substantial flexibility and interaction capabilities. However, these methods still face challenges in terms of flexibility and applicability across different environments, limiting their ability to adapt autonomously. Current approaches typically fall into two categories: those relying on environment-specific policy training, which restricts their transferability, and those generating code actions based on fixed prompts, which leads to diminished performance when confronted with new environments. These limitations significantly constrain the generalizability of agents in robotic manipulation. To address these limitations, we propose a novel method called EnvBridge. This approach involves the retention and transfer of successful robot control codes from source environments to target environments. EnvBridge enhances the agent's adaptability and performance across diverse settings by leveraging insights from multiple environments. Notably, our approach alleviates environmental constraints, offering a more flexible and generalizable solution for robotic manipulation tasks. We validated the effectiveness of our method using robotic manipulation benchmarks: RLBench, MetaWorld, and CALVIN. Our experiments demonstrate that LLM agents can successfully leverage diverse knowledge sources to solve complex tasks. Consequently, our approach significantly enhances the adaptability and robustness of robotic manipulation agents in planning across diverse environments.
Distribution of Responsibility During the Usage of AI-Based Exoskeletons for Upper Limb Rehabilitation IROS 2022
The ethical issues concerning the AI-based exoskeletons used in healthcare have already been studied literally rather than technically. How the ethical guidelines can be integrated into the development process has not been widely studied. However, this is one of the most important topics which should be studied more in real-life applications. Therefore, in this paper we highlight one ethical concern in the context of an exoskeleton used to train a user to perform a gesture: during the interaction between the exoskeleton, patient and therapist, how is the responsibility for decision making distributed? Based on the outcome of this, we will discuss how to integrate ethical guidelines into the development process of an AI-based exoskeleton. The discussion is based on a case study: AiBle. The different technical factors affecting the rehabilitation results and the human-machine interaction for AI-based exoskeletons are identified and discussed in this paper in order to better apply the ethical guidelines during the development of AI-based exoskeletons.
comment: Robot Trust for Symbiotic Societies (RTSS) at IROS 2022
Pedestrian motion prediction evaluation for urban autonomous driving
Pedestrian motion prediction is a key part of the modular autonomous driving pipeline, ensuring safe, accurate, and timely awareness of human agents' possible future trajectories. The autonomous vehicle can use this information to prevent possible accidents and create a comfortable and pleasant driving experience for passengers and pedestrians. A wealth of research has been done on the topic by authors from robotics, computer vision, intelligent transportation systems, and other fields. However, a relatively unexplored angle is the integration of state-of-the-art solutions into existing autonomous driving stacks and their evaluation in real-life conditions rather than on sanitized datasets. We analyze selected publications with open-source implementations and provide a perspective obtained by integrating them into an existing autonomous driving framework, Autoware Mini, and performing experiments in natural urban conditions in Tartu, Estonia, to determine the value of traditional motion prediction metrics. This perspective should be valuable to any autonomous driving or robotics engineer looking for the real-world performance of existing state-of-the-art pedestrian motion prediction methods. The code with instructions on accessing the dataset is available at https://github.com/dmytrozabolotnii/autoware_mini.
comment: 7 pages, 2 figures, 4 tables. This work has been submitted to the IEEE for possible publication
Guiding Reinforcement Learning with Incomplete System Dynamics
Model-free reinforcement learning (RL) is inherently a reactive method, operating under the assumption that it starts with no prior knowledge of the system and entirely depends on trial-and-error for learning. This approach faces several challenges, such as poor sample efficiency, generalization, and the need for well-designed reward functions to guide learning effectively. On the other hand, controllers based on complete system dynamics do not require data. This paper addresses the intermediate situation where there is not enough model information for complete controller design, but there is enough to suggest that a model-free approach is not the best approach either. By carefully decoupling known and unknown information about the system dynamics, we obtain an embedded controller guided by our partial model and thus improve the learning efficiency of an RL-enhanced approach. A modular design allows us to deploy mainstream RL algorithms to refine the policy. Simulation results show that our method significantly improves sample efficiency compared with standard RL methods on continuous control tasks, and also offers enhanced performance over traditional control approaches. Experiments on a real ground vehicle also validate the performance of our method, including generalization and robustness.
Combining Ontological Knowledge and Large Language Model for User-Friendly Service Robots IROS2024
Lifestyle support through robotics is an increasingly promising field, with expectations for robots to take over or assist with chores like floor cleaning, table setting and clearing, and fetching items. The growth of AI, particularly foundation models, such as large language models (LLMs) and visual language models (VLMs), is significantly shaping this sector. LLMs, by facilitating natural interactions and providing vast general knowledge, are proving invaluable for robotic tasks. This paper zeroes in on the benefits of LLMs for "bring-me" tasks, where robots fetch specific items for users, often based on vague instructions. Our previous efforts utilized an ontology extended to handle environmental data to decipher such vagueness, but faced limitations when unresolvable ambiguities required user intervention for clarity. Here, we enhance our approach by integrating LLMs for providing additional commonsense knowledge, pairing it with ontological data to mitigate the issue of hallucinations and reduce the need for user queries, thus improving system usability. We present a system that merges these knowledge bases and assess its efficacy on "bring-me" tasks, aiming to provide a more seamless and efficient robotic assistance experience.
comment: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2024)
Sample-Efficient Curriculum Reinforcement Learning for Complex Reward Functions
Reinforcement learning (RL) shows promise in control problems, but its practical application is often hindered by the complexity arising from intricate reward functions with constraints. While the reward hypothesis suggests these competing demands can be encapsulated in a single scalar reward function, designing such functions remains challenging. Building on existing work, we start by formulating preferences over trajectories to derive a realistic reward function that balances goal achievement with constraint satisfaction in the application of mobile robotics with dynamic obstacles. To mitigate reward exploitation in such complex settings, we propose a novel two-stage reward curriculum combined with a flexible replay buffer that adaptively samples experiences. Our approach first learns on a subset of rewards before transitioning to the full reward, allowing the agent to learn trade-offs between objectives and constraints. After transitioning to a new stage, our method continues to make use of past experiences by updating their rewards for sample-efficient learning. We investigate the efficacy of our approach in robot navigation tasks and demonstrate superior performance compared to baselines in terms of true reward achievement and task completion, underlining its effectiveness.
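The reward-updating step described above can be sketched as follows. This is a minimal illustration, assuming each stored transition keeps its individual reward components so that the scalar reward can be recomputed after a stage change. The class name `RelabelingBuffer`, the component names `goal` and `constraint`, and the weights are hypothetical and not taken from the paper; the adaptive sampling scheme is not modeled here (sampling is uniform).

```python
import random
from dataclasses import dataclass, field


@dataclass
class RelabelingBuffer:
    """Replay buffer that keeps per-component rewards (hypothetical design)."""
    weights: dict                       # reward weights active in the current stage
    storage: list = field(default_factory=list)

    def add(self, state, action, components, next_state):
        # Store the raw reward components, not the scalar reward.
        self.storage.append((state, action, dict(components), next_state))

    def set_stage(self, weights):
        # Changing stage only swaps the weights; old transitions stay usable.
        self.weights = dict(weights)

    def sample(self, batch_size, rng=random):
        batch = rng.sample(self.storage, min(batch_size, len(self.storage)))
        # Scalar rewards are recomputed at sampling time with the current weights.
        return [
            (s, a, sum(self.weights.get(k, 0.0) * v for k, v in c.items()), s2)
            for s, a, c, s2 in batch
        ]


# Stage 1 uses a subset of the reward; stage 2 uses the full reward.
buf = RelabelingBuffer(weights={"goal": 1.0})
buf.add(0, 1, {"goal": 1.0, "constraint": -0.5}, 1)
stage1_reward = buf.sample(1)[0][2]
buf.set_stage({"goal": 1.0, "constraint": 1.0})
stage2_reward = buf.sample(1)[0][2]
```

Recomputing the reward at sampling time, rather than rewriting stored values, is one simple way to keep past experience consistent with the current stage.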
Deep-Sea A*+: An Advanced Path Planning Method Integrating Enhanced A* and Dynamic Window Approach for Autonomous Underwater Vehicles
As terrestrial resources become increasingly depleted, the demand for deep-sea resource exploration has intensified. However, the extreme conditions in the deep-sea environment pose significant challenges for underwater operations, necessitating the development of robust detection robots. In this paper, we propose an advanced path planning methodology that integrates an improved A* algorithm with the Dynamic Window Approach (DWA). By optimizing the search direction of the traditional A* algorithm and introducing an enhanced evaluation function, our improved A* algorithm accelerates path searching and reduces computational load. Additionally, the path-smoothing process has been refined to improve continuity and smoothness, minimizing sharp turns. This method also integrates global path planning with local dynamic obstacle avoidance via DWA, improving the real-time response of underwater robots in dynamic environments. Simulation results demonstrate that our proposed method surpasses the traditional A* algorithm in terms of path smoothness, obstacle avoidance, and real-time performance. The robustness of this approach in complex environments with both static and dynamic obstacles highlights its potential in autonomous underwater vehicle (AUV) navigation and obstacle avoidance.
comment: Accepted by 2024 International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE 2024)
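For readers unfamiliar with the global planning stage, the following is a minimal sketch of plain 8-connected A* on an occupancy grid. It is the textbook baseline, not the improved A* of the paper: the optimized search direction, the enhanced evaluation function, path smoothing, and the DWA local planner are not reproduced. The function name `astar` and the grid encoding (1 = obstacle) are assumptions made for illustration.

```python
import heapq
import math


def astar(grid, start, goal):
    """Plain 8-connected A* on a 2D occupancy grid (1 = obstacle)."""
    rows, cols = len(grid), len(grid[0])
    moves = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

    def h(cell):
        # Euclidean distance never overestimates the cost with these step lengths.
        return math.hypot(cell[0] - goal[0], cell[1] - goal[1])

    open_heap = [(h(start), 0.0, start)]
    g_cost = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue
        if cell == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        closed.add(cell)
        for dr, dc in moves:
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            new_g = g + math.hypot(dr, dc)
            if new_g < g_cost.get(nxt, math.inf):
                g_cost[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))
    return None  # goal unreachable
```

Note that this sketch allows diagonal moves past obstacle corners; a real planner would usually forbid or penalize them.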
Fast State-of-Health Estimation Method for Lithium-ion Battery using Sparse Identification of Nonlinear Dynamics
Lithium-ion batteries (LIBs) are utilized as a major energy source in various fields because of their high energy density and long lifespan. During repeated charging and discharging, the degradation of LIBs, which reduces their maximum power output and operating time, is a pivotal issue. This degradation can affect not only battery performance but also the safety of the system. Therefore, it is essential to accurately estimate the state-of-health (SOH) of the battery in real time. To address this problem, we propose a fast SOH estimation method that utilizes the sparse identification of nonlinear dynamics (SINDy) algorithm. SINDy can discover the governing equations of target systems from limited data, assuming that only a few functions dominate the behavior of the system. To select the states of the degradation model, correlation analysis is suggested. Using SINDy and correlation analysis, we can obtain a data-driven SOH model that improves the interpretability of the system. To validate the feasibility of the proposed method, the SOH estimation performance and the computation time are evaluated by comparing it with various machine learning algorithms.
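The sparse regression at the core of SINDy can be illustrated with sequentially thresholded least squares, the solver commonly associated with the method. The sketch below is an assumption-laden toy: the system dx/dt = -2x, the candidate library [1, x, x^2], the threshold, and the function name `stlsq` are chosen for illustration and have nothing to do with the battery data or the degradation model of the paper.

```python
import numpy as np


def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: find sparse xi with theta @ xi ~ dxdt."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit only the library terms that survived thresholding.
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi


# Toy data from dx/dt = -2 x (an assumed example system).
t = np.linspace(0.0, 2.0, 200)
x = 3.0 * np.exp(-2.0 * t)
dxdt = -2.0 * x                                       # exact derivative, for clarity
theta = np.column_stack([np.ones_like(x), x, x**2])   # candidate library [1, x, x^2]
xi = stlsq(theta, dxdt)                               # one coefficient per library term
```

With noisy measurements the derivative would have to be estimated numerically, and the threshold becomes the main tuning parameter.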
DiffusionSeeder: Seeding Motion Optimization with Diffusion for Rapid Motion Planning
Running optimization across many parallel seeds leveraging GPU compute has relaxed the need for a good initialization, but this can fail if the problem is highly non-convex, as all seeds could get stuck in local minima. One such setting is collision-free motion optimization for robot manipulation, where optimization converges quickly on easy problems but struggles in obstacle-dense environments (e.g., a cluttered cabinet or table). In these situations, graph-based planning algorithms are used to obtain seeds, resulting in significant slowdowns. We propose DiffusionSeeder, a diffusion-based approach that generates trajectories to seed motion optimization for rapid robot motion planning. DiffusionSeeder takes the initial depth image observation of the scene and generates high-quality, multi-modal trajectories that are then fine-tuned with a few iterations of motion optimization. We integrate DiffusionSeeder to generate the seed trajectories for cuRobo, a GPU-accelerated motion optimization method, which results in a 12x speedup on average, and a 36x speedup for more complicated problems, while achieving a 10% higher success rate in partially observed simulation environments. Our results show the effectiveness of using diverse solutions from a learned diffusion model. Physical experiments on a Franka robot demonstrate the sim2real transfer of DiffusionSeeder to the real robot, with an average success rate of 86% and a planning time of 26ms, improving on cuRobo with a 51% higher success rate while also being 2.5x faster.
DARE: Diffusion Policy for Autonomous Robot Exploration
Autonomous robot exploration requires a robot to efficiently explore and map unknown environments. Compared to conventional methods that can only optimize paths based on the current robot belief, learning-based methods show the potential to achieve improved performance by drawing on past experiences to reason about unknown areas. In this paper, we propose DARE, a novel generative approach that leverages diffusion models trained on expert demonstrations, which can explicitly generate an exploration path through one-time inference. We build DARE upon an attention-based encoder and a diffusion policy model, and introduce ground truth optimal demonstrations for training to learn better patterns for exploration. The trained planner can reason about the partial belief to recognize the potential structure in unknown areas and consider these areas during path planning. Our experiments demonstrate that DARE achieves on-par performance with both conventional and learning-based state-of-the-art exploration planners, as well as good generalizability in both simulations and real-life scenarios.
SERN: Simulation-Enhanced Realistic Navigation for Multi-Agent Robotic Systems in Contested Environments ICRA 2025
The increasing deployment of autonomous systems in complex environments necessitates efficient communication and task completion among multiple agents. This paper presents SERN (Simulation-Enhanced Realistic Navigation), a novel framework integrating virtual and physical environments for real-time collaborative decision-making in multi-robot systems. SERN addresses key challenges in asset deployment and coordination through a bi-directional communication framework using the AuroraXR ROS Bridge. Our approach advances the SOTA through accurate real-world representation in virtual environments using Unity high-fidelity simulator; synchronization of physical and virtual robot movements; efficient ROS data distribution between remote locations; and integration of SOTA semantic segmentation for enhanced environmental perception. Our evaluations show a 15% to 24% improvement in latency and up to a 15% increase in processing efficiency compared to traditional ROS setups. Real-world and virtual simulation experiments with multiple robots demonstrate synchronization accuracy, achieving less than 5 cm positional error and under 2-degree rotational error. These results highlight SERN's potential to enhance situational awareness and multi-agent coordination in diverse, contested environments.
comment: Under Review for ICRA 2025
QuasiNav: Asymmetric Cost-Aware Navigation Planning with Constrained Quasimetric Reinforcement Learning ICRA 2025
Autonomous navigation in unstructured outdoor environments is inherently challenging due to the presence of asymmetric traversal costs, such as varying energy expenditures for uphill versus downhill movement. Traditional reinforcement learning methods often assume symmetric costs, which can lead to suboptimal navigation paths and increased safety risks in real-world scenarios. In this paper, we introduce QuasiNav, a novel reinforcement learning framework that integrates quasimetric embeddings to explicitly model asymmetric costs and guide efficient, safe navigation. QuasiNav formulates the navigation problem as a constrained Markov decision process (CMDP) and employs quasimetric embeddings to capture directionally dependent costs, allowing for a more accurate representation of the terrain. This approach is combined with adaptive constraint tightening within a constrained policy optimization framework to dynamically enforce safety constraints during learning. We validate QuasiNav across three challenging navigation scenarios (undulating terrains, asymmetric hill traversal, and directionally dependent terrain traversal), demonstrating its effectiveness in both simulated and real-world environments. Experimental results show that QuasiNav significantly outperforms conventional methods, achieving higher success rates, improved energy efficiency, and better adherence to safety constraints.
comment: Under Review for ICRA 2025
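To make the notion of an asymmetric traversal cost concrete, here is a small hand-written example of a direction-dependent cost: horizontal distance plus a penalty that applies only to elevation gain. It satisfies the triangle inequality and gives zero cost from a point to itself, but it is not symmetric. The functional form, the name `traversal_cost`, and the `climb_weight` parameter are assumptions for illustration; QuasiNav learns quasimetric embeddings rather than using a fixed formula like this.

```python
import math


def traversal_cost(a, b, climb_weight=2.0):
    """Asymmetric cost of moving from a=(x, y, z) to b=(x, y, z).

    Horizontal distance plus a penalty for elevation gain only (assumed form).
    """
    horizontal = math.hypot(b[0] - a[0], b[1] - a[1])
    climb = max(0.0, b[2] - a[2])
    return horizontal + climb_weight * climb


low, high = (0.0, 0.0, 0.0), (3.0, 4.0, 1.0)
uphill = traversal_cost(low, high)     # horizontal 5 plus climb penalty 2 * 1
downhill = traversal_cost(high, low)   # horizontal 5, no climb penalty
```

A planner that assumed `uphill == downhill` would underestimate the cost of one direction, which is the failure mode the abstract attributes to symmetric-cost methods.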
Benchmarking Smoothness and Reducing High-Frequency Oscillations in Continuous Control Policies IROS 2024
Reinforcement learning (RL) policies are prone to high-frequency oscillations, which are especially undesirable when deploying to hardware in the real world. In this paper, we identify, categorize, and compare methods from the literature that aim to mitigate high-frequency oscillations in deep RL. We define two broad classes: loss regularization and architectural methods. At their core, these methods incentivize learning a smooth mapping, such that nearby states in the input space produce nearby actions in the output space. We present benchmarks in terms of policy performance and control smoothness on traditional RL environments from Gymnasium and a complex manipulation task, as well as three robotics locomotion tasks that include deployment and evaluation with real-world hardware. Finally, we also propose hybrid methods that combine elements from both loss regularization and architectural methods. We find that the best-performing hybrid outperforms other methods, and improves control smoothness by 26.8% over the baseline, with a worst-case performance degradation of just 2.8%.
comment: Presented in IROS 2024
MotionGlot: A Multi-Embodied Motion Generation Model
This paper introduces MotionGlot, a model that can generate motion across multiple embodiments with different action dimensions, such as quadruped robots and human bodies. By leveraging the well-established training procedures commonly used in large language models (LLMs), we introduce an instruction-tuning template specifically designed for motion-related tasks. Our approach demonstrates that the principles underlying LLM training can be successfully adapted to learn a wide range of motion generation tasks across multiple embodiments with different action dimensions. We demonstrate the various abilities of MotionGlot on a set of 6 tasks and report an average improvement of 35.3% across tasks. Additionally, we contribute two new datasets: (1) a dataset of expert-controlled quadruped locomotion with approximately 48,000 trajectories paired with direction-based text annotations, and (2) a dataset of over 23,000 situational text prompts for human motion generation tasks. Finally, we conduct hardware experiments to validate the capabilities of our system in real-world applications.
EnKode: Active Learning of Unknown Flows with Koopman Operators
In this letter, we address the task of adaptive sampling to model vector fields. When modeling environmental phenomena with a robot, gathering high resolution information can be resource intensive. Actively gathering data and modeling flows with the data is a more efficient alternative. However, in such scenarios, data is often sparse and thus requires flow modeling techniques that are effective at capturing the relevant dynamical features of the flow to ensure high prediction accuracy of the resulting models. To accomplish this effectively, regions with high informative value must be identified. We propose EnKode, an active sampling approach based on Koopman Operator theory and ensemble methods that can build high quality flow models and effectively estimate model uncertainty. For modeling complex flows, EnKode provides comparable or better estimates of unsampled flow regions than Gaussian Process Regression models with hyperparameter optimization. Additionally, our active sensing scheme provides more accurate flow estimates than comparable strategies that rely on uniform sampling. We evaluate EnKode using three common benchmarking systems: the Bickley Jet, Lid-Driven Cavity flow with an obstacle, and real ocean currents from the National Oceanic and Atmospheric Administration (NOAA).
comment: This work has been submitted to the IEEE for possible publication
Cycloidal Quasi-Direct Drive Actuator Designs with Learning-based Torque Estimation for Legged Robotics
This paper presents a novel approach through the design and implementation of Cycloidal Quasi-Direct Drive actuators for legged robotics. The cycloidal gear mechanism, with its inherent high torque density and mechanical robustness, offers significant advantages over conventional designs. By integrating cycloidal gears into the Quasi-Direct Drive framework, we aim to enhance the performance of legged robots, particularly in tasks demanding high torque and dynamic loads, while still keeping them lightweight. Additionally, we develop a torque estimation framework for the actuator using an Actuator Network, which effectively reduces the sim-to-real gap introduced by the cycloidal drive's complex dynamics. This integration is crucial for capturing the complex dynamics of a cycloidal drive, which contributes to improved learning efficiency, agility, and adaptability for reinforcement learning.
Composing Diffusion Policies for Few-shot Learning of Movement Trajectories
Humans can perform various combinations of physical skills without having to relearn skills from scratch every single time. For example, we can swing a bat while walking without having to relearn such a policy from scratch, by composing the individual skills of walking and bat swinging. Enabling robots to combine or compose skills is essential so they can learn novel skills and tasks faster with fewer real-world samples. To this end, we propose a novel compositional approach called Diffusion Score Equilibrium (DSE) that enables few-shot learning of novel skills by utilizing a combination of base policy priors. Our method is based on probabilistically composing diffusion policies to model the few-shot demonstration data distribution better than any individual policy. Our goal here is to learn robot motions few-shot and not necessarily goal-oriented trajectories. Unfortunately, we lack a general-purpose metric to evaluate the error between a skill or motion and the provided demonstrations. Hence, we propose a probabilistic measure, Maximum Mean Discrepancy on the Forward Kinematics Kernel (MMD-FK), that is task and action space agnostic. By using our few-shot learning approach DSE, we show that we are able to achieve a reduction of over 30% in MMD-FK across skills and number of demonstrations. Moreover, we show the utility of our approach through real-world experiments by teaching novel trajectories to a robot in 5 demonstrations.
comment: 6(+1) pages, 6 figures
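As background for the MMD-FK measure, the following is a minimal sketch of the standard (biased) estimate of squared Maximum Mean Discrepancy between two sample sets with a Gaussian kernel. It shows generic MMD only: the forward kinematics kernel of the paper is not reproduced, and the function names and the `bandwidth` parameter are assumptions for illustration.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between rows of x (n, d) and rows of y (m, d)."""
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy
```

The value is close to zero when the two sample sets come from the same distribution and grows as the distributions move apart, which is what makes it usable as a distance between generated motions and demonstrations.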
Configuration and operation of the Clearpath Husky A200 platform and the UR5 Cobot Manipulator with 2-Finger Gripper
This article presents the initial configuration work and use of the robotic platform and manipulator in question. The development of the ideal configuration for using this robot serves as a guide for new users and also validates its functionality for use in projects. Husky is a robotics development platform with a large payload capacity and power system that accommodates a wide variety of payloads, customized to meet research needs. Together with the Cobot UR5 Manipulator attached to its base, it covers a broader range of project applications. Advances in robots and mobile manipulators have revolutionized industries by automating tasks that previously required human intervention. These innovations not only increase productivity but also reduce operating costs, which makes companies more competitive in an evolving global market. Therefore, this article investigates the functionalities of this robot to validate its use in robotics projects.
comment: in Portuguese language
Interaction between humanoid robots: developing autonomous collaboration and communication
This study investigates the interaction between humanoid robots NAO and Pepper, emphasizing their potential applications in educational settings. NAO, widely used in education, and Pepper, designed for social interactions, offer new opportunities for autonomous communication and collaboration. Through a series of programmed interactions, the robots demonstrated their ability to communicate and coordinate actions autonomously, highlighting their potential as tools for enhancing learning environments. The research also explores the integration of emerging technologies, such as artificial intelligence, into these systems, allowing robots to learn from each other and adapt their behavior. The findings suggest that NAO and Pepper can significantly contribute to both technical learning and the development of social and emotional skills in students, offering innovative pedagogical approaches through the use of humanoid robotics.
comment: in Portuguese language
Real-time experiment-theory closed-loop interaction for autonomous materials science
Iterative cycles of theoretical prediction and experimental validation are the cornerstone of the modern scientific method. However, the proverbial "closing of the loop" in experiment-theory cycles is in practice usually ad hoc, often inherently difficult, or impractical to repeat on a systematic basis, beset by the scale or the time constraints of computation or of the phenomena under study. Here, we demonstrate the Autonomous MAterials Search Engine (AMASE), where we enlist robot science to perform self-driving continuous cyclical interaction of experiments and computational predictions for materials exploration. In particular, we have applied the AMASE formalism to the rapid mapping of a temperature-composition phase diagram, a fundamental task for the search and discovery of new materials. Thermal processing and experimental determination of compositional phase boundaries in thin films are autonomously interspersed with real-time updating of the phase diagram prediction through the minimization of Gibbs free energies. AMASE was able to accurately determine the eutectic phase diagram of the Sn-Bi binary thin-film system on the fly from a self-guided campaign covering just a small fraction of the entire composition-temperature phase space, translating to a 6-fold reduction in the number of necessary experiments. This study demonstrates for the first time the possibility of real-time, autonomous, and iterative interactions of experiments and theory carried out without any human intervention.
AG-SLAM: Active Gaussian Splatting SLAM
We present AG-SLAM, the first active SLAM system utilizing 3D Gaussian Splatting (3DGS) for online scene reconstruction. In recent years, radiance field scene representations, including 3DGS, have been widely used in SLAM and exploration, but actively planning trajectories for robotic exploration remains unexplored. In particular, many exploration methods assume precise localization and thus do not mitigate the significant risk of constructing a trajectory that is difficult for a SLAM system to operate on. This can cause camera tracking failure and lead to failures in real-world robotic applications. Our method leverages Fisher Information to balance the dual objectives of maximizing the information gain for the environment while minimizing the cost of localization errors. Experiments conducted on the Gibson and Habitat-Matterport 3D datasets demonstrate state-of-the-art results of the proposed method.
Geometric Graph Neural Network Modeling of Human Interactions in Crowded Environments
Modeling human trajectories in crowded environments is challenging due to the complex nature of pedestrian behavior and interactions. This paper proposes a geometric graph neural network (GNN) architecture that integrates domain knowledge from psychological studies to model pedestrian interactions and predict future trajectories. Unlike prior studies using complete graphs, we define interaction neighborhoods using pedestrians' field of view, motion direction, and distance-based kernel functions to construct graph representations of crowds. Evaluations across multiple datasets demonstrate improved prediction accuracy through reduced average and final displacement error metrics. Our findings underscore the importance of integrating domain knowledge with data-driven approaches for effective modeling of human interactions in crowds.
comment: © 2024 the authors. This work has been accepted to IFAC for publication under a Creative Commons Licence CC-BY-NC-ND
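The idea of replacing a complete graph with interaction neighborhoods can be sketched as follows: pedestrian i is connected to pedestrian j only if j lies inside i's view cone (centered on i's motion direction) and within a cutoff distance, with a distance-based weight on the edge. The field-of-view angle, the radius, the Gaussian weight, and the function name `interaction_edges` are assumptions for illustration and are not the values or kernels used in the paper.

```python
import numpy as np


def interaction_edges(pos, vel, fov_deg=120.0, radius=5.0):
    """Directed edges i -> j where j is inside i's view cone and within radius.

    pos, vel: arrays of shape (n, 2). Returns an (n, n) matrix whose nonzero
    entries are Gaussian distance weights (assumed kernel choice).
    """
    n = len(pos)
    weights = np.zeros((n, n))
    half_fov = np.deg2rad(fov_deg) / 2.0
    for i in range(n):
        speed = np.linalg.norm(vel[i])
        if speed == 0.0:
            continue  # no heading is defined for a stationary pedestrian
        heading = vel[i] / speed
        for j in range(n):
            if i == j:
                continue
            offset = pos[j] - pos[i]
            dist = np.linalg.norm(offset)
            if dist == 0.0 or dist > radius:
                continue
            cos_angle = np.clip(np.dot(heading, offset / dist), -1.0, 1.0)
            if np.arccos(cos_angle) <= half_fov:
                weights[i, j] = np.exp(-(dist / radius) ** 2)
    return weights
```

The resulting matrix is generally not symmetric: a pedestrian walking behind another sees the one in front, but not the other way around.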
EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution IROS 2024
Task planning for robots in real-life settings presents significant challenges. These challenges stem from three primary issues: the difficulty in identifying grounded sequences of steps to achieve a goal; the lack of a standardized mapping between high-level actions and low-level commands; and the challenge of maintaining low computational overhead given the limited resources of robotic hardware. We introduce EMPOWER, a framework designed for open-vocabulary online grounding and planning for embodied agents aimed at addressing these issues. By leveraging efficient pre-trained foundation models and a multi-role mechanism, EMPOWER demonstrates notable improvements in grounded planning and execution. Quantitative results highlight the effectiveness of our approach, achieving an average success rate of 0.73 across six different real-life scenarios using a TIAGo robot.
comment: Accepted at IROS 2024
3D-TAFS: A Training-free Framework for 3D Affordance Segmentation
Translating high-level linguistic instructions into precise robotic actions in the physical world remains challenging, particularly when considering the feasibility of interacting with 3D objects. In this paper, we introduce 3D-TAFS, a novel training-free multimodal framework for 3D affordance segmentation, alongside a benchmark for evaluating interactive language-guided affordance in everyday environments. In particular, our framework integrates a large multimodal model with a specialized 3D vision network, enabling seamless fusion of 2D and 3D visual understanding with language comprehension. To facilitate evaluation, we present a dataset of ten typical indoor environments, each with 50 images annotated for object actions and 3D affordance segmentation. Extensive experiments validate the proposed 3D-TAFS's capability in handling interactive 3D affordance segmentation tasks across diverse settings, showcasing competitive performance across various metrics. Our results highlight 3D-TAFS's potential for enhancing human-robot interaction based on affordance understanding in complex indoor environments, advancing the development of more intuitive and efficient robotic frameworks for real-world applications.
PhysORD: A Neuro-Symbolic Approach for Physics-infused Motion Prediction in Off-road Driving
Motion prediction is critical for autonomous off-road driving; however, it presents significantly more challenges than on-road driving because of the complex interaction between the vehicle and the terrain. Traditional physics-based approaches encounter difficulties in accurately modeling dynamic systems and external disturbances. In contrast, data-driven neural networks require extensive datasets and struggle with explicitly capturing the fundamental physical laws, which can easily lead to poor generalization. By merging the advantages of both methods, neuro-symbolic approaches present a promising direction. These methods embed physical laws into neural models, potentially significantly improving generalization capabilities. However, no prior works were evaluated in real-world settings for off-road driving. To bridge this gap, we present PhysORD, a neuro-symbolic approach integrating the conservation law, i.e., the Euler-Lagrange equation, into data-driven neural models for motion prediction in off-road driving. Our experiments showed that PhysORD can accurately predict vehicle motion and tolerate external disturbances by modeling uncertainties. The learned dynamics model achieves 46.7% higher accuracy using only 3.1% of the parameters compared to data-driven methods, demonstrating the data efficiency and superior generalization ability of our neuro-symbolic method.
AED: Adaptable Error Detection for Few-shot Imitation Policy NeurIPS2024
We introduce a new task called Adaptable Error Detection (AED), which aims to identify behavior errors in few-shot imitation (FSI) policies based on visual observations in novel environments. The potential to cause serious damage to surrounding areas limits the application of FSI policies in real-world scenarios. Thus, a robust system is necessary to notify operators when FSI policies are inconsistent with the intent of demonstrations. This task introduces three challenges: (1) detecting behavior errors in novel environments, (2) identifying behavior errors that occur without revealing notable changes, and (3) lacking complete temporal information of the rollout due to the necessity of online detection. However, the existing benchmarks cannot support the development of AED because their tasks do not present all these challenges. To this end, we develop a cross-domain AED benchmark, consisting of 322 base and 153 novel environments. Additionally, we propose Pattern Observer (PrObe) to address these challenges. PrObe is equipped with a powerful pattern extractor and guided by novel learning objectives to parse discernible patterns in the policy feature representations of normal or error states. Through our comprehensive evaluation, PrObe demonstrates superior capability to detect errors arising from a wide range of FSI policies, consistently surpassing strong baselines. Moreover, we conduct detailed ablations and a pilot study on error correction to validate the effectiveness of the proposed architecture design and the practicality of the AED task, respectively. The AED project page can be found at https://aed-neurips.github.io/.
comment: Accepted to NeurIPS2024
PRIMER: Perception-Aware Robust Learning-based Multiagent Trajectory Planner
In decentralized multiagent trajectory planners, agents need to communicate and exchange their positions to generate collision-free trajectories. However, due to localization errors/uncertainties, trajectory deconfliction can fail even if trajectories are perfectly shared between agents. To address this issue, we first present PARM and PARM*, perception-aware, decentralized, asynchronous multiagent trajectory planners that enable a team of agents to navigate uncertain environments while deconflicting trajectories and avoiding obstacles using perception information. PARM* differs from PARM as it is less conservative, using more computation to find closer-to-optimal solutions. While these methods achieve state-of-the-art performance, they suffer from high computational costs as they need to solve large optimization problems onboard, making it difficult for agents to replan at high rates. To overcome this challenge, we present our second key contribution, PRIMER, a learning-based planner trained with imitation learning (IL) using PARM* as the expert demonstrator. PRIMER leverages the low computational requirements at deployment of neural networks and achieves a computation speed up to 5500 times faster than optimization-based approaches.
comment: 7 pages, 3 figures
Incremental Joint Learning of Depth, Pose and Implicit Scene Representation on Monocular Camera in Large-scale Scenes
Dense scene reconstruction for photo-realistic view synthesis has various applications, such as VR/AR and autonomous vehicles. However, most existing methods have difficulties in large-scale scenes due to three core challenges: (a) inaccurate depth input: accurate depth input is impossible to get in real-world large-scale scenes; (b) inaccurate pose estimation: most existing approaches rely on accurate pre-estimated camera poses; (c) insufficient scene representation capability: a single global radiance field lacks the capacity to effectively scale to large-scale scenes. To this end, we propose an incremental joint learning framework, which can achieve accurate depth, pose estimation, and large-scale scene reconstruction. A vision transformer-based network is adopted as the backbone to enhance performance in scale information estimation. For pose estimation, a feature-metric bundle adjustment (FBA) method is designed for accurate and robust camera tracking in large-scale scenes. In terms of implicit scene representation, we propose an incremental scene representation method to construct the entire large-scale scene as multiple local radiance fields to enhance the scalability of 3D scene representation. Extended experiments have been conducted to demonstrate the effectiveness and accuracy of our method in depth estimation, pose estimation, and large-scale scene reconstruction.
Consistent Distributed Cooperative Localization: A Coordinate Transformation Approach
This paper addresses the consistency issue of multi-robot distributed cooperative localization. We introduce a consistent distributed cooperative localization algorithm conducting state estimation in transformed coordinates. The core idea involves a linear time-varying coordinate transformation to render the propagation Jacobian independent of the state and make it suitable for distributed implementation. This transformation is seamlessly integrated into a server-based distributed cooperative localization framework, in which each robot estimates its own state while the server maintains the cross-correlations. The transformation ensures the correct observability property of the entire framework. Moreover, the algorithm accommodates various types of robot-to-robot relative measurements, broadening its applicability. Through simulations and real-world dataset experiments, the proposed algorithm has demonstrated better performance in terms of both consistency and accuracy compared to existing algorithms.
Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation
Enabling mobile robots to perform long-term tasks in dynamic real-world environments is a formidable challenge, especially when the environment changes frequently due to human-robot interactions or the robot's own actions. Traditional methods typically assume static scenes, which limits their applicability in the continuously changing real world. To overcome these limitations, we present DovSG, a novel mobile manipulation framework that leverages dynamic open-vocabulary 3D scene graphs and a language-guided task planning module for long-term task execution. DovSG takes RGB-D sequences as input and utilizes vision-language models (VLMs) for object detection to obtain high-level object semantic features. Based on the segmented objects, a structured 3D scene graph is generated for low-level spatial relationships. Furthermore, an efficient mechanism for locally updating the scene graph allows the robot to adjust parts of the graph dynamically during interactions without the need for full scene reconstruction. This mechanism is particularly valuable in dynamic environments, enabling the robot to continually adapt to scene changes and effectively support the execution of long-term tasks. We validated our system in real-world environments with varying degrees of manual modifications, demonstrating its effectiveness and superior performance in long-term tasks. Our project page is available at: https://bjhyzj.github.io/dovsg-web.
comment: 8 pages, 5 figures
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning NeurIPS 2024
In robot learning, the observation space is crucial due to the distinct characteristics of different modalities, which can potentially become a bottleneck alongside policy design. In this study, we explore the influence of various observation spaces on robot learning, focusing on three predominant modalities: RGB, RGB-D, and point cloud. We introduce OBSBench, a benchmark comprising two simulators and 125 tasks, along with standardized pipelines for various encoders and policy baselines. Extensive experiments on diverse contact-rich manipulation tasks reveal a notable trend: point cloud-based methods, even those with the simplest designs, frequently outperform their RGB and RGB-D counterparts. This trend persists in both scenarios: training from scratch and utilizing pre-training. Furthermore, our findings demonstrate that point cloud observations often yield better policy performance and significantly stronger generalization capabilities across various geometric and visual conditions. These outcomes suggest that the 3D point cloud is a valuable observation modality for intricate robotic tasks. We also suggest that incorporating both appearance and coordinate information can enhance the performance of point cloud methods. We hope our work provides valuable insights and guidance for designing more generalizable and robust robotic models. Codes are available at https://github.com/HaoyiZhu/PointCloudMatters.
comment: 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks
Is Your HD Map Constructor Reliable under Sensor Corruptions? NeurIPS 2024
Driving systems often rely on high-definition (HD) maps for precise environmental information, which is crucial for planning and navigation. While current HD map constructors perform well under ideal conditions, their resilience to real-world challenges, e.g., adverse weather and sensor failures, is not well understood, raising safety concerns. This work introduces MapBench, the first comprehensive benchmark designed to evaluate the robustness of HD map construction methods against various sensor corruptions. Our benchmark encompasses a total of 29 types of corruptions that occur from cameras and LiDAR sensors. Extensive evaluations across 31 HD map constructors reveal significant performance degradation of existing methods under adverse weather conditions and sensor failures, underscoring critical safety concerns. We identify effective strategies for enhancing robustness, including innovative approaches that leverage multi-modal fusion, advanced data augmentation, and architectural techniques. These insights provide a pathway for developing more reliable HD map construction methods, which are essential for the advancement of autonomous driving technology. The benchmark toolkit and affiliated code and model checkpoints have been made publicly accessible.
comment: NeurIPS 2024; 40 pages, 17 figures, 23 tables; Code at https://mapbench.github.io/
Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC ICRA
This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior works that combine high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by integrating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we train an RL agent to solve the pose reaching task in simulation, then incorporate the trained neural network critic as both the stage and terminal cost of an MPC. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real wheel loader, where we successfully navigate to various goal poses.
comment: Submitted to International Conference on Robotics and Automation (ICRA) 2025
LASER: Script Execution by Autonomous Agents for On-demand Traffic Simulation
Autonomous Driving Systems (ADS) require diverse and safety-critical traffic scenarios for effective training and testing, but the existing data generation methods struggle to provide flexibility and scalability. We propose LASER, a novel framework that leverages large language models (LLMs) to conduct traffic simulations based on natural language inputs. The framework operates in two stages: it first generates scripts from user-provided descriptions and then executes them using autonomous agents in real time. Validated in the CARLA simulator, LASER successfully generates complex, on-demand driving scenarios, significantly improving ADS training and testing data generation.
Developing Path Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
In autonomous driving, end-to-end methods utilizing Imitation Learning (IL) and Reinforcement Learning (RL) are becoming more and more common. However, they do not involve explicit reasoning, as in the classic robotics workflow, or planning with horizons, resulting in implicit and myopic strategies. In this paper, we introduce a path planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) for static obstacle nudging. It outputs lateral offset values to adjust the given reference waypoints and produces a modified path for different controllers. Experimental results show that the algorithm can perform path following that mimics the expert performance of path-tracking controllers and avoid collisions with fixed obstacles. The method makes a good attempt at planning with learning-based methods in path planning problems of autonomous driving.
comment: 6 pages, 8 figures
Speech to Reality: On-Demand Production using Natural Language, 3D Generative AI, and Discrete Robotic Assembly
We present a system that transforms speech into physical objects by combining 3D generative Artificial Intelligence with robotic assembly. The system leverages natural language input to make design and manufacturing more accessible, enabling individuals without expertise in 3D modeling or robotic programming to create physical objects. We propose utilizing discrete robotic assembly of lattice-based voxel components to address the challenges of using generative AI outputs in physical production, such as design variability, fabrication speed, structural integrity, and material waste. The system interprets speech to generate 3D objects, discretizes them into voxel components, computes an optimized assembly sequence, and generates a robotic toolpath. The results are demonstrated through the assembly of various objects, ranging from chairs to shelves, which are prompted via speech and realized within 5 minutes using a 6-axis robotic arm.
comment: This work has been submitted to the IEEE for possible publication. An updated version will replace this version
TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments IROS
Autonomous robots exploring unknown environments face a significant challenge: navigating effectively without prior maps and with limited external feedback. This challenge intensifies in sparse reward environments, where traditional exploration techniques often fail. In this paper, we present TopoNav, a novel topological navigation framework that integrates active mapping, hierarchical reinforcement learning, and intrinsic motivation to enable efficient goal-oriented exploration and navigation in sparse-reward settings. TopoNav dynamically constructs a topological map of the environment, capturing key locations and pathways. A two-level hierarchical policy architecture, comprising a high-level graph traversal policy and low-level motion control policies, enables effective navigation and obstacle avoidance while maintaining focus on the overall goal. Additionally, TopoNav incorporates intrinsic motivation to guide exploration toward relevant regions and frontier nodes in the topological map, addressing the challenges of sparse extrinsic rewards. We evaluate TopoNav in both simulated and real-world off-road environments using a Clearpath Jackal robot, across three challenging navigation scenarios: goal-reaching, feature-based navigation, and navigation in complex terrains. We observe an increase in exploration coverage by 7-20%, in success rates by 9-19%, and reductions in navigation times by 15-36% across various scenarios, compared to state-of-the-art methods.
comment: Accepted at the 37th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024
Diffusion-Reward Adversarial Imitation Learning
Imitation learning aims to learn a policy from observing expert demonstrations without access to reward signals from environments. Generative adversarial imitation learning (GAIL) formulates imitation learning as adversarial learning, employing a generator policy learning to imitate expert behaviors and discriminator learning to distinguish the expert demonstrations from agent trajectories. Despite its encouraging results, GAIL training is often brittle and unstable. Inspired by the recent dominance of diffusion models in generative modeling, we propose Diffusion-Reward Adversarial Imitation Learning (DRAIL), which integrates a diffusion model into GAIL, aiming to yield more robust and smoother rewards for policy learning. Specifically, we propose a diffusion discriminative classifier to construct an enhanced discriminator, and design diffusion rewards based on the classifier's output for policy learning. Extensive experiments are conducted in navigation, manipulation, and locomotion, verifying DRAIL's effectiveness compared to prior imitation learning methods. Moreover, additional experimental results demonstrate the generalizability and data efficiency of DRAIL. Visualized learned reward functions of GAIL and DRAIL suggest that DRAIL can produce more robust and smoother rewards. Project page: https://nturobotlearninglab.github.io/DRAIL/
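The adversarial setup that the abstract above builds on can be sketched in a few lines. This is a minimal illustration of a plain GAIL-style discriminator loss and policy reward, not DRAIL's diffusion discriminative classifier, whose details the abstract does not give. It assumes the convention that the discriminator outputs the probability that a state-action pair comes from the expert; the function names and the `eps` constant are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a discriminator logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def discriminator_loss(expert_logits, agent_logits, eps=1e-8):
    """Binary cross-entropy: expert pairs are labeled 1, agent pairs 0."""
    expert_term = -sum(math.log(sigmoid(z) + eps) for z in expert_logits)
    agent_term = -sum(math.log(1.0 - sigmoid(z) + eps) for z in agent_logits)
    return expert_term / len(expert_logits) + agent_term / len(agent_logits)


def gail_reward(logit, eps=1e-8):
    """Assumed reward form r = -log(1 - D(s, a)); larger when D believes
    the pair looks like expert behavior."""
    return -math.log(1.0 - sigmoid(logit) + eps)


# A well-separated discriminator has lower loss than an uninformative one.
separated = discriminator_loss([5.0, 5.0], [-5.0, -5.0])
uninformative = discriminator_loss([0.0, 0.0], [0.0, 0.0])
```

Other reward forms, such as log D(s, a) - log(1 - D(s, a)), are also common in GAIL implementations; the choice affects reward sign and scale, not the adversarial structure.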
Design Space Exploration of Embedded SoC Architectures for Real-Time Optimal Control
Empowering resource-limited robots to execute computationally intensive tasks such as locomotion and manipulation is challenging. This project provides a comprehensive design space exploration to determine optimal hardware computation architectures suitable for model-based control algorithms. We profile and optimize representative architectural designs across general-purpose scalar, vector processors, and specialized accelerators. Specifically, we compare CPUs, vector machines, and domain-specialized accelerators with kernel-level benchmarks and end-to-end representative robotic workloads. Our exploration provides a quantitative performance, area, and utilization comparison and analyzes the trade-offs between these representative distinct architectural designs. We demonstrate that architectural modifications, software, and system optimization can alleviate bottlenecks and enhance utilization. Finally, we propose a code generation flow to simplify the engineering work for mapping robotic workloads to specialized architectures.
D2S: Representing sparse descriptors and 3D coordinates for camera relocalization
State-of-the-art visual localization methods mostly rely on complex procedures to match local descriptors and 3D point clouds. However, these procedures can incur significant costs in terms of inference, storage, and updates over time. In this study, we propose a direct learning-based approach that utilizes a simple network named D2S to represent complex local descriptors and their scene coordinates. Our method is characterized by its simplicity and cost-effectiveness. It solely leverages a single RGB image for localization during the testing phase and only requires a lightweight model to encode a complex sparse scene. The proposed D2S employs a combination of a simple loss function and graph attention to selectively focus on robust descriptors while disregarding areas such as clouds, trees, and several dynamic objects. This selective attention enables D2S to effectively perform a binary-semantic classification for sparse descriptors. Additionally, we propose a simple outdoor dataset to evaluate the capabilities of visual localization methods in scene-specific generalization and self-updating from unlabeled observations. Our approach outperforms the previous regression-based methods in both indoor and outdoor environments. It demonstrates the ability to generalize beyond training data, including scenarios involving transitions from day to night and adapting to domain shifts. The source code, trained models, dataset, and demo videos are available at the following link: https://thpjp.github.io/d2s.
comment: Accepted to IEEE Robotics and Automation Letters
A Propagation Perspective on Recursive Forward Dynamics for Systems with Kinematic Loops
We revisit the concept of constraint embedding as a means for dealing with kinematic loop constraints during dynamics computations for rigid-body systems. Specifically, we consider the local loop constraints emerging from common actuation sub-mechanisms in modern robotics systems (e.g., geared motors, differential drives, and four-bar mechanisms). However, rather than develop the concept of constraint embedding from the perspective of graphical analysis, we present a novel analysis of constraint embedding that generalizes the traditional concepts of joint models and motion/force subspaces between individual rigid bodies to generalized joint models and motion/force subspaces between groups of rigid bodies subject to loop constraints. The generalized concepts are used in a self-contained, articulated-body-based derivation of the constraint-embedding-based recursive algorithm for forward dynamics. The derivation represents the first assembly method to demonstrate the recursivity of articulated inertia computation in the presence of loop constraints. We demonstrate the broad applicability of the generalized joint concepts by showing how they also lead to the constraint-embedding-based recursive algorithm for inverse dynamics. Lastly, we benchmark our open-source implementation in C++ for the forward dynamics algorithm against a state-of-the-art, non-recursive algorithm. Our benchmarking validates that constraint embedding outperforms the non-recursive alternative in the case of local kinematic loops.
comment: Submitted to IEEE Transactions on Robotics
Robust High-Speed State Estimation for Off-road Navigation using Radar Velocity Factors
Enabling robot autonomy in complex environments for mission-critical applications requires robust state estimation. This is particularly true under conditions where the exteroceptive sensors on which navigation depends can be degraded by environmental challenges, leading to mission failure. It is precisely in such challenges where the potential for FMCW radar sensors is highlighted: as a complementary exteroceptive sensing modality with direct velocity measuring capabilities. In this work we integrate radial speed measurements from an FMCW radar sensor, using a radial speed factor, to provide linear velocity updates into a sliding-window state estimator for fusion with LiDAR pose and IMU measurements. We demonstrate that this augmentation increases the robustness of the state estimator to challenging conditions present in the environment and the negative effects they can pose to vulnerable exteroceptive modalities. The proposed method is extensively evaluated using robotic field experiments conducted using an autonomous, full-scale, off-road vehicle operating at high speeds (~12 m/s) in complex desert environments. Furthermore, the robustness of the approach is demonstrated for cases of both simulated and real-world degradation of the LiDAR odometry performance along with comparison against state-of-the-art methods for radar-inertial odometry on public datasets.
comment: 8 pages, 9 figures. Accepted for publication in IEEE Robotics and Automation Letters (RA-L), 2024
Counter-Hypothetical Particle Filters for Single Object Pose Tracking ICRA
Particle filtering is a common technique for six degrees of freedom (6D) pose estimation due to its ability to tractably represent belief over object pose. However, the particle filter is prone to particle deprivation due to the high-dimensional nature of 6D pose. When particle deprivation occurs, it can cause mode collapse of the underlying belief distribution during importance sampling. If the region surrounding the true state suffers from mode collapse, recovering its belief is challenging since the area is no longer represented in the probability mass formed by the particles. Previous methods mitigate this problem by randomizing and resetting particles in the belief distribution, but determining the frequency of reinvigoration has relied on hand-tuning abstract heuristics. In this paper, we estimate the necessary reinvigoration rate at each time step by introducing a Counter-Hypothetical likelihood function, which is used alongside the standard likelihood. Inspired by the notions of plausibility and implausibility from Evidential Reasoning, the addition of our Counter-Hypothetical likelihood function assigns a level of doubt to each particle. The competing cumulative values of confidence and doubt across the particle set are used to estimate the level of failure within the filter, in order to determine the portion of particles to be reinvigorated. We demonstrate the effectiveness of our method on the rigid body object 6D pose tracking task.
comment: International Conference on Robotics and Automation (ICRA) 2023
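The idea of turning competing confidence and doubt into a reinvigoration rate can be sketched as follows. The abstract does not state the exact rule, so the ratio used here (total doubt over total confidence plus doubt) is an assumed placeholder, and `reinvigoration_fraction`, `reinvigorate`, and `sample_fn` are hypothetical names for illustration only.

```python
import random


def reinvigoration_fraction(confidences, doubts):
    """Assumed rule: the share of total evidence that is doubt.

    confidences: per-particle values from the standard likelihood.
    doubts: per-particle values from a counter-hypothetical likelihood.
    """
    total_confidence = sum(confidences)
    total_doubt = sum(doubts)
    if total_confidence + total_doubt == 0:
        return 0.0
    return total_doubt / (total_confidence + total_doubt)


def reinvigorate(particles, weights, fraction, sample_fn, rng):
    """Replace the lowest-weight share of particles with fresh samples."""
    count = int(round(fraction * len(particles)))
    lowest_first = sorted(range(len(particles)), key=lambda i: weights[i])
    refreshed = list(particles)
    for index in lowest_first[:count]:
        refreshed[index] = sample_fn(rng)
    return refreshed


# Toy 1D example: poses are plain floats, fresh samples are uniform draws.
rng = random.Random(0)
fraction = reinvigoration_fraction([0.9, 0.8], [0.1, 0.2])
new_particles = reinvigorate(
    [0.0, 1.0, 2.0, 3.0], [0.4, 0.1, 0.3, 0.2], 0.5,
    lambda r: r.uniform(-1.0, 1.0), rng,
)
```

Replacing the lowest-weight particles is one design choice among several; resetting a random subset would serve the same illustrative purpose.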
Multiagent Systems
Scalable spectral representations for network multiagent control
Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning due to the exponential growth of the global state-action space with the number of agents. In this work, utilizing the exponential decay property of network dynamics, we first derive scalable spectral local representations for network MDPs, which induces a network linear subspace for the local $Q$-function of each agent. Building on these local spectral representations, we design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm. Empirically, we validate the effectiveness of our scalable representation-based approach on two benchmark problems, and demonstrate the advantages of our approach over generic function approximation approaches to representing the local $Q$-functions.
Delay-Constrained Grant-Free Random Access in MIMO Systems: Distributed Pilot Allocation and Power Control
We study a delay-constrained grant-free random access system with a multi-antenna base station. The users randomly generate data packets with expiration deadlines, which are then transmitted from data queues on a first-in first-out basis. To deliver a packet, a user needs to succeed in both the random access phase (sending a pilot without collision) and the data transmission phase (achieving a required data rate with imperfect channel information) before the packet expires. We develop a distributed, cross-layer policy that allows the users to dynamically and independently choose their pilots and transmit powers to achieve a high effective sum throughput with fairness consideration. Our policy design involves three key components: 1) a proxy of the instantaneous data rate that depends only on macroscopic environment variables and transmission decisions, considering pilot collisions and imperfect channel estimation; 2) a quantitative, instantaneous measure of fairness within each communication round; and 3) a deep learning-based, multi-agent control framework with centralized training and distributed execution. The proposed framework benefits from an accurate, differentiable objective function for training, thereby achieving a higher sample efficiency compared with a conventional application of model-free, multi-agent reinforcement learning algorithms. The performance of the proposed approach is verified by simulations under highly dynamic and heterogeneous scenarios.
comment: 15 pages, 7 figures. Accepted for publication in IEEE Transactions on Cognitive Communications and Networking
SERN: Simulation-Enhanced Realistic Navigation for Multi-Agent Robotic Systems in Contested Environments ICRA 2025
The increasing deployment of autonomous systems in complex environments necessitates efficient communication and task completion among multiple agents. This paper presents SERN (Simulation-Enhanced Realistic Navigation), a novel framework integrating virtual and physical environments for real-time collaborative decision-making in multi-robot systems. SERN addresses key challenges in asset deployment and coordination through a bi-directional communication framework using the AuroraXR ROS Bridge. Our approach advances the SOTA through accurate real-world representation in virtual environments using Unity high-fidelity simulator; synchronization of physical and virtual robot movements; efficient ROS data distribution between remote locations; and integration of SOTA semantic segmentation for enhanced environmental perception. Our evaluations show a 15% to 24% improvement in latency and up to a 15% increase in processing efficiency compared to traditional ROS setups. Real-world and virtual simulation experiments with multiple robots demonstrate synchronization accuracy, achieving less than 5 cm positional error and under 2-degree rotational error. These results highlight SERN's potential to enhance situational awareness and multi-agent coordination in diverse, contested environments.
comment: Under Review for ICRA 2025
Cutting Through the Confusion and Hype: Understanding the True Potential of Generative AI
This paper explores the nuanced landscape of generative AI (genAI), particularly focusing on neural network-based models like Large Language Models (LLMs). While genAI garners both optimistic enthusiasm and sceptical criticism, this work seeks to provide a balanced examination of its capabilities, limitations, and the profound impact it may have on societal functions and personal interactions. The first section demystifies language-based genAI through detailed discussions on how LLMs learn, their computational needs, distinguishing features from supporting technologies, and the inherent limitations in their accuracy and reliability. Real-world examples illustrate the practical applications and implications of these technologies. The latter part of the paper adopts a systems perspective, evaluating how the integration of LLMs with existing technologies can enhance productivity and address emerging concerns. It highlights the need for significant investment to understand the implications of recent advancements, advocating for a well-informed dialogue to ethically and responsibly integrate genAI into diverse sectors. The paper concludes with prospective developments and recommendations, emphasizing a forward-looking approach to harnessing genAI's potential while mitigating its risks.
Convex Markov Games: A Framework for Fairness, Imitation, and Creativity in Multi-Agent Learning
Expert imitation, behavioral diversity, and fairness preferences give rise to preferences in sequential decision making domains that do not decompose additively across time. We introduce the class of convex Markov games that allow general convex preferences over occupancy measures. Despite infinite time horizon and strictly higher generality than Markov games, pure strategy Nash equilibria exist under strict convexity. Furthermore, equilibria can be approximated efficiently by performing gradient descent on an upper bound of exploitability. Our experiments imitate human choices in ultimatum games, reveal novel solutions to the repeated prisoner's dilemma, and find fair solutions in a repeated asymmetric coordination game. In the prisoner's dilemma, our algorithm finds a policy profile that deviates from observed human play only slightly, yet achieves higher per-player utility while also being three orders of magnitude less exploitable.
Evolution with Opponent-Learning Awareness
The universe involves many independent co-learning agents as an ever-evolving part of our observed environment. Yet, in practice, Multi-Agent Reinforcement Learning (MARL) applications are usually constrained to small, homogeneous populations and remain computationally intensive. In this paper, we study how large heterogeneous populations of learning agents evolve in normal-form games. We show how, under assumptions commonly made in the multi-armed bandit literature, Multi-Agent Policy Gradient closely resembles the Replicator Dynamic, and we further derive a fast, parallelizable implementation of Opponent-Learning Awareness tailored for evolutionary simulations. This enables us to simulate the evolution of very large populations made of heterogeneous co-learning agents, under both naive and advanced learning strategies. We demonstrate our approach in simulations of 200,000 agents, evolving in the classic games of Hawk-Dove, Stag-Hunt, and Rock-Paper-Scissors. Each game highlights distinct ways in which Opponent-Learning Awareness affects evolution.
comment: 12 pages, 10 figures
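For readers unfamiliar with the Replicator Dynamic mentioned in the abstract above, here is a minimal sketch on the Hawk-Dove game. It shows only the standard replicator equation, dx_i/dt = x_i (f_i - f_mean), integrated with Euler steps; it does not implement Multi-Agent Policy Gradient or Opponent-Learning Awareness. The payoff values V = 2 and C = 4, the step size, and the function names are illustrative assumptions.

```python
def replicator_step(shares, payoff, dt=0.01):
    """One Euler step of dx_i/dt = x_i * (f_i - f_mean)."""
    n = len(shares)
    # Expected payoff of each strategy against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(shares[i] * fitness[i] for i in range(n))
    updated = [shares[i] + dt * shares[i] * (fitness[i] - mean_fitness) for i in range(n)]
    total = sum(updated)  # Renormalize to guard against floating-point drift.
    return [value / total for value in updated]


def simulate(initial_shares, payoff, steps=5000, dt=0.01):
    """Iterate the replicator step and return the final population shares."""
    shares = list(initial_shares)
    for _ in range(steps):
        shares = replicator_step(shares, payoff, dt)
    return shares


# Hawk-Dove payoffs for the row strategy, with resource value V and fight cost C.
V, C = 2.0, 4.0
HAWK_DOVE = [
    [(V - C) / 2.0, V],   # Hawk vs Hawk, Hawk vs Dove
    [0.0, V / 2.0],       # Dove vs Hawk, Dove vs Dove
]

final_shares = simulate([0.1, 0.9], HAWK_DOVE)
```

With these payoffs the mixed equilibrium has a hawk share of V / C = 0.5, which the simulated population approaches from the initial 10% hawks.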
Cooperative Multi-Agent Constrained Stochastic Linear Bandits
In this study, we explore a collaborative multi-agent stochastic linear bandit setting involving a network of $N$ agents that communicate locally to minimize their collective regret while keeping their expected cost under a specified threshold $\tau$. Each agent encounters a distinct linear bandit problem characterized by its own reward and cost parameters, i.e., local parameters. The goal of the agents is to determine the best overall action corresponding to the average of these parameters, or so-called global parameters. In each round, an agent is randomly chosen to select an action based on its current knowledge of the system. This chosen action is then executed by all agents, who then observe their individual rewards and costs. We propose a safe distributed upper confidence bound algorithm, called MA-OPLB, and establish a high probability bound on its $T$-round regret. MA-OPLB utilizes an accelerated consensus method, where agents can compute an estimate of the average rewards and costs across the network by communicating the proper information with their neighbors. We show that our regret bound is of order $\mathcal{O}\left(\frac{d}{\tau-c_0}\frac{\log(NT)^2}{\sqrt{N}}\sqrt{\frac{T}{\log(1/|\lambda_2|)}}\right)$, where $\lambda_2$ is the second largest (in absolute value) eigenvalue of the communication matrix, and $\tau-c_0$ is the known cost gap of a feasible action. We also experimentally show the performance of our proposed algorithm in different network structures.
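The role of the communication matrix in the abstract above can be illustrated with plain consensus averaging, in which every agent repeatedly replaces its value with a weighted average of its neighbors' values. This is a sketch of the basic, non-accelerated scheme only, not MA-OPLB's accelerated consensus method; the 3-agent matrix `W` and the local values are made up for illustration.

```python
def consensus_round(values, weights):
    """One communication round: each agent takes a weighted average of
    the values held by the agents it is connected to."""
    n = len(values)
    return [sum(weights[i][j] * values[j] for j in range(n)) for i in range(n)]


def run_consensus(values, weights, rounds):
    """Repeat consensus rounds; with a doubly stochastic matrix every
    agent's value approaches the network-wide average."""
    current = list(values)
    for _ in range(rounds):
        current = consensus_round(current, weights)
    return current


# Hypothetical doubly stochastic communication matrix for 3 agents.
# Its eigenvalues are 1, 0.25, 0.25, so |lambda_2| = 0.25.
W = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]

local_estimates = [1.0, 2.0, 6.0]  # network average is 3.0
agreed = run_consensus(local_estimates, W, rounds=30)
```

The disagreement shrinks by roughly a factor of |lambda_2| per round, which is why a smaller |lambda_2| (a better connected network) appears as a smaller factor in the regret bound.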
Episodic Future Thinking Mechanism for Multi-agent Reinforcement Learning NeurIPS 2024
Understanding cognitive processes in multi-agent interactions is a primary goal in cognitive science. It can guide the direction of artificial intelligence (AI) research toward social decision-making in multi-agent systems, which includes uncertainty from character heterogeneity. In this paper, we introduce an episodic future thinking (EFT) mechanism for a reinforcement learning (RL) agent, inspired by cognitive processes observed in animals. To enable future thinking functionality, we first develop a multi-character policy that captures diverse characters with an ensemble of heterogeneous policies. Here, the character of an agent is defined as a different weight combination on reward components, representing distinct behavioral preferences. The future thinking agent collects observation-action trajectories of the target agents and uses the pre-trained multi-character policy to infer their characters. Once the character is inferred, the agent predicts the upcoming actions of target agents and simulates the potential future scenario. This capability allows the agent to adaptively select the optimal action, considering the predicted future scenario in multi-agent interactions. To evaluate the proposed mechanism, we consider the multi-agent autonomous driving scenario with diverse driving traits and multiple particle environments. Simulation results demonstrate that the EFT mechanism with accurate character inference leads to a higher reward than existing multi-agent solutions. We also confirm that the effect of reward improvement remains valid across societies with different levels of character diversity.
comment: NeurIPS 2024 (Web: https://sites.google.com/view/eftm-neurips2024)
Hierarchical Multi-agent Reinforcement Learning for Cyber Network Defense AAMAS
Recent advances in multi-agent reinforcement learning (MARL) have created opportunities to solve complex real-world tasks. Cybersecurity is a notable application area, where defending networks against sophisticated adversaries remains a challenging task typically performed by teams of security operators. In this work, we explore novel MARL strategies for building autonomous cyber network defenses that address challenges such as large policy spaces, partial observability, and stealthy, deceptive adversarial strategies. To facilitate efficient and generalized learning, we propose a hierarchical Proximal Policy Optimization (PPO) architecture that decomposes the cyber defense task into specific sub-tasks like network investigation and host recovery. Our approach involves training sub-policies for each sub-task using PPO enhanced with domain expertise. These sub-policies are then leveraged by a master defense policy that coordinates their selection to solve complex network defense tasks. Furthermore, the sub-policies can be fine-tuned and transferred with minimal cost to defend against shifts in adversarial behavior or changes in network settings. We conduct extensive experiments using CybORG Cage 4, the state-of-the-art MARL environment for cyber defense. Comparisons with multiple baselines across different adversaries show that our hierarchical learning approach achieves top performance in terms of convergence speed, episodic return, and several interpretable metrics relevant to cybersecurity, including the fraction of clean machines on the network, precision, and false positives on recoveries.
comment: 9 pages, 7 figures, AAMAS preprint
Self-Evolving Multi-Agent Collaboration Networks for Software Development
LLM-driven multi-agent collaboration (MAC) systems have demonstrated impressive capabilities in automatic software development at the function level. However, their heavy reliance on human design limits their adaptability to the diverse demands of real-world software development. To address this limitation, we introduce EvoMAC, a novel self-evolving paradigm for MAC networks. Inspired by traditional neural network training, EvoMAC obtains text-based environmental feedback by verifying the MAC network's output against a target proxy and leverages a novel textual backpropagation to update the network. To extend coding capabilities beyond function-level tasks to more challenging software-level development, we further propose rSDE-Bench, a requirement-oriented software development benchmark, which features complex and diverse software requirements along with automatic evaluation of requirement correctness. Our experiments show that: i) The automatic requirement-aware evaluation in rSDE-Bench closely aligns with human evaluations, validating its reliability as a software-level coding benchmark. ii) EvoMAC outperforms previous SOTA methods on both the software-level rSDE-Bench and the function-level HumanEval benchmarks, reflecting its superior coding capabilities. The benchmark can be downloaded at https://yuzhu-cai.github.io/rSDE-Bench/.
comment: 25 pages
Persistent synchronization of heterogeneous networks with time-dependent linear diffusive coupling
We study synchronization for linearly coupled temporal networks of heterogeneous time-dependent nonlinear agents via the convergence of attracting trajectories of each node. The results are obtained by constructing and studying the stability of a suitable linear nonautonomous problem bounding the evolution of the synchronization errors. Both the case of the entire network and that of only a cluster are addressed, and the persistence of the obtained synchronization against perturbation is also discussed. Furthermore, a sufficient condition for the existence of attracting trajectories of each node is given. In all cases, the considered dependence on time requires only local integrability, which is a very mild regularity assumption. Moreover, our results mainly depend on the network structure and its properties, and achieve synchronization up to a constant in finite time. Hence they are quite suitable for applications. The applicability of the results is showcased via several examples: coupled van-der-Pol/FitzHugh-Nagumo oscillators, weighted/signed opinion dynamics, and coupled Lorenz systems.
Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
As machine intelligence evolves, the need to test and compare the problem-solving abilities of different AI models grows. However, current benchmarks are often overly simplistic, allowing models to perform uniformly well, making it difficult to distinguish their capabilities. Additionally, benchmarks typically rely on static question-answer pairs, which models might memorize or guess. To address these limitations, we introduce the Dynamic Intelligence Assessment (DIA), a novel methodology for testing AI models using dynamic question templates and improved metrics across multiple disciplines such as mathematics, cryptography, cybersecurity, and computer science. The accompanying DIA-Bench dataset, which includes 150 diverse and challenging task templates with mutable parameters, is presented in various formats such as text, PDFs, compiled binaries, and visual puzzles. Our framework introduces four new metrics to assess a model's reliability and confidence across multiple attempts. These metrics revealed that even simple questions are frequently answered incorrectly when posed in varying forms, highlighting significant gaps in models' reliability. Notably, models like GPT-4o tended to overestimate their mathematical abilities, while ChatGPT-4o demonstrated better decision-making and performance through effective tool usage. We evaluated eight state-of-the-art large language models (LLMs) using DIA-Bench, showing that current models struggle with complex tasks and often display unexpectedly low confidence, even with simpler questions. The DIA framework sets a new standard for assessing not only problem-solving but also a model's adaptive intelligence and ability to assess its own limitations. The dataset is publicly available on our project's website.
LASER: Script Execution by Autonomous Agents for On-demand Traffic Simulation
Autonomous Driving Systems (ADS) require diverse and safety-critical traffic scenarios for effective training and testing, but the existing data generation methods struggle to provide flexibility and scalability. We propose LASER, a novel framework that leverages large language models (LLMs) to conduct traffic simulations based on natural language inputs. The framework operates in two stages: it first generates scripts from user-provided descriptions and then executes them using autonomous agents in real time. Validated in the CARLA simulator, LASER successfully generates complex, on-demand driving scenarios, significantly improving ADS training and testing data generation.
Efficient Reinforcement Learning for Global Decision Making in the Presence of Local Agents at Scale
We study reinforcement learning for global decision-making in the presence of local agents, where the global decision-maker makes decisions affecting all local agents, and the objective is to learn a policy that maximizes the joint rewards of all the agents. Such problems find many applications, e.g. demand response, EV charging, queueing, etc. In this setting, scalability has been a long-standing challenge due to the size of the state space which can be exponential in the number of agents. This work proposes the \texttt{SUBSAMPLE-Q} algorithm where the global agent subsamples $k\leq n$ local agents to compute a policy in time that is polynomial in $k$. We show that this learned policy converges to the optimal policy in the order of $\tilde{O}(1/\sqrt{k}+{\epsilon}_{k,m})$ as the number of sub-sampled agents $k$ increases, where ${\epsilon}_{k,m}$ is the Bellman noise. Finally, we validate the theory through numerical simulations in a demand-response setting and a queueing setting.
comment: 34 pages, 6 figures
Aligning Individual and Collective Objectives in Multi-Agent Cooperation NeurIPS 2024
Among the research topics in multi-agent learning, mixed-motive cooperation is one of the most prominent challenges, primarily due to the mismatch between individual and collective goals. The cutting-edge research is focused on incorporating domain knowledge into rewards and introducing additional mechanisms to incentivize cooperation. However, these approaches often face shortcomings such as the effort of manual design and the absence of theoretical grounding. To close this gap, we model the mixed-motive game as a differentiable game for the ease of illuminating the learning dynamics towards cooperation. In more detail, we introduce a novel optimization method named \textbf{\textit{A}}ltruistic \textbf{\textit{G}}radient \textbf{\textit{A}}djustment (\textbf{\textit{AgA}}) that employs gradient adjustments to progressively align individual and collective objectives. Furthermore, we theoretically prove that AgA effectively attracts gradients to stable fixed points of the collective objective while considering individual interests, and we validate these claims with empirical evidence. We evaluate the effectiveness of our algorithm AgA through benchmark environments for testing mixed-motive collaboration with small-scale agents such as the two-player public good game and the sequential social dilemma games, Cleanup and Harvest, as well as our self-developed large-scale environment in the game StarCraft II.
comment: 20 pages; Accepted by NeurIPS 2024
Systems and Control (CS)
Scalable spectral representations for network multiagent control
Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning due to the exponential growth of the global state-action space with the number of agents. In this work, utilizing the exponential decay property of network dynamics, we first derive scalable spectral local representations for network MDPs, which induces a network linear subspace for the local $Q$-function of each agent. Building on these local spectral representations, we design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm. Empirically, we validate the effectiveness of our scalable representation-based approach on two benchmark problems, and demonstrate the advantages of our approach over generic function approximation approaches to representing the local $Q$-functions.
Hierarchical Upper Confidence Bounds for Constrained Online Learning
The multi-armed bandit (MAB) problem is a foundational framework in sequential decision-making under uncertainty, extensively studied for its applications in areas such as clinical trials, online advertising, and resource allocation. Traditional MAB formulations, however, do not adequately capture scenarios where decisions are structured hierarchically, involve multi-level constraints, or feature context-dependent action spaces. In this paper, we introduce the hierarchical constrained bandits (HCB) framework, which extends the contextual bandit problem to incorporate hierarchical decision structures and multi-level constraints. We propose the hierarchical constrained upper confidence bound (HC-UCB) algorithm, designed to address the complexities of the HCB problem by leveraging confidence bounds within a hierarchical setting. Our theoretical analysis establishes sublinear regret bounds for HC-UCB and provides high-probability guarantees for constraint satisfaction at all hierarchical levels. Furthermore, we derive a minimax lower bound on the regret for the HCB problem, demonstrating the near-optimality of our algorithm. The results are significant for real-world applications where decision-making processes are inherently hierarchical and constrained, offering a robust and efficient solution that balances exploration and exploitation across multiple levels of decision-making.
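For readers unfamiliar with confidence-bound methods, the following is a minimal sketch of the classical UCB1 index for a plain multi-armed bandit. It is not HC-UCB: it has no hierarchy, no contexts, and no constraints, and the function names and the exploration constant are assumptions chosen for illustration.

```python
import math

# Classical UCB1 arm selection for an unconstrained multi-armed bandit.
# This only illustrates the "optimism under uncertainty" idea; it is not
# the hierarchical constrained algorithm described in the abstract.

def ucb_index(mean_reward, pulls, t):
    """Empirical mean plus an exploration bonus that shrinks with more pulls."""
    if pulls == 0:
        return math.inf  # force every arm to be tried at least once
    return mean_reward + math.sqrt(2.0 * math.log(t) / pulls)


def select_arm(pulls, means, t):
    """Return the arm with the largest UCB index at round t (ties: lowest id)."""
    return max(range(len(pulls)),
               key=lambda a: ucb_index(means[a], pulls[a], t))
```

An arm that has rarely been pulled gets a large bonus and is explored, while for equally sampled arms the choice falls back to the higher empirical mean.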
Risk-Averse Model Predictive Control for Racing in Adverse Conditions
Model predictive control (MPC) algorithms can be sensitive to model mismatch when used in challenging nonlinear control tasks. In particular, the performance of MPC for vehicle control at the limits of handling suffers when the underlying model overestimates the vehicle's capabilities. In this work, we propose a risk-averse MPC framework that explicitly accounts for uncertainty over friction limits and tire parameters. Our approach leverages a sample-based approximation of an optimal control problem with a conditional value at risk (CVaR) constraint. This sample-based formulation enables planning with a set of expressive vehicle dynamics models using different tire parameters. Moreover, this formulation enables efficient numerical resolution via sequential quadratic programming and GPU parallelization. Experiments on a Lexus LC 500 show that risk-averse MPC unlocks reliable performance, while a deterministic baseline that plans using a single dynamics model may lose control of the vehicle in adverse road conditions.
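The sample-based CVaR constraint mentioned above can be illustrated with a small sketch. The code below computes a common empirical CVaR estimator (the mean of the worst fraction of sampled costs) and checks it against a limit. How the paper actually formulates and solves its constraint is not stated in the abstract, so the estimator, the function names, and the idea of one cost per sampled tire model are assumptions.

```python
import math

# Empirical conditional value at risk (CVaR) over a set of sampled costs,
# e.g. one constraint-violation cost per sampled tire/friction model.
# The estimator and names are illustrative assumptions.

def empirical_cvar(costs, alpha):
    """Mean of the worst (1 - alpha) fraction of costs (higher = worse)."""
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(costs)
    # Number of tail samples; the small offset avoids float round-up errors.
    k = max(1, math.ceil((1.0 - alpha) * n - 1e-9))
    worst = sorted(costs, reverse=True)[:k]
    return sum(worst) / k


def satisfies_cvar_constraint(costs, alpha, limit):
    """True if the empirical CVaR at level alpha does not exceed the limit."""
    return empirical_cvar(costs, alpha) <= limit
```

With `alpha = 0` the estimator reduces to the plain mean, and as `alpha` approaches 1 it approaches the single worst sample, which is what makes the constraint risk-averse.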
Empowering the Grid: Decentralized Autonomous Control for Effective Utilization and Resilience
With the emergence of low-inertia microgrids powered by inverter-based generation, there remains a concern about the operational resilience of these systems. Grid-forming inverters (GFMs), enabled by various device-level (primary) and system-level (secondary) control methods, are poised to play a significant role in achieving certain operational objectives, such as the effective utilization of clean energy resources while maintaining stability. However, despite the recent advances in GFMs, there is a lack of suitable controls that can ascertain resilience-constrained operations, like maintaining critical operational safety limits during transients under various cyber-physical disruptions. In this work, we develop decentralized autonomous controllers (DACs) that enforce resilience-constrained operation via local, minimally invasive adjustments (e.g., changes in set-points) while co-existing within the hierarchy of existing (primary and secondary) controls. The DACs work autonomously by sensing only local GFM measurements and act only when operational resilience constraints are violated. The proposed DAC scheme is computationally efficient (only algebraic computations), which enables fast, real-time execution. We demonstrate the efficacy of the proposed control framework in GridLAB-D-HELICS-based control-grid co-simulations on the IEEE 123-node networked microgrid. Finally, we show how the developed DACs empower the grid by utilizing the available resources entirely to ensure resilience (maintaining safe frequency limits).
comment: This paper is currently under review in a journal
Learning Load Balancing with GNN in MPTCP-Enabled Heterogeneous Networks
Hybrid light fidelity (LiFi) and wireless fidelity (WiFi) networks are a promising paradigm of heterogeneous network (HetNet), attributed to the complementary physical properties of optical spectra and radio frequency. However, the current development of such HetNets is mostly bottlenecked by the existing transmission control protocol (TCP), which restricts the user equipment (UE) to connecting one access point (AP) at a time. While the ongoing investigation on multipath TCP (MPTCP) can bring significant benefits, it complicates the network topology of HetNets, making the existing load balancing (LB) learning models less effective. Driven by this, we propose a graph neural network (GNN)-based model to tackle the LB problem for MPTCP-enabled HetNets, which results in a partial mesh topology. Such a topology can be modeled as a graph, with the channel state information and data rate requirement embedded as node features, while the LB solutions are deemed as edge labels. Compared to the conventional deep neural network (DNN), the proposed GNN-based model exhibits two key strengths: i) it can better interpret a complex network topology; and ii) it can handle various numbers of APs and UEs with a single trained model. Simulation results show that against the traditional optimisation method, the proposed learning model can achieve near-optimal throughput within a gap of 11.5%, while reducing the inference time by 4 orders of magnitude. In contrast to the DNN model, the new method can improve the network throughput by up to 21.7%, at a similar inference time level.
A Hybrid Simulation of DNN-based Gray Box Models
Simulation is vital for engineering disciplines, as it enables the prediction and design of physical systems. However, the computational challenges inherent to large-scale simulations often arise from complex device models featuring high degrees of nonlinearities or hidden physical behaviors not captured by first principles. Gray-box models combine deep neural networks (DNNs) with physics-based models to address the computational challenges in modeling physical systems. A well-crafted gray box model capitalizes on the interpretability and accuracy of a physical model while incorporating DNNs to capture hidden physical behaviors and mitigate computational load associated with highly nonlinear components. Previously, gray box models have been constructed by defining an explicit combination of physics-based and DNN models to represent the behavior of sub-systems; however this cannot represent the coupled interactions within physical systems. We explore an implicit gray box model, where both DNNs and physical equations share a common set of state-variables. While this approach captures coupled interactions at the boundary of DNN and physics-based models, simulating the implicit gray box model remains an open-ended problem. In this work, we introduce a new hybrid simulation that integrates DNNs into the numerical solvers of simulation engines to fully simulate implicit gray box models of large physical systems. This is accomplished by backpropagating through the DNN to calculate Jacobian values during each iteration of the numerical method. The hybrid simulation improves the accuracy and runtime compared to physics-based simulation and enables reusable DNN models with lower data requirements. We explore the advantages of this approach as compared to physics-based, black box, and other gray box methods for simulating the steady-state and transient behavior of power systems.
Optimal gait design for nonlinear soft robotic crawlers
Soft robots offer a frontier in robotics with enormous potential for safe human-robot interaction and agility in uncertain environments. A steppingstone towards unlocking the potential of soft robotics is a tailored control theory, including a principled framework for gait design. We analyze the problem of optimal gait design for a soft crawling body, "the crawler". The crawler is an elastic body with the control signal defined as actuation forces between segments of the body. We consider the simplest such crawler: a two-segmented body with a passive mechanical connection modeling the viscoelastic body dynamics and a symmetric control force modeling actuation between the two body segments. The model accounts for the nonlinear asymmetric friction with the ground, which together with the symmetric actuation forces enable the crawler's locomotion. Using a describing-function analysis, we show that when the body is forced sinusoidally, the optimal actuator contraction frequency corresponds to the body's natural frequency when operating with only passive dynamics. We then use the framework of Optimal Periodic Control (OPC) to design optimal force cycles of arbitrary waveform and the corresponding crawling gaits. We provide a hill-climbing algorithm to solve the OPC problem numerically. Our proposed methods and results inform the design of optimal forcing and gaits for more complex and multi-segmented crawling bodies.
A Comparison of Baseline Models and a Transformer Network for SOC Prediction in Lithium-Ion Batteries
Accurately predicting the state of charge of Lithium-ion batteries is essential to the performance of battery management systems of electric vehicles. One of the main reasons for the slow global adoption of electric cars is driving range anxiety. The ability of a battery management system to accurately estimate the state of charge can help alleviate this problem. In this paper, a comparison between data-driven state-of-charge estimation methods is conducted. The paper compares different neural network-based models and common regression models for SOC estimation. These models include several ablated transformer networks, a neural network, a lasso regression model, a linear regression model and a decision tree. Results of various experiments conducted on data obtained from natural driving cycles of the BMW i3 battery show that the decision tree outperformed all other models including the more complex transformer network with self-attention and positional encoding.
On Optimal Battery Sizing for Electric Vehicles
In this paper, we introduce a quantitative framework to optimize electric vehicle (EV) battery capacities, considering two criteria: upfront vehicle cost and charging inconvenience cost. For this purpose, we (1) develop a comprehensive model for charging inconvenience costs, incorporating both charging time and detours, improving on existing studies, (2) show, through extensive simulations and analytical models, how charging inconvenience cost is affected by different battery capacity and charging infrastructure configurations, (3) introduce an optimisation framework to determine optimal battery capacities based on charging inconvenience and vehicle cost, and (4) show that optimal battery capacities can be influenced by strategic investments in charging infrastructure and tax/incentive policies. The proposed framework provides actionable insights into the sustainable design of EV systems, supporting the development of cost-effective and convenient electric mobility solutions.
Electrode SOC and SOH estimation with electrode-level ECMs
Being able to predict battery internal states that are related to battery degradation is a key aspect to improve battery lifetime and performance, enhancing cleaner electric transportation and energy generation. However, most present battery management systems (BMSs) use equivalent-circuit models (ECMs) for state of charge (SOC) and state of health (SOH) estimation. These models are not able to predict these aging-related variables, and therefore, they cannot be used to limit battery degradation. In this paper, we propose a method for electrode-level SOC (eSOC) and electrode-level SOH (eSOH) estimation using an electrode-level ECM (eECM). The method can produce estimates of the states of lithiation (SOL) of both electrodes and update the eSOH parameters to maintain estimation accuracy through the lifetime of the battery. Furthermore, the eSOH parameter estimates are used to obtain degradation mode information, which could be used to improve state estimation, health diagnosis and prognosis. The method was validated in simulation and experimentally.
Iterative Cut-Based PWA Approximation of Multi-Dimensional Nonlinear Systems
PieceWise Affine (PWA) approximations for nonlinear functions have been extensively used for tractable, computationally efficient control of nonlinear systems. However, reaching a desired approximation accuracy without prior information about the behavior of the nonlinear systems remains a challenge in the function approximation and control literature. As the name suggests, PWA approximation aims at approximating a nonlinear function or system by dividing the domain into multiple subregions where the nonlinear function or dynamics is approximated locally by an affine function, also called a local mode. Without prior knowledge of the form of the nonlinearity, the required number of modes, the locations of the subregions, and the local approximations need to be optimized simultaneously, which becomes highly complex for large-scale systems with multi-dimensional nonlinear functions. This paper introduces a novel approach for PWA approximation of multi-dimensional nonlinear systems, utilizing a hinging hyperplane formalism for cut-based partitioning of the domain. The complexity of the PWA approximation is iteratively increased until reaching the desired accuracy level. Further, the tractable cut definitions allow for different forms of subregions, as well as the ability to impose continuity constraints on the PWA approximation. The methodology is explained via multiple examples and its performance is compared to two existing approaches through case studies, showcasing its efficacy.
comment: 9 pages, 4 figures, submitted to journal
Cooperative Trajectory Planning: Principles for Human-Machine System Design on Trajectory Level
This paper explores cooperative trajectory planning approaches within the context of human-machine shared control. In shared control research, it is typically assumed that the human and the automation use the same reference trajectory to stabilize the coupled system. However, this assumption is often incorrect, as they usually follow different trajectories, causing control conflicts at the action level that have not been widely researched. To address this, it is logical to extend shared control concepts to include human-machine interaction at the trajectory-level before action execution, resulting in a unified reference trajectory for both human and automation. This paper begins with a literature overview on approaches of cooperative trajectory planning. It then presents an approach of finding a joint trajectory by modelling cooperative trajectory planning as an agreement process. A generally valid system structure is proposed for this purpose. Finally, it proposes concepts to implement cooperative trajectory planning as an agreement process.
Nature-inspired dynamic control for pursuit-evasion of robots
The pursuit-evasion problem is widespread in nature, engineering and societal applications. It is commonly observed in nature that a predator runs faster than its prey but has less agile maneuverability. Over millions of years of evolution, animals have developed effective and efficient pursuit and evasion strategies. In this paper, we provide a dynamic framework for pursuit-evasion of unicycle systems from a nature-inspired perspective. Firstly, for the problem with one pursuer and one evader, we propose an Alert-Turn control strategy which consists of two efficient ingredients: the sudden turning maneuver and the alert condition for starting and maintaining the maneuver. We present and analyze the escape and capture results at a lower level of a single run and at a higher level with respect to parameters' changes. A theorem with a sufficient condition for capture is also given. Then, the Alert-Turn strategy is combined with aggregation control laws and a target-changing mechanism to model more complex phenomena with multiple pursuers and evaders. By adjusting a selfish parameter, the aggregation control commands can achieve different escape patterns of evaders: cooperative mode, selfish mode, as well as their combinations, and the influence of the selfish parameter is quantified. We present the effects of the number of pursuers and the target-changing mechanism from a statistical perspective. Our findings are largely in line with observations in nature. Furthermore, our control strategies are verified by numerical simulations that replicate some chasing behaviors of animals in nature.
comment: 15 pages
Guiding Reinforcement Learning with Incomplete System Dynamics
Model-free reinforcement learning (RL) is inherently a reactive method, operating under the assumption that it starts with no prior knowledge of the system and entirely depends on trial-and-error for learning. This approach faces several challenges, such as poor sample efficiency, generalization, and the need for well-designed reward functions to guide learning effectively. On the other hand, controllers based on complete system dynamics do not require data. This paper addresses the intermediate situation where there is not enough model information for complete controller design, but there is enough to suggest that a model-free approach is not the best approach either. By carefully decoupling known and unknown information about the system dynamics, we obtain an embedded controller guided by our partial model and thus improve the learning efficiency of an RL-enhanced approach. A modular design allows us to deploy mainstream RL algorithms to refine the policy. Simulation results show that our method significantly improves sample efficiency compared with standard RL methods on continuous control tasks, and also offers enhanced performance over traditional control approaches. Experiments on a real ground vehicle also validate the performance of our method, including generalization and robustness.
Fast State-of-Health Estimation Method for Lithium-ion Battery using Sparse Identification of Nonlinear Dynamics
Lithium-ion batteries (LIBs) are utilized as a major energy source in various fields because of their high energy density and long lifespan. During repeated charging and discharging, the degradation of LIBs, which reduces their maximum power output and operating time, is a pivotal issue. This degradation can affect not only battery performance but also the safety of the system. Therefore, it is essential to accurately estimate the state-of-health (SOH) of the battery in real time. To address this problem, we propose a fast SOH estimation method that utilizes the sparse identification of nonlinear dynamics (SINDy) algorithm. SINDy can discover the governing equations of target systems from little data, assuming that only a few functions dominate the system's behavior. To choose the states of the degradation model, correlation analysis is suggested. Using SINDy and correlation analysis, we can obtain a data-driven SOH model that improves the interpretability of the system. To validate the feasibility of the proposed method, the estimation performance of the SOH and the computation time are evaluated by comparing it with various machine learning algorithms.
Global Stability Notions to Enhance the Rigor and Robustness of Adaptive Control
Stability theory plays a crucial role in feedback control. However, adaptive control theory requires advanced and specialized stability notions that are not frequently used in standard feedback control theory. The present document is a set of notes for a graduate course. It describes the global stability notions needed in (robust) adaptive control and develops the mathematical tools that are used for the proof of such stability properties. Moreover, the document shows why and how these global stability properties arise in adaptive control. We focus on stability properties for time-invariant systems. Consequently, tracking control problems are not covered by the present document.
comment: 48 pages, 4 figures
FastGEMF: Scalable High-Speed Simulation of Stochastic Spreading Processes over Complex Multilayer Networks
Predicting the spread of processes across complex multi-layered networks has long challenged researchers due to the intricate interplay between network structure and propagation dynamics. Each layer of these networks possesses unique characteristics, further complicating analysis. To the authors' knowledge, a comprehensive framework capable of simulating various spreading processes across different layers, particularly in networks with millions of nodes and connections, has been notably absent. This study introduces a novel framework that efficiently simulates Markov chain processes over large-scale networks while significantly reducing time and space complexity. This approach enables exact simulation of spreading processes across extensive real-world multi-layer networks, accounting for diverse influencers on each layer. FastGEMF provides a baseline framework for exact simulation of stochastic spreading processes, facilitating comparative analysis of models across diverse domains, from epidemiology to social media dynamics.
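To make "exact simulation" of a Markov spreading process concrete, here is a minimal sketch of a naive Gillespie-style simulation of an SIS process on a single network layer. It is not FastGEMF's algorithm or API. The function name `gillespie_sis`, its parameters, and the choice of an SIS model are assumptions made for illustration, and this version recomputes every rate at each event, which is exactly the cost a scalable framework would aim to avoid.

```python
import random


def gillespie_sis(adj, infected, beta, delta, t_max, rng):
    """Naive exact (Gillespie) simulation of an SIS process on one layer.

    adj: dict mapping node -> list of neighbours (hypothetical input format).
    infected: iterable of initially infected nodes.
    beta: infection rate per infected neighbour; delta: recovery rate (> 0).
    Returns a list of (time, number_infected) pairs, one per event.
    """
    infected = set(infected)
    t = 0.0
    history = [(t, len(infected))]
    while infected:
        # Rate of every transition that is possible in the current state.
        rates = {}
        for node in adj:
            if node in infected:
                rates[node] = delta
            else:
                k = sum(1 for nb in adj[node] if nb in infected)
                if k:
                    rates[node] = beta * k
        total = sum(rates.values())
        # The waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # The node that changes state is chosen with probability rate / total.
        nodes = list(rates)
        node = rng.choices(nodes, weights=[rates[n] for n in nodes])[0]
        if node in infected:
            infected.remove(node)
        else:
            infected.add(node)
        history.append((t, len(infected)))
    return history
```

For instance, `gillespie_sis({i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}, {0}, 1.0, 0.5, 5.0, random.Random(0))` simulates a six-node ring with one initially infected node.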
Graph Neural Network-Accelerated Network-Reconfigured Optimal Power Flow
Optimal power flow (OPF) has been used for real-time grid operations. Prior efforts demonstrated that utilizing flexibility from dynamic topologies will improve grid efficiency. However, this will convert the linear OPF into a mixed-integer linear programming network-reconfigured OPF (NR-OPF) problem, substantially increasing the computing time. Thus, a machine learning (ML)-based approach, particularly utilizing graph neural network (GNN), is proposed to accelerate the solution process. The GNN model is trained offline to predict the best topology before entering the optimization stage. In addition, this paper proposes an offline pre-ML filter layer to reduce GNN model size and training time while improving its accuracy. A fast online post-ML selection layer is also proposed to analyze GNN predictions and then select a subset of predicted NR solutions with high confidence. Case studies have demonstrated superior performance of the proposed GNN-accelerated NR-OPF method augmented with the proposed pre-ML and post-ML layers.
AI-focused HPC Data Centers Can Provide More Power Grid Flexibility and at Lower Cost
The recent growth of Artificial Intelligence (AI), particularly large language models, requires energy-demanding high-performance computing (HPC) data centers, which poses a significant burden on power system capacity. Scheduling data center computing jobs to manage power demand can alleviate network stress with minimal infrastructure investment and contribute to fast time-scale power system balancing. This study, for the first time, comprehensively analyzes the capability and cost of grid flexibility provision by GPU-heavy AI-focused HPC data centers, along with a comparison with CPU-heavy general-purpose HPC data centers traditionally used for scientific computing. Using real-world data from 7 AI-focused HPC data centers, 7 general-purpose HPC data centers, and 3 cloud platforms, we find that AI-focused HPC data centers can offer greater flexibility at 50% lower cost for a range of power system services. By comparing the cost to flexibility market prices, we illustrate the financial profitability of flexibility provision for AI-focused HPC data centers.
comment: 22 pages (including supplementary materials and references), under review for Joule
Geometric Graph Neural Network Modeling of Human Interactions in Crowded Environments
Modeling human trajectories in crowded environments is challenging due to the complex nature of pedestrian behavior and interactions. This paper proposes a geometric graph neural network (GNN) architecture that integrates domain knowledge from psychological studies to model pedestrian interactions and predict future trajectories. Unlike prior studies using complete graphs, we define interaction neighborhoods using pedestrians' field of view, motion direction, and distance-based kernel functions to construct graph representations of crowds. Evaluations across multiple datasets demonstrate improved prediction accuracy through reduced average and final displacement error metrics. Our findings underscore the importance of integrating domain knowledge with data-driven approaches for effective modeling of human interactions in crowds.
comment: © 2024 the authors. This work has been accepted to IFAC for publication under a Creative Commons Licence CC-BY-NC-ND
Invisible Manipulation: Deep Reinforcement Learning Enhanced Stealthy Attacks on Battery Energy Management Systems
This paper introduces "invisible manipulation", an innovative cyber-attack mechanism achieved through strategically timed stealthy false data injection attacks (SFDIAs). By stealthily manipulating measurements of a critical asset prior to the target time period, the attacker can subtly guide the engineering system toward a predetermined operational state without detection. Using the battery energy management system (BEMS) as a case study, we employ deep reinforcement learning (DRL) to generate synthetic measurements, such as battery voltage and current, that align closely with actual measurements. These synthetic measurements, falling within the acceptable error margin of the residual-based bad data detection algorithm provided by state estimation, can evade detection and mislead Extended Kalman-filter-based State of Charge estimation. Subsequently, treating the deceptive data as valid inputs, the BEMS will operate the battery energy storage system (BESS) towards the attacker's desired operational states when the targeted time period comes. The use of the DRL-based scheme allows us to convert an online optimization problem into an offline training process, thereby alleviating the computational burden for real-time implementation. Comprehensive testing on a high-fidelity microgrid real-time simulation testbed validates the effectiveness and adaptability of the proposed methods in achieving different attack objectives.
Preserving Privacy in Cloud-based Data-Driven Stabilization
In recent years, we have observed three significant trends in control systems: a renewed interest in data-driven control design, the abundance of cloud computational services, and the importance of preserving privacy for the system under control. Motivated by these factors, this work investigates privacy-preserving outsourcing for the design of a stabilizing controller for unknown linear time-invariant systems. The main objective of this research is to preserve the privacy of the system dynamics by designing an outsourcing mechanism. To achieve this goal, we propose a scheme that combines transformation-based techniques and robust data-driven control design methods. The scheme preserves the privacy of both the open-loop and closed-loop system matrices while stabilizing the system under control. The scheme is applicable to data both with and without disturbance and is lightweight in terms of computational overhead. Numerical investigations for a case study demonstrate the impacts of our mechanism and its role in hindering malicious adversaries from achieving their goals.
User Experience Evaluation of AR Assisted Industrial Maintenance and Support Applications
The paper introduces an innovative approach to industrial maintenance leveraging augmented reality (AR) technology, focusing on enhancing the user experience and efficiency. The shift from traditional to proactive maintenance strategies underscores the significance of maintenance in industrial systems. The proposed solution integrates AR interfaces, particularly through Head-Mounted Display (HMD) devices, to provide expert personnel-aided decision support for maintenance technicians, with the association of Artificial Intelligence (AI) solutions. The study explores the user experience aspect of AR interfaces in a simulated industrial environment, aiming to improve the maintenance processes' intuitiveness and effectiveness. Evaluation metrics such as the NASA Task Load Index (NASA-TLX) and the System Usability Scale (SUS) are employed to assess the usability, performance, and workload implications of the AR maintenance system. Additionally, the paper discusses the technical implementation, methodology, and results of experiments conducted to evaluate the effectiveness of the proposed solution.
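As a reminder of how one of the metrics named above is computed, the following sketch scores a single System Usability Scale questionnaire using the standard SUS scoring rule. The function name `sus_score` and the input format (a list of ten responses on a 1 to 5 scale, in questionnaire order) are illustrative assumptions and are not taken from the paper.

```python
def sus_score(responses):
    """Score one 10-item SUS questionnaire; returns a value in [0, 100]."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("expected ten responses, each an integer from 1 to 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute response - 1.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute 5 - response.
            total += 5 - response
    # The summed contributions (0 to 40) are rescaled to 0 to 100.
    return total * 2.5
```

A participant who answers 3 to every item receives `sus_score([3] * 10) == 50.0`.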
Heuristic Search for Linear Positive Systems
This work considers infinite-horizon optimal control of positive linear systems applied to the case of network routing problems. We demonstrate the equivalence between Stochastic Shortest Path (SSP) problems and optimal control of a certain class of linear systems. This is used to construct a heuristic search framework for this class of linear systems inspired by existing methods for SSP. We propose a heuristics-based algorithm for finding local solutions to the analyzed class of optimal control problems with positive state and linear dynamics. More fundamentally, the results allow for analysis of the conditions for explicit solutions to the Bellman equation utilized by heuristic search methods.
comment: Preprint submitted to Automatica for review
Hierarchical Deep Learning Model for Degradation Prediction per Look-Ahead Scheduled Battery Usage Profile
Batteries can effectively improve the security of energy systems and mitigate climate change by facilitating wind and solar power. The installed capacity of battery energy storage systems (BESS), mainly lithium-ion batteries, has increased significantly in recent years. However, battery degradation cannot be accurately quantified and integrated into energy management systems with existing heuristic battery degradation models. This paper proposes a hierarchical deep learning based battery degradation quantification (HDL-BDQ) model to quantify the battery degradation given scheduled BESS daily operations. In particular, two sequential and cohesive deep neural networks are proposed to accurately estimate the degree of degradation from battery operational profiles; the model significantly outperforms existing fixed or linear rate based degradation models as well as single-stage deep neural models. Training results show the high accuracy of the proposed system. Moreover, a learning and optimization decoupled algorithm is implemented to strategically take advantage of the proposed HDL-BDQ model in optimization-based look-ahead scheduling (LAS) problems. Case studies demonstrate the effectiveness of the proposed HDL-BDQ model in LAS of a microgrid testbed.
comment: 12 pages
Nonlinear Magnetics Model for Permanent Magnet Synchronous Machines Capturing Saturation and Temperature Effects
This paper proposes a nonlinear magnetics model for Permanent Magnet Synchronous Machines (PMSMs) that accurately captures the effects of magnetic saturation in the machine iron and variations in rotor temperature on the permanent magnet excitation. The proposed model considers the permanent magnet as a current source rather than the more commonly used flux-linkage source. A comparison of the two modelling approaches is conducted using Finite Element Analysis (FEA) for different machine designs as well as experimental validation, where it is shown that the proposed model has substantially better accuracy. The proposed model decouples magnetic saturation and rotor temperature effects in the current/flux-linkage relationship, allowing for adaptive estimation of the PM excitation.
k-Dimensional Agreement in Multiagent Systems
Given a network of agents, we study the problem of designing a distributed algorithm that computes k independent weighted means of the network's initial conditions (namely, the agents agree on a k-dimensional space). Akin to average consensus, this problem finds applications in distributed computing and sensing, where agents seek to simultaneously evaluate k independent functions at a common point by running a single coordination algorithm. We show that linear algorithms can agree on quantities that are oblique projections of the vector of initial conditions, and we provide techniques to design protocols that are compatible with a pre-specified communication graph. More broadly, our results show that a single agreement algorithm can solve $k$ consensus problems simultaneously at a fraction of the complexity of classical approaches but, in general, it requires higher network connectivity.
Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC ICRA
This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior works that combine high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by integrating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we train an RL agent to solve the pose reaching task in simulation, then incorporate the trained neural network critic as both the stage and terminal cost of an MPC. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real wheel loader, where we successfully navigate to various goal poses.
comment: Submitted to International Conference on Robotics and Automation (ICRA) 2025
Developing Path Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
In autonomous driving, end-to-end methods utilizing Imitation Learning (IL) and Reinforcement Learning (RL) are becoming more and more common. However, they do not involve explicit reasoning like the classic robotics workflow or planning with horizons, resulting in implicit and myopic strategies. In this paper, we introduce a path planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) for static obstacle nudging. It outputs lateral offset values to adjust the given reference waypoints and produces a modified path for different controllers. Experimental results show that the algorithm can perform path following that mimics the expert performance of path-tracking controllers and avoid collisions with fixed obstacles. The method makes a good attempt at planning with learning-based methods in path planning problems of autonomous driving.
comment: 6 pages, 8 figures
IoT-Based Water Quality Monitoring System in Philippine Off-Grid Communities
Contaminated and polluted water poses significant threats to human health, necessitating vigilant monitoring of water sources for potential contamination. This paper introduces a low-cost Internet of Things (IoT)-based water quality monitoring system designed to address water quality challenges in rural communities, as demonstrated through a case study conducted in the Philippines. The system consists of two core components. The hardware component of the system, built on Arduino technology and featuring real-time data transmission, focuses on monitoring pH levels, turbidity, and temperature via sensors. The system is equipped to transmit data to a cloud database and send informative messages to mobile numbers, updating users on the status of water supplies. The application component acts as a user interface for accessing and managing data collected by the sensors. The successful deployment of this Water Quality Monitoring (WQM) system not only helps community leaders and health workers monitor water sources but also underscores its potential to empower communities in safeguarding their water sources, thereby contributing to the advancement of clean and safe water access.
comment: Proceedings of the 2024 9th International Conference on Business and Industrial Research, May 2024, Bangkok, Thailand
Efficient pseudometrics for data-driven comparisons of nonlinear dynamical systems
Computationally efficient solutions for pseudometrics quantifying deviation from topological conjugacy between dynamical systems are presented. Deviation from conjugacy is quantified in a Pareto optimal sense that accounts for spectral properties of Koopman operators as well as trajectory geometry. Theoretical justification is provided for computing such pseudometrics in Koopman eigenfunction space rather than observable space. Furthermore, it is shown that theoretical consistency with topological conjugacy can be maintained when restricting the search for optimal transformations between systems to the unitary group. Therefore, the pseudometrics are based on analytical solutions for unitary transformations in Koopman eigenfunction space. Geometric considerations for the deviation from conjugacy Pareto optimality problem are used to develop scalar pseudometrics that account for all possible optimal solutions given just two Pareto points. The approach is demonstrated on two example problems: the first is a simple benchmarking problem and the second is an engineering example comparing the dynamics of morphological computation of biological nonlinear muscle actuators to simplified man-made (including bioinspired) approaches. The benefits of considering operator and trajectory geometry based dissimilarity measures in a unified and consistent formalism are demonstrated. Overall, the deviation from conjugacy pseudometrics provide practical advantages in terms of efficiency and scalability, while maintaining theoretical consistency.
comment: minor edits
Design Space Exploration of Embedded SoC Architectures for Real-Time Optimal Control
Empowering resource-limited robots to execute computationally intensive tasks such as locomotion and manipulation is challenging. This project provides a comprehensive design space exploration to determine optimal hardware computation architectures suitable for model-based control algorithms. We profile and optimize representative architectural designs across general-purpose scalar, vector processors, and specialized accelerators. Specifically, we compare CPUs, vector machines, and domain-specialized accelerators with kernel-level benchmarks and end-to-end representative robotic workloads. Our exploration provides a quantitative performance, area, and utilization comparison and analyzes the trade-offs between these representative distinct architectural designs. We demonstrate that architectural modifications, software, and system optimization can alleviate bottlenecks and enhance utilization. Finally, we propose a code generation flow to simplify the engineering work for mapping robotic workloads to specialized architectures.
Systems and Control (EESS)
Scalable spectral representations for network multiagent control
Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning due to the exponential growth of the global state-action space with the number of agents. In this work, utilizing the exponential decay property of network dynamics, we first derive scalable spectral local representations for network MDPs, which induce a network linear subspace for the local $Q$-function of each agent. Building on these local spectral representations, we design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm. Empirically, we validate the effectiveness of our scalable representation-based approach on two benchmark problems, and demonstrate the advantages of our approach over generic function approximation approaches to representing the local $Q$-functions.
Hierarchical Upper Confidence Bounds for Constrained Online Learning
The multi-armed bandit (MAB) problem is a foundational framework in sequential decision-making under uncertainty, extensively studied for its applications in areas such as clinical trials, online advertising, and resource allocation. Traditional MAB formulations, however, do not adequately capture scenarios where decisions are structured hierarchically, involve multi-level constraints, or feature context-dependent action spaces. In this paper, we introduce the hierarchical constrained bandits (HCB) framework, which extends the contextual bandit problem to incorporate hierarchical decision structures and multi-level constraints. We propose the hierarchical constrained upper confidence bound (HC-UCB) algorithm, designed to address the complexities of the HCB problem by leveraging confidence bounds within a hierarchical setting. Our theoretical analysis establishes sublinear regret bounds for HC-UCB and provides high-probability guarantees for constraint satisfaction at all hierarchical levels. Furthermore, we derive a minimax lower bound on the regret for the HCB problem, demonstrating the near-optimality of our algorithm. The results are significant for real-world applications where decision-making processes are inherently hierarchical and constrained, offering a robust and efficient solution that balances exploration and exploitation across multiple levels of decision-making.
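For readers unfamiliar with confidence-bound methods, the sketch below shows the classical flat UCB1 rule that such algorithms build on. It is not the HC-UCB algorithm, and it has no hierarchy, context, or constraints. The names `ucb1` and `pull`, and the exploration bonus sqrt(2 ln t / n), are the textbook choices and are assumptions with respect to this paper.

```python
import math


def ucb1(pull, n_arms, horizon):
    """Run classical UCB1 for `horizon` rounds.

    pull: callable taking an arm index and returning a reward in [0, 1].
    Returns (counts, means): pulls per arm and empirical mean reward per arm.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play every arm once so that each empirical mean is defined.
            arm = t - 1
        else:
            # Choose the arm with the largest optimistic estimate of its mean.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the empirical mean.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means
```

With deterministic rewards, for example `ucb1(lambda a: [0.2, 0.8][a], 2, 200)`, the better arm receives most of the pulls while the worse arm is still sampled occasionally as its confidence bonus grows.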
Risk-Averse Model Predictive Control for Racing in Adverse Conditions
Model predictive control (MPC) algorithms can be sensitive to model mismatch when used in challenging nonlinear control tasks. In particular, the performance of MPC for vehicle control at the limits of handling suffers when the underlying model overestimates the vehicle's capabilities. In this work, we propose a risk-averse MPC framework that explicitly accounts for uncertainty over friction limits and tire parameters. Our approach leverages a sample-based approximation of an optimal control problem with a conditional value at risk (CVaR) constraint. This sample-based formulation enables planning with a set of expressive vehicle dynamics models using different tire parameters. Moreover, this formulation enables efficient numerical resolution via sequential quadratic programming and GPU parallelization. Experiments on a Lexus LC 500 show that risk-averse MPC unlocks reliable performance, while a deterministic baseline that plans using a single dynamics model may lose control of the vehicle in adverse road conditions.
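The sample-based CVaR idea can be illustrated with a simple empirical estimator: the conditional value at risk at level alpha is approximated by the average of the worst (1 - alpha) fraction of sampled losses. This is a generic sketch, not the paper's optimal control formulation. The function names `cvar` and `satisfies_cvar_constraint` and the notion of a per-sample "loss" (for example, a constraint violation evaluated under one sampled set of tire parameters) are assumptions made for illustration.

```python
import math


def cvar(losses, alpha):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of the losses."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(losses, reverse=True)
    # Number of tail samples, always at least one.
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:k]) / k


def satisfies_cvar_constraint(losses, alpha, limit):
    """Check a risk constraint of the form CVaR_alpha(loss) <= limit."""
    return cvar(losses, alpha) <= limit
```

With `alpha = 0` the estimator reduces to the plain sample mean, and as `alpha` approaches 1 it approaches the worst sampled loss, which is why a CVaR constraint sits between an expectation constraint and a worst-case constraint.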
Empowering the Grid: Decentralized Autonomous Control for Effective Utilization and Resilience
With the emergence of low-inertia microgrids powered by inverter-based generation, there remains a concern about the operational resilience of these systems. Grid-forming inverters (GFMs), enabled by various device-level (primary) and system-level (secondary) control methods, are poised to play a significant role in achieving certain operational objectives, such as the effective utilization of clean energy resources while maintaining stability. However, despite the recent advances in GFMs, there is a lack of suitable controls that can ensure resilience-constrained operations, like maintaining critical operational safety limits during transients under various cyber-physical disruptions. In this work, we develop decentralized autonomous controllers (DACs) that enforce resilience-constrained operation via local, minimally invasive adjustments (e.g., changes in set-points) while co-existing within the hierarchy of existing (primary and secondary) controls. The DACs work autonomously by sensing only local GFM measurements and act only when operational resilience constraints are violated. The proposed DAC scheme is computationally efficient (only algebraic computations), which enables fast, real-time execution. We demonstrate the efficacy of the proposed control framework on GridLAB-D-HELICS-based control-grid co-simulations on the IEEE 123-node networked microgrid. Finally, we show how the developed DACs empower the grid by utilizing the available resources entirely to ensure resilience (maintaining safe frequency limits).
comment: This paper is currently under review in a journal
Learning Load Balancing with GNN in MPTCP-Enabled Heterogeneous Networks
Hybrid light fidelity (LiFi) and wireless fidelity (WiFi) networks are a promising paradigm of heterogeneous network (HetNet), attributed to the complementary physical properties of optical spectra and radio frequency. However, the current development of such HetNets is mostly bottlenecked by the existing transmission control protocol (TCP), which restricts the user equipment (UE) to connecting one access point (AP) at a time. While the ongoing investigation on multipath TCP (MPTCP) can bring significant benefits, it complicates the network topology of HetNets, making the existing load balancing (LB) learning models less effective. Driven by this, we propose a graph neural network (GNN)-based model to tackle the LB problem for MPTCP-enabled HetNets, which results in a partial mesh topology. Such a topology can be modeled as a graph, with the channel state information and data rate requirement embedded as node features, while the LB solutions are deemed as edge labels. Compared to the conventional deep neural network (DNN), the proposed GNN-based model exhibits two key strengths: i) it can better interpret a complex network topology; and ii) it can handle various numbers of APs and UEs with a single trained model. Simulation results show that against the traditional optimisation method, the proposed learning model can achieve near-optimal throughput within a gap of 11.5%, while reducing the inference time by 4 orders of magnitude. In contrast to the DNN model, the new method can improve the network throughput by up to 21.7%, at a similar inference time level.
A Hybrid Simulation of DNN-based Gray Box Models
Simulation is vital for engineering disciplines, as it enables the prediction and design of physical systems. However, the computational challenges inherent to large-scale simulations often arise from complex device models featuring high degrees of nonlinearities or hidden physical behaviors not captured by first principles. Gray-box models combine deep neural networks (DNNs) with physics-based models to address the computational challenges in modeling physical systems. A well-crafted gray box model capitalizes on the interpretability and accuracy of a physical model while incorporating DNNs to capture hidden physical behaviors and mitigate computational load associated with highly nonlinear components. Previously, gray box models have been constructed by defining an explicit combination of physics-based and DNN models to represent the behavior of sub-systems; however this cannot represent the coupled interactions within physical systems. We explore an implicit gray box model, where both DNNs and physical equations share a common set of state-variables. While this approach captures coupled interactions at the boundary of DNN and physics-based models, simulating the implicit gray box model remains an open-ended problem. In this work, we introduce a new hybrid simulation that integrates DNNs into the numerical solvers of simulation engines to fully simulate implicit gray box models of large physical systems. This is accomplished by backpropagating through the DNN to calculate Jacobian values during each iteration of the numerical method. The hybrid simulation improves the accuracy and runtime compared to physics-based simulation and enables reusable DNN models with lower data requirements. We explore the advantages of this approach as compared to physics-based, black box, and other gray box methods for simulating the steady-state and transient behavior of power systems.
Optimal gait design for nonlinear soft robotic crawlers
Soft robots offer a frontier in robotics with enormous potential for safe human-robot interaction and agility in uncertain environments. A steppingstone towards unlocking the potential of soft robotics is a tailored control theory, including a principled framework for gait design. We analyze the problem of optimal gait design for a soft crawling body, "the crawler". The crawler is an elastic body with the control signal defined as actuation forces between segments of the body. We consider the simplest such crawler: a two-segmented body with a passive mechanical connection modeling the viscoelastic body dynamics and a symmetric control force modeling actuation between the two body segments. The model accounts for the nonlinear asymmetric friction with the ground, which together with the symmetric actuation forces enable the crawler's locomotion. Using a describing-function analysis, we show that when the body is forced sinusoidally, the optimal actuator contraction frequency corresponds to the body's natural frequency when operating with only passive dynamics. We then use the framework of Optimal Periodic Control (OPC) to design optimal force cycles of arbitrary waveform and the corresponding crawling gaits. We provide a hill-climbing algorithm to solve the OPC problem numerically. Our proposed methods and results inform the design of optimal forcing and gaits for more complex and multi-segmented crawling bodies.
A Comparison of Baseline Models and a Transformer Network for SOC Prediction in Lithium-Ion Batteries
Accurately predicting the state of charge of Lithium-ion batteries is essential to the performance of battery management systems of electric vehicles. One of the main reasons for the slow global adoption of electric cars is driving range anxiety. The ability of a battery management system to accurately estimate the state of charge can help alleviate this problem. In this paper, a comparison between data-driven state-of-charge estimation methods is conducted. The paper compares different neural network-based models and common regression models for SOC estimation. These models include several ablated transformer networks, a neural network, a lasso regression model, a linear regression model and a decision tree. Results of various experiments conducted on data obtained from natural driving cycles of the BMW i3 battery show that the decision tree outperformed all other models including the more complex transformer network with self-attention and positional encoding.
On Optimal Battery Sizing for Electric Vehicles
In this paper, we introduce a quantitative framework to optimize electric vehicle (EV) battery capacities, considering two criteria: upfront vehicle cost and charging inconvenience cost. For this purpose, we (1) develop a comprehensive model for charging inconvenience costs, incorporating both charging time and detours, improving on existing studies, (2) show, through extensive simulations and analytical models, how charging inconvenience cost is affected by different battery capacity and charging infrastructure configurations, (3) introduce an optimisation framework to determine optimal battery capacities based on charging inconvenience and vehicle cost, and (4) show that optimal battery capacities can be influenced by strategic investments in charging infrastructure and tax/incentive policies. The proposed framework provides actionable insights into the sustainable design of EV systems, supporting the development of cost-effective and convenient electric mobility solutions.
Electrode SOC and SOH estimation with electrode-level ECMs
Being able to predict battery internal states that are related to battery degradation is a key aspect to improve battery lifetime and performance, enhancing cleaner electric transportation and energy generation. However, most present battery management systems (BMSs) use equivalent-circuit models (ECMs) for state of charge (SOC) and state of health (SOH) estimation. These models are not able to predict these aging-related variables, and therefore, they cannot be used to limit battery degradation. In this paper, we propose a method for electrode-level SOC (eSOC) and electrode-level SOH (eSOH) estimation using an electrode-level ECM (eECM). The method can produce estimates of the states of lithiation (SOL) of both electrodes and update the eSOH parameters to maintain estimation accuracy through the lifetime of the battery. Furthermore, the eSOH parameter estimates are used to obtain degradation mode information, which could be used to improve state estimation, health diagnosis and prognosis. The method was validated in simulation and experimentally.
Iterative Cut-Based PWA Approximation of Multi-Dimensional Nonlinear Systems
PieceWise Affine (PWA) approximations for nonlinear functions have been extensively used for tractable, computationally efficient control of nonlinear systems. However, reaching a desired approximation accuracy without prior information about the behavior of the nonlinear systems remains a challenge in the function approximation and control literature. As the name suggests, PWA approximation aims at approximating a nonlinear function or system by dividing the domain into multiple subregions where the nonlinear function or dynamics is approximated locally by an affine function, also called a local mode. Without prior knowledge of the form of the nonlinearity, the required number of modes, the locations of the subregions, and the local approximations need to be optimized simultaneously, which becomes highly complex for large-scale systems with multi-dimensional nonlinear functions. This paper introduces a novel approach for PWA approximation of multi-dimensional nonlinear systems, utilizing a hinging hyperplane formalism for cut-based partitioning of the domain. The complexity of the PWA approximation is iteratively increased until reaching the desired accuracy level. Further, the tractable cut definitions allow for different forms of subregions, as well as the ability to impose continuity constraints on the PWA approximation. The methodology is explained via multiple examples and its performance is compared to two existing approaches through case studies, showcasing its efficacy.
comment: 9 pages, 4 figures, submitted to journal
Cooperative Trajectory Planning: Principles for Human-Machine System Design on Trajectory Level
This paper explores cooperative trajectory planning approaches within the context of human-machine shared control. In shared control research, it is typically assumed that the human and the automation use the same reference trajectory to stabilize the coupled system. However, this assumption is often incorrect, as they usually follow different trajectories, causing control conflicts at the action level that have not been widely researched. To address this, it is logical to extend shared control concepts to include human-machine interaction at the trajectory-level before action execution, resulting in a unified reference trajectory for both human and automation. This paper begins with a literature overview on approaches of cooperative trajectory planning. It then presents an approach of finding a joint trajectory by modelling cooperative trajectory planning as an agreement process. A generally valid system structure is proposed for this purpose. Finally, it proposes concepts to implement cooperative trajectory planning as an agreement process.
Nature-inspired dynamic control for pursuit-evasion of robots
The pursuit-evasion problem is widespread in nature, engineering and societal applications. It is commonly observed in nature that a predator runs faster than its prey but is less agile in maneuvering. Over millions of years of evolution, animals have developed effective and efficient pursuit and evasion strategies. In this paper, we provide a dynamic framework for pursuit-evasion of unicycle systems from a nature-inspired perspective. Firstly, for the problem with one pursuer and one evader, we propose an Alert-Turn control strategy which consists of two efficient ingredients: the sudden turning maneuver and the alert condition for starting and maintaining the maneuver. We present and analyze the escape and capture results at a lower level of a single run and at a higher level with respect to parameters' changes. A theorem with a sufficient condition for capture is also given. Then, the Alert-Turn strategy is combined with aggregation control laws and a target-changing mechanism to model more complex phenomena with multiple pursuers and evaders. By adjusting a selfish parameter, the aggregation control commands can achieve different escape patterns of evaders: cooperative mode, selfish mode, as well as their combinations, and the influence of the selfish parameter is quantified. We present the effects of the number of pursuers and the target-changing mechanism from a statistical perspective. Our findings are largely in line with observations in nature. Furthermore, our control strategies are verified by numerical simulations that replicate some chasing behaviors of animals in nature.
comment: 15 pages
Guiding Reinforcement Learning with Incomplete System Dynamics
Model-free reinforcement learning (RL) is inherently a reactive method, operating under the assumption that it starts with no prior knowledge of the system and entirely depends on trial-and-error for learning. This approach faces several challenges, such as poor sample efficiency, generalization, and the need for well-designed reward functions to guide learning effectively. On the other hand, controllers based on complete system dynamics do not require data. This paper addresses the intermediate situation where there is not enough model information for complete controller design, but there is enough to suggest that a model-free approach is not the best approach either. By carefully decoupling known and unknown information about the system dynamics, we obtain an embedded controller guided by our partial model and thus improve the learning efficiency of an RL-enhanced approach. A modular design allows us to deploy mainstream RL algorithms to refine the policy. Simulation results show that our method significantly improves sample efficiency compared with standard RL methods on continuous control tasks, and also offers enhanced performance over traditional control approaches. Experiments on a real ground vehicle also validate the performance of our method, including generalization and robustness.
Fast State-of-Health Estimation Method for Lithium-ion Battery using Sparse Identification of Nonlinear Dynamics
Lithium-ion batteries (LIBs) are utilized as a major energy source in various fields because of their high energy density and long lifespan. During repeated charging and discharging, the degradation of LIBs, which reduces their maximum power output and operating time, is a pivotal issue. This degradation can affect not only battery performance but also the safety of the system. Therefore, it is essential to accurately estimate the state-of-health (SOH) of the battery in real time. To address this problem, we propose a fast SOH estimation method that utilizes the sparse identification of nonlinear dynamics (SINDy) algorithm. SINDy can discover the governing equations of target systems from little data, assuming that only a few functions dominate the behavior of the system. To select the state of the degradation model, correlation analysis is suggested. Using SINDy and correlation analysis, we can obtain a data-driven SOH model that improves the interpretability of the system. To validate the feasibility of the proposed method, the estimation performance of the SOH and the computation time are evaluated by comparing it with various machine learning algorithms.
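The sparsity assumption behind SINDy can be illustrated with a minimal sketch of sequentially thresholded least squares, the regression step commonly used in SINDy. This is not the authors' implementation: the function name `stlsq`, the candidate library [1, x, x^2], the threshold, and the synthetic decay dx/dt = -0.5 x are all assumptions chosen for illustration.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, iterations=10):
    """Sequentially thresholded least squares: sparse fit of dxdt = theta @ xi."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(iterations):
        small = np.abs(xi) < threshold      # coefficients treated as inactive
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit using only the library terms that survived the threshold
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

# Synthetic, noise-free data (an assumption): a state decaying as dx/dt = -0.5 x
t = np.linspace(0.0, 4.0, 200)
x = np.exp(-0.5 * t)
dxdt = -0.5 * x                                        # exact derivative
theta = np.column_stack([np.ones_like(x), x, x**2])    # candidate library [1, x, x^2]
xi = stlsq(theta, dxdt)
```

In this toy setting only the coefficient of the linear term survives, so the recovered model is the single-term equation that generated the data. Real SOH data would need numerically estimated derivatives and a library chosen for the degradation physics.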
Global Stability Notions to Enhance the Rigor and Robustness of Adaptive Control
Stability theory plays a crucial role in feedback control. However, adaptive control theory requires advanced and specialized stability notions that are not frequently used in standard feedback control theory. The present document is a set of notes for a graduate course. It describes the global stability notions needed in (robust) adaptive control and develops the mathematical tools that are used for the proof of such stability properties. Moreover, the document shows why and how these global stability properties arise in adaptive control. We focus on stability properties for time-invariant systems. Consequently, tracking control problems are not covered by the present document.
comment: 48 pages, 4 figures
FastGEMF: Scalable High-Speed Simulation of Stochastic Spreading Processes over Complex Multilayer Networks
Predicting the spread of processes across complex multi-layered networks has long challenged researchers due to the intricate interplay between network structure and propagation dynamics. Each layer of these networks possesses unique characteristics, further complicating analysis. To the authors' knowledge, a comprehensive framework capable of simulating various spreading processes across different layers, particularly in networks with millions of nodes and connections, has been notably absent. This study introduces a novel framework that efficiently predicts Markov Chain processes over large-scale networks, while significantly reducing time and space complexity. This approach enables exact simulation of spreading processes across extensive real-world multi-layer networks, accounting for diverse influencers on each layer. FastGEMF provides a baseline framework for exact simulation of stochastic spreading processes, facilitating comparative analysis of models across diverse domains, from epidemiology to social media dynamics.
Graph Neural Network-Accelerated Network-Reconfigured Optimal Power Flow
Optimal power flow (OPF) has been used for real-time grid operations. Prior efforts demonstrated that utilizing flexibility from dynamic topologies will improve grid efficiency. However, this will convert the linear OPF into a mixed-integer linear programming network-reconfigured OPF (NR-OPF) problem, substantially increasing the computing time. Thus, a machine learning (ML)-based approach, particularly utilizing a graph neural network (GNN), is proposed to accelerate the solution process. The GNN model is trained offline to predict the best topology before entering the optimization stage. In addition, this paper proposes an offline pre-ML filter layer to reduce GNN model size and training time while improving its accuracy. A fast online post-ML selection layer is also proposed to analyze GNN predictions and then select a subset of predicted NR solutions with high confidence. Case studies have demonstrated superior performance of the proposed GNN-accelerated NR-OPF method augmented with the proposed pre-ML and post-ML layers.
AI-focused HPC Data Centers Can Provide More Power Grid Flexibility and at Lower Cost
The recent growth of Artificial Intelligence (AI), particularly large language models, requires energy-demanding high-performance computing (HPC) data centers, which poses a significant burden on power system capacity. Scheduling data center computing jobs to manage power demand can alleviate network stress with minimal infrastructure investment and contribute to fast time-scale power system balancing. This study, for the first time, comprehensively analyzes the capability and cost of grid flexibility provision by GPU-heavy AI-focused HPC data centers, along with a comparison with CPU-heavy general-purpose HPC data centers traditionally used for scientific computing. Using real-world data from 7 AI-focused HPC data centers, 7 general-purpose HPC data centers, and 3 cloud platforms, we find that AI-focused HPC data centers can offer greater flexibility at 50% lower cost for a range of power system services. By comparing the cost to flexibility market prices, we illustrate the financial profitability of flexibility provision for AI-focused HPC data centers.
comment: 22 pages (including supplementary materials and references), under review for Joule
Geometric Graph Neural Network Modeling of Human Interactions in Crowded Environments
Modeling human trajectories in crowded environments is challenging due to the complex nature of pedestrian behavior and interactions. This paper proposes a geometric graph neural network (GNN) architecture that integrates domain knowledge from psychological studies to model pedestrian interactions and predict future trajectories. Unlike prior studies using complete graphs, we define interaction neighborhoods using pedestrians' field of view, motion direction, and distance-based kernel functions to construct graph representations of crowds. Evaluations across multiple datasets demonstrate improved prediction accuracy through reduced average and final displacement error metrics. Our findings underscore the importance of integrating domain knowledge with data-driven approaches for effective modeling of human interactions in crowds.
comment: © 2024 the authors. This work has been accepted to IFAC for publication under a Creative Commons Licence CC-BY-NC-ND
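The average and final displacement error metrics mentioned in the abstract above are commonly computed as follows. This is a minimal sketch under assumed conventions: trajectories are stored as arrays of shape (pedestrians, time steps, 2), and the function name `displacement_errors` and the toy data are hypothetical.

```python
import numpy as np

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for arrays of shape (num_pedestrians, horizon, 2)."""
    dist = np.linalg.norm(pred - truth, axis=-1)  # Euclidean error per pedestrian and step
    ade = dist.mean()                             # averaged over pedestrians and time steps
    fde = dist[:, -1].mean()                      # averaged over pedestrians at the final step
    return ade, fde

# Toy data (an assumption): two pedestrians, three prediction steps
truth = np.zeros((2, 3, 2))
pred = truth.copy()
pred[0, :, 0] = [1.0, 2.0, 3.0]   # the first pedestrian's prediction drifts along x
ade, fde = displacement_errors(pred, truth)
```

Here the per-step errors are 1, 2, 3 for the first pedestrian and 0 for the second, so the average displacement error is 1.0 and the final displacement error is 1.5.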
Invisible Manipulation Deep Reinforcement Learning Enhanced Stealthy Attacks on Battery Energy Management Systems
This paper introduces "invisible manipulation", an innovative cyber-attack mechanism achieved through strategically timed stealthy false data injection attacks (SFDIAs). By stealthily manipulating measurements of a critical asset prior to the target time period, the attacker can subtly guide the engineering system toward a predetermined operational state without detection. Using the battery energy management system (BEMS) as a case study, we employ deep reinforcement learning (DRL) to generate synthetic measurements, such as battery voltage and current, that align closely with actual measurements. These synthetic measurements, falling within the acceptable error margin of the residual-based bad data detection algorithm provided by state estimation, can evade detection and mislead Extended Kalman-filter-based State of Charge estimation. Subsequently, considering the deceptive data as valid inputs, the BEMS will operate the BESS towards the attacker's desired operational states when the targeted time period comes. The use of the DRL-based scheme allows us to convert an online optimization problem into an offline training process, thereby alleviating the computational burden for real-time implementation. Comprehensive testing on a high-fidelity microgrid real-time simulation testbed validates the effectiveness and adaptability of the proposed methods in achieving different attack objectives.
Preserving Privacy in Cloud-based Data-Driven Stabilization
In recent years, we have observed three significant trends in control systems: a renewed interest in data-driven control design, the abundance of cloud computational services and the importance of preserving privacy for the system under control. Motivated by these factors, this work investigates privacy-preserving outsourcing for the design of a stabilizing controller for unknown linear time-invariant systems. The main objective of this research is to preserve the privacy of the system dynamics by designing an outsourcing mechanism. To achieve this goal, we propose a scheme that combines transformation-based techniques and robust data-driven control design methods. The scheme preserves the privacy of both the open-loop and closed-loop system matrices while stabilizing the system under control. The scheme is applicable to data both with and without disturbance and is lightweight in terms of computational overhead. Numerical investigations for a case study demonstrate the impacts of our mechanism and its role in hindering malicious adversaries from achieving their goals.
User Experience Evaluation of AR Assisted Industrial Maintenance and Support Applications
The paper introduces an innovative approach to industrial maintenance leveraging augmented reality (AR) technology, focusing on enhancing the user experience and efficiency. The shift from traditional to proactive maintenance strategies underscores the significance of maintenance in industrial systems. The proposed solution integrates AR interfaces, particularly through Head-Mounted Display (HMD) devices, to provide expert personnel-aided decision support for maintenance technicians, with the association of Artificial Intelligence (AI) solutions. The study explores the user experience aspect of AR interfaces in a simulated industrial environment, aiming to improve the maintenance processes' intuitiveness and effectiveness. Evaluation metrics such as the NASA Task Load Index (NASA-TLX) and the System Usability Scale (SUS) are employed to assess the usability, performance, and workload implications of the AR maintenance system. Additionally, the paper discusses the technical implementation, methodology, and results of experiments conducted to evaluate the effectiveness of the proposed solution.
Heuristic Search for Linear Positive Systems
This work considers infinite-horizon optimal control of positive linear systems applied to the case of network routing problems. We demonstrate the equivalence between Stochastic Shortest Path (SSP) problems and optimal control of a certain class of linear systems. This is used to construct a heuristic search framework for this class of linear systems inspired by existing methods for SSP. We propose a heuristics-based algorithm for finding local solutions to the analyzed class of optimal control problems with positive state and linear dynamics. More fundamentally, the results allow for analysis of the conditions for explicit solutions to the Bellman equation utilized by heuristic search methods.
comment: Preprint submitted to Automatica for review
Hierarchical Deep Learning Model for Degradation Prediction per Look-Ahead Scheduled Battery Usage Profile
Batteries can effectively improve the security of energy systems and mitigate climate change by facilitating wind and solar power. The installed capacity of battery energy storage systems (BESS), mainly lithium-ion batteries, has increased significantly in recent years. However, battery degradation cannot be accurately quantified and integrated into energy management systems with existing heuristic battery degradation models. This paper proposes a hierarchical deep learning based battery degradation quantification (HDL-BDQ) model to quantify the battery degradation given scheduled BESS daily operations. Particularly, two sequential and cohesive deep neural networks are proposed to accurately estimate the degree of degradation using inputs of battery operational profiles, and the model can significantly outperform existing fixed or linear rate based degradation models as well as single-stage deep neural models. Training results show the high accuracy of the proposed system. Moreover, a learning and optimization decoupled algorithm is implemented to strategically take advantage of the proposed HDL-BDQ model in optimization-based look-ahead scheduling (LAS) problems. Case studies demonstrate the effectiveness of the proposed HDL-BDQ model in LAS of a microgrid testbed.
comment: 12 pages
Nonlinear Magnetics Model for Permanent Magnet Synchronous Machines Capturing Saturation and Temperature Effects
This paper proposes a nonlinear magnetics model for Permanent Magnet Synchronous Machines (PMSMs) that accurately captures the effects of magnetic saturation in the machine iron and variations in rotor temperature on the permanent magnet excitation. The proposed model considers the permanent magnet as a current source rather than the more commonly used flux-linkage source. A comparison of the two modelling approaches is conducted using Finite Element Analysis (FEA) for different machine designs as well as experimental validation, where it is shown that the proposed model has substantially better accuracy. The proposed model decouples magnetic saturation and rotor temperature effects in the current/flux-linkage relationship, allowing for adaptive estimation of the PM excitation.
k-Dimensional Agreement in Multiagent Systems
Given a network of agents, we study the problem of designing a distributed algorithm that computes k independent weighted means of the network's initial conditions (namely, the agents agree on a k-dimensional space). Akin to average consensus, this problem finds applications in distributed computing and sensing, where agents seek to simultaneously evaluate k independent functions at a common point by running a single coordination algorithm. We show that linear algorithms can agree on quantities that are oblique projections of the vector of initial conditions, and we provide techniques to design protocols that are compatible with a pre-specified communication graph. More broadly, our results show that a single agreement algorithm can solve $k$ consensus problems simultaneously at a fraction of the complexity of classical approaches but, in general, it requires higher network connectivity.
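The claim that a linear algorithm can agree on an oblique projection of the initial conditions can be checked numerically with a small sketch. The construction below is an illustrative assumption, not the paper's protocol design: the iteration matrix `W` is built from a hand-picked eigenbasis `V` with k eigenvalues at 1 and the rest inside the unit disc, and compatibility with a communication graph is ignored.

```python
import numpy as np

n, k = 5, 2
# Hypothetical eigenbasis: unit upper-triangular, hence invertible and non-orthogonal
V = np.eye(n) + 0.5 * np.triu(np.ones((n, n)), 1)
V_inv = np.linalg.inv(V)
eigs = np.array([1.0, 1.0, 0.6, 0.4, 0.2])   # k eigenvalues at 1, the others decay
W = V @ np.diag(eigs) @ V_inv                # linear iteration x(t+1) = W x(t)

x0 = np.arange(1.0, n + 1.0)                 # arbitrary initial conditions
x = x0.copy()
for _ in range(200):
    x = W @ x

# Limit of W^t: projection onto span(V[:, :k]) along span(V[:, k:])
P = V[:, :k] @ V_inv[:k, :]
```

After enough iterations `x` matches `P @ x0`, where `P` is idempotent with rank k but not symmetric, which is what makes the projection oblique rather than orthogonal.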
Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC ICRA
This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior works that combine high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by integrating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we train an RL agent to solve the pose reaching task in simulation, then incorporate the trained neural network critic as both the stage and terminal cost of an MPC. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real wheel loader, where we successfully navigate to various goal poses.
comment: Submitted to International Conference on Robotics and Automation (ICRA) 2025
Developing Path Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
In autonomous driving, end-to-end methods utilizing Imitation Learning (IL) and Reinforcement Learning (RL) are becoming more and more common. However, they do not involve explicit reasoning like the classic robotics workflow or planning with horizons, resulting in implicit and myopic strategies. In this paper, we introduce a path planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) for static obstacle nudging. It outputs lateral offset values to adjust the given reference waypoints and produces a modified path for different controllers. Experimental results show that the algorithm can perform path following that mimics the expert performance of path-tracking controllers, and can avoid collisions with fixed obstacles. The method makes a good attempt at planning with learning-based methods in path planning problems of autonomous driving.
comment: 6 pages, 8 figures
IoT-Based Water Quality Monitoring System in Philippine Off-Grid Communities
Contaminated and polluted water poses significant threats to human health, necessitating vigilant monitoring of water sources for potential contamination. This paper introduces a low-cost Internet of Things (IoT)-based water quality monitoring system designed to address water quality challenges in rural communities, as demonstrated through a case study conducted in the Philippines. The system consists of two core components. The hardware component of the system, built on Arduino technology and featuring real-time data transmission, focuses on monitoring pH levels, turbidity, and temperature via sensors. The system is equipped to transmit data to a cloud database and send informative messages to mobile numbers, updating users on the status of water supplies. The application component acts as a user interface for accessing and managing data collected by the sensors. The successful deployment of this Water Quality Monitoring (WQM) system not only helps community leaders and health workers monitor water sources but also underscores its potential to empower communities in safeguarding their water sources, thereby contributing to the advancement of clean and safe water access.
comment: Proceedings of the 2024 9th International Conference on Business and Industrial Research, May 2024, Bangkok, Thailand
Efficient pseudometrics for data-driven comparisons of nonlinear dynamical systems
Computationally efficient solutions for pseudometrics quantifying deviation from topological conjugacy between dynamical systems are presented. Deviation from conjugacy is quantified in a Pareto optimal sense that accounts for spectral properties of Koopman operators as well as trajectory geometry. Theoretical justification is provided for computing such pseudometrics in Koopman eigenfunction space rather than observable space. Furthermore, it is shown that theoretical consistency with topological conjugacy can be maintained when restricting the search for optimal transformations between systems to the unitary group. Therefore the pseudometrics are based on analytical solutions for unitary transformations in Koopman eigenfunction space. Geometric considerations for the deviation from conjugacy Pareto optimality problem are used to develop scalar pseudometrics that account for all possible optimal solutions given just two Pareto points. The approach is demonstrated on two example problems: the first being a simple benchmarking problem and the second an engineering example comparing the dynamics of morphological computation of biological nonlinear muscle actuators to simplified man-made (including bioinspired) approaches. The benefits of considering operator and trajectory geometry based dissimilarity measures in a unified and consistent formalism are demonstrated. Overall, the deviation from conjugacy pseudometrics provide practical advantages in terms of efficiency and scalability, while maintaining theoretical consistency.
comment: minor edits
Design Space Exploration of Embedded SoC Architectures for Real-Time Optimal Control
Empowering resource-limited robots to execute computationally intensive tasks such as locomotion and manipulation is challenging. This project provides a comprehensive design space exploration to determine optimal hardware computation architectures suitable for model-based control algorithms. We profile and optimize representative architectural designs across general-purpose scalar, vector processors, and specialized accelerators. Specifically, we compare CPUs, vector machines, and domain-specialized accelerators with kernel-level benchmarks and end-to-end representative robotic workloads. Our exploration provides a quantitative performance, area, and utilization comparison and analyzes the trade-offs between these representative distinct architectural designs. We demonstrate that architectural modifications, software, and system optimization can alleviate bottlenecks and enhance utilization. Finally, we propose a code generation flow to simplify the engineering work for mapping robotic workloads to specialized architectures.
Robotics
Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos
We present Agent-to-Sim (ATS), a framework for learning interactive behavior models of 3D agents from casual longitudinal video collections. Different from prior works that rely on marker-based tracking and multiview cameras, ATS learns natural behaviors of animal and human agents non-invasively through video observations recorded over a long time-span (e.g., a month) in a single environment. Modeling 3D behavior of an agent requires persistent 3D tracking (e.g., knowing which point corresponds to which) over a long time period. To obtain such data, we develop a coarse-to-fine registration method that tracks the agent and the camera over time through a canonical 3D space, resulting in a complete and persistent spacetime 4D representation. We then train a generative model of agent behaviors using paired data of perception and motion of an agent queried from the 4D reconstruction. ATS enables real-to-sim transfer from video recordings of an agent to an interactive behavior simulator. We demonstrate results on pets (e.g., cat, dog, bunny) and humans given monocular RGBD videos captured by a smartphone.
comment: Project page: https://gengshan-y.github.io/agent2sim-www/
CoT-TL: Low-Resource Temporal Knowledge Representation of Planning Instructions Using Chain-of-Thought Reasoning IROS 2024
Autonomous agents often face the challenge of interpreting uncertain natural language instructions for planning tasks. Representing these instructions as Linear Temporal Logic (LTL) enables planners to synthesize actionable plans. We introduce CoT-TL, a data-efficient in-context learning framework for translating natural language specifications into LTL representations. CoT-TL addresses the limitations of large language models, which typically rely on extensive fine-tuning data, by extending chain-of-thought reasoning and semantic roles to align with the requirements of formal logic creation. This approach enhances the transparency and rationale behind LTL generation, fostering user trust. CoT-TL achieves state-of-the-art accuracy across three diverse datasets in low-data scenarios, outperforming existing methods without fine-tuning or intermediate translations. To improve reliability and minimize hallucinations, we incorporate model checking to validate the syntax of the generated LTL output. We further demonstrate CoT-TL's effectiveness through ablation studies and evaluations on unseen LTL structures and formulas in a new dataset. Finally, we validate CoT-TL's practicality by integrating it into a QuadCopter for multi-step drone planning based on natural language instructions.
comment: Accepted for publication in Proceedings of the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024), Abu Dhabi 14-18 October 2024
LASER: Script Execution by Autonomous Agents for On-demand Traffic Simulation
Autonomous Driving Systems (ADS) require diverse and safety-critical traffic scenarios for effective training and testing, but existing data generation methods struggle to provide flexibility and scalability. We propose LASER, a novel framework that leverages large language models (LLMs) to conduct traffic simulations based on natural language inputs. The framework operates in two stages: it first generates scripts from user-provided descriptions and then executes them using autonomous agents in real time. Validated in the CARLA simulator, LASER successfully generates complex, on-demand driving scenarios, significantly improving ADS training and testing data generation.
Continuum Robot Shape Estimation Using Magnetic Ball Chains
Shape sensing of medical continuum robots is important both for closed-loop control and for enabling the clinician to visualize the robot inside the body. There is a need for inexpensive but accurate shape sensing technologies. This paper proposes the use of magnetic ball chains as a means of generating shape-specific magnetic fields that can be detected by an external array of Hall effect sensors. Such a ball chain, encased in a flexible polymer sleeve, could be inserted inside the lumen of any continuum robot to provide real-time shape feedback. The sleeve could be removed, as needed, during the procedure to enable use of the entire lumen. To investigate this approach, a shape-sensing model for a steerable catheter tip is derived and observability and sensitivity analyses are presented. Experiments show a maximum estimation error of 7.1% and a mean error of 2.9% of the tip position with respect to total length.
ARCADE: Scalable Demonstration Collection and Generation via Augmented Reality for Imitation Learning
Robot Imitation Learning (IL) is a crucial technique in robot learning, where agents learn by mimicking human demonstrations. However, IL encounters scalability challenges stemming from both non-user-friendly demonstration collection methods and the extensive time required to amass a sufficient number of demonstrations for effective training. In response, we introduce the Augmented Reality for Collection and generAtion of DEmonstrations (ARCADE) framework, designed to scale up demonstration collection for robot manipulation tasks. Our framework combines two key capabilities: 1) it leverages AR to make demonstration collection as simple as users performing daily tasks using their hands, and 2) it enables the automatic generation of additional synthetic demonstrations from a single human-derived demonstration, significantly reducing user effort and time. We assess ARCADE's performance on a real Fetch robot across three robotics tasks: 3-Waypoints-Reach, Push, and Pick-And-Place. Using our framework, we were able to rapidly train a policy using vanilla Behavioral Cloning (BC), a classic IL algorithm, which excelled across these three tasks. We also deploy ARCADE on a real household task, Pouring-Water, achieving an 80% success rate.
Analyzing Closed-loop Training Techniques for Realistic Traffic Agent Models in Autonomous Highway Driving Simulations
Simulation plays a crucial role in the rapid development and safe deployment of autonomous vehicles. Realistic traffic agent models are indispensable for bridging the gap between simulation and the real world. Many existing approaches for imitating human behavior are based on learning from demonstration. However, these approaches are often constrained by focusing on individual training strategies. Therefore, to foster a broader understanding of realistic traffic agent modeling, in this paper, we provide an extensive comparative analysis of different training principles, with a focus on closed-loop methods for highway driving simulation. We experimentally compare (i) open-loop vs. closed-loop multi-agent training, (ii) adversarial vs. deterministic supervised training, (iii) the impact of reinforcement losses, and (iv) the impact of training alongside log-replayed agents to identify suitable training techniques for realistic agent modeling. Furthermore, we identify promising combinations of different closed-loop training methods.
comment: 15 pages, 6 figures, 4 tables
Learning Quadrotor Control From Visual Features Using Differentiable Simulation
The sample inefficiency of reinforcement learning (RL) remains a significant challenge in robotics. RL requires large-scale simulation and can still incur long training times, slowing down research and innovation. This issue is particularly pronounced in vision-based control tasks where reliable state estimates are not accessible. Differentiable simulation offers an alternative by enabling gradient back-propagation through the dynamics model, providing low-variance analytical policy gradients and, hence, higher sample efficiency. However, its usage for real-world robotic tasks has so far been limited. This work demonstrates the great potential of differentiable simulation for learning quadrotor control. We show that training in differentiable simulation significantly outperforms model-free RL in terms of both sample efficiency and training time, allowing a policy to learn to recover a quadrotor in seconds when providing vehicle state and in minutes when relying solely on visual features. The key to our success is two-fold. First, the use of a simple surrogate model for gradient computation greatly accelerates training without sacrificing control performance. Second, combining state representation learning with policy learning enhances convergence speed in tasks where only visual features are observable. These findings highlight the potential of differentiable simulation for real-world robotics and offer a compelling alternative to conventional RL approaches.
comment: Under Submission
Diffusion Transformer Policy
Recent large visual-language action models pretrained on diverse robot datasets have demonstrated the potential for generalizing to new environments with a small amount of in-domain data. However, those approaches usually predict discretized or continuous actions by a small action head, which limits their ability to handle diverse action spaces. In contrast, we model the continuous action with a large multi-modal diffusion transformer, dubbed Diffusion Transformer Policy, in which we directly denoise action chunks by a large transformer model rather than a small action head. By leveraging the scaling capability of transformers, the proposed approach can effectively model continuous end-effector actions across large diverse robot datasets, and achieve better generalization performance. Extensive experiments demonstrate that Diffusion Transformer Policy pretrained on diverse robot data can generalize to different embodiments, including simulation environments like Maniskill2 and Calvin, as well as the real-world Franka arm. Specifically, without bells and whistles, the proposed approach achieves state-of-the-art performance with only a single third-view camera stream in the Calvin novel task setting (ABC->D), improving the average number of tasks completed in a row of 5 to 3.6, and the pretraining stage increases the success sequence length on Calvin by over 1.2. The code will be publicly available.
comment: Preprint
Neural Predictor for Flight Control with Payload
Aerial robots that transport suspended payloads, acting as a form of freely-floating manipulator, have attracted growing interest in recent years. However, prior information about the payload, such as its mass, is always hard to obtain accurately in practice. The force/torque caused by the payload and residual dynamics introduces unmodeled perturbations to the system, which negatively affect closed-loop performance. Unlike estimation-like methods, this paper proposes Neural Predictor, a learning-based approach that models the force/torque caused by the payload and residual dynamics as a dynamical system. This results in a hybrid model that includes both the first-principles dynamics and the learned dynamics. This hybrid model is then integrated into an MPC framework to improve closed-loop performance. The effectiveness of the proposed framework is verified extensively in both numerical simulations and real-world flight experiments. The results indicate that our approach can accurately capture the force/torque caused by the payload and residual dynamics, respond quickly to changes in them, and significantly improve closed-loop performance. In particular, Neural Predictor outperforms a state-of-the-art learning-based estimator and reduces the force and torque estimation errors by up to 66.15% and 33.33%, respectively, while using fewer samples.
comment: 8 pages
Fully distributed and resilient source seeking for robot swarms
We propose a self-contained, resilient and fully distributed solution for locating the maximum of an unknown 3D scalar field using a swarm of robots that travel at constant speeds. Unlike conventional reactive methods relying on gradient information, our methodology enables the swarm to determine an ascending direction so that it approaches the source with arbitrary precision. Our source-seeking solution consists of three algorithms. The first two algorithms run sequentially and distributively at a high frequency providing barycentric coordinates and the ascending direction respectively to the individual robots. The third algorithm is the individual control law for a robot to track the estimated ascending direction. We show that the two algorithms with higher frequency have an exponential convergence to their eventual values since they are based on the standard consensus protocol for first-order dynamical systems; their high frequency depends on how fast the robots travel through the scalar field. The robots are not constrained to any particular geometric formation, and we study both discrete and continuous distributions of robots within swarm shapes. The shape analysis reveals the resiliency of our approach as expected in robot swarms, i.e., by amassing robots we ensure the source-seeking functionality in the event of missing or misplaced individuals or even if the robot network splits into two or more disconnected subnetworks. In addition, we also enhance the robustness of the algorithm by presenting conditions for optimal swarm shapes, in the sense that the ascending directions can be closely parallel to the field's gradient. We exploit such an analysis so that the swarm can adapt to unknown environments by morphing its shape and maneuvering while still following an ascending direction.
comment: 15 pages, submitted version to T-RO. This version does not contain the field experiments. arXiv admin note: text overlap with arXiv:2309.02937
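The source-seeking abstract above attributes the exponential convergence of its two fast algorithms to the standard consensus protocol for first-order systems. As a rough illustration only, the sketch below runs a discretized version of that textbook protocol, x <- x - step * L x, on a small undirected graph. The function name, step size, and graph are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def consensus(x0, adjacency, step=0.2, iters=200):
    """Discrete-time consensus x <- x - step * L @ x on an undirected graph.

    Illustrative assumption: `step` must be below 2 / lambda_max(L) for the
    iteration to converge; on a connected graph every entry then approaches
    the average of x0 at an exponential rate.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x = x - step * (L @ x)              # each node only uses its neighbours' values
    return x

# Path graph with four robots: 0 - 1 - 2 - 3
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
estimate = consensus([0.0, 1.0, 2.0, 5.0], path)
```

Each robot only needs its neighbours' current values, which is what makes the update distributed; the resulting common value here is the mean of the initial values.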
Bench4Merge: A Comprehensive Benchmark for Merging in Realistic Dense Traffic with Micro-Interactive Vehicles
While the capabilities of autonomous driving have advanced rapidly, merging into dense traffic remains a significant challenge. Many motion planning methods for this scenario have been proposed, but they are hard to evaluate. Most existing closed-loop simulators rely on rule-based controls for other vehicles, which results in a lack of diversity and randomness, thus failing to accurately assess motion planning capabilities in highly interactive scenarios. Moreover, traditional evaluation metrics are insufficient for comprehensively evaluating the performance of merging in dense traffic. In response, we propose a closed-loop evaluation benchmark for assessing motion planning capabilities in merging scenarios. Our approach involves other vehicles trained on large-scale datasets with micro-behavioral characteristics that significantly enhance complexity and diversity. Additionally, we have restructured the evaluation mechanism by leveraging large language models to assess each autonomous vehicle merging onto the main road. Extensive experiments have demonstrated the advanced nature of this evaluation benchmark. Through this benchmark, we have obtained an evaluation of existing methods and identified common issues. The environment and vehicle motion planning models we have designed can be accessed at https://anonymous.4open.science/r/Bench4Merge-EB5D
comment: 6 pages, 7 figures, IEEE international conference on robotics and automation
Distributed Learning for UAV Swarms
Unmanned Aerial Vehicle (UAV) swarms are increasingly deployed in dynamic, data-rich environments for applications such as environmental monitoring and surveillance. These scenarios demand efficient data processing while maintaining privacy and security, making Federated Learning (FL) a promising solution. FL allows UAVs to collaboratively train global models without sharing raw data, but challenges arise due to the non-Independent and Identically Distributed (non-IID) nature of the data collected by UAVs. In this study, we show an integration of state-of-the-art FL methods into UAV swarm applications and investigate the performance of multiple aggregation methods (namely FedAvg, FedProx, FedOpt, and MOON) with a particular focus on tackling non-IID data on a variety of datasets, specifically MNIST for baseline performance, CIFAR10 for natural object classification, EuroSAT for environment monitoring, and CelebA for surveillance. These algorithms were selected to cover improved techniques on both client-side updates and global aggregation. Results show that while all algorithms perform comparably on IID data, their performance deteriorates significantly under non-IID conditions. FedProx demonstrated the most stable overall performance, emphasising the importance of regularising local updates in non-IID environments to mitigate drastic deviations in local models.
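To make the difference between global aggregation and regularised local updates concrete, here is a minimal sketch of the two textbook building blocks: FedAvg's sample-size-weighted averaging and a single FedProx local step with its proximal term. The function names, learning rate, and value of mu are illustrative assumptions and do not come from the study above.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors (FedAvg aggregation)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack([np.asarray(w, dtype=float) for w in client_weights])
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

def fedprox_local_step(w, w_global, grad, lr=0.1, mu=0.01):
    """One local gradient step on loss + (mu / 2) * ||w - w_global||^2.

    The extra term mu * (w - w_global) pulls the local model back toward the
    global model, which is how FedProx limits client drift on non-IID data.
    """
    w = np.asarray(w, dtype=float)
    w_global = np.asarray(w_global, dtype=float)
    grad = np.asarray(grad, dtype=float)
    return w - lr * (grad + mu * (w - w_global))

# Two hypothetical UAV clients holding 1 and 3 samples respectively.
global_model = fedavg([np.array([1.0, 1.0]), np.array([3.0, 3.0])], [1, 3])
```

With mu set to zero the local step reduces to plain gradient descent, so the same code path covers the FedAvg client update.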
Triplane Grasping: Efficient 6-DoF Grasping with Single RGB Images
Reliable object grasping is one of the fundamental tasks in robotics. However, determining grasping pose based on single-image input has long been a challenge due to limited visual information and the complexity of real-world objects. In this paper, we propose Triplane Grasping, a fast grasping decision-making method that relies solely on a single RGB-only image as input. Triplane Grasping creates a hybrid Triplane-Gaussian 3D representation through a point decoder and a triplane decoder, which produce an efficient and high-quality reconstruction of the object to be grasped to meet real-time grasping requirements. We propose to use an end-to-end network to generate 6-DoF parallel-jaw grasp distributions directly from 3D points in the point cloud as potential grasp contacts and anchor the grasp pose in the observed data. Experiments demonstrate that our method achieves rapid modeling and grasping pose decision-making for daily objects, and exhibits a high grasping success rate in zero-shot scenarios.
Safety-critical Control with Control Barrier Functions: A Hierarchical Optimization Framework
The control barrier function (CBF) has become a fundamental tool in safety-critical systems design since its invention. Typically, the quadratic optimization framework is employed to accommodate CBFs, control Lyapunov functions (CLFs), other constraints and nominal control design. However, the constrained optimization framework involves hyper-parameters to trade off different objectives and constraints, which, if not well-tuned beforehand, impact system performance and even lead to infeasibility. In this paper, we propose a hierarchical optimization framework that decomposes the multi-objective optimization problem into nested optimization sub-problems in a safety-first approach. The new framework addresses potential infeasibility on the premise of ensuring safety and performance as much as possible and applies easily in multi-certificate cases. With vivid visualization aids, we systematically analyze the advantages of our proposed method over existing QP-based ones in terms of safety, feasibility and convergence rates. Moreover, two numerical examples are provided that verify our analysis and show the superiority of our proposed method.
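For readers unfamiliar with the QP-based baseline this abstract contrasts itself with, the following is a minimal sketch of the standard single-constraint CBF safety filter, which has a closed-form solution. It is not the hierarchical framework proposed in the paper. The names are hypothetical, and the constraint is assumed to be written as a @ u >= b, where for a control-affine system one would typically take a = L_g h(x) and b = -L_f h(x) - alpha * h(x).

```python
import numpy as np

def cbf_safety_filter(u_nom, a, b):
    """Solve min ||u - u_nom||^2 subject to a @ u >= b in closed form.

    Assumes `a` is nonzero. If the nominal input already satisfies the
    constraint it is returned unchanged; otherwise it is projected onto the
    boundary of the half-space {u : a @ u >= b}.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    a = np.asarray(a, dtype=float)
    violation = b - a @ u_nom
    if violation <= 0.0:
        return u_nom
    return u_nom + (violation / (a @ a)) * a

# Nominal input violates the constraint u[0] >= 1 and is minimally corrected.
u_safe = cbf_safety_filter([0.0, 0.0], [1.0, 0.0], 1.0)
```

With several constraints or an added CLF term, the problem no longer has this simple projection form and weighting hyper-parameters enter, which is the tuning issue the abstract refers to.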
Robust Loop Closure by Textual Cues in Challenging Environments
Loop closure is an important task in robot navigation. However, existing methods mostly rely on some implicit or heuristic features of the environment, which can still fail to work in common environments such as corridors, tunnels, and warehouses. Indeed, navigating in such featureless, degenerative, and repetitive (FDR) environments would also pose a significant challenge even for humans, but explicit text cues in the surroundings often provide the best assistance. This inspires us to propose a multi-modal loop closure method based on explicit human-readable textual cues in FDR environments. Specifically, our approach first extracts scene text entities based on Optical Character Recognition (OCR), then creates a local map of text cues based on accurate LiDAR odometry and finally identifies loop closure events by a graph-theoretic scheme. Experiment results demonstrate that this approach has superior performance over existing methods that rely solely on visual and LiDAR sensors. To benefit the community, we release the source code and datasets at https://github.com/TongxingJin/TXTLCD.
Task-oriented Robotic Manipulation with Vision Language Models
Vision-Language Models (VLMs) play a crucial role in robotic manipulation by enabling robots to understand and interpret the visual properties of objects and their surroundings, allowing them to perform manipulation based on this multimodal understanding. However, understanding object attributes and spatial relationships is a non-trivial task but is critical in robotic manipulation tasks. In this work, we present a new dataset focused on spatial relationships and attribute assignment and a novel method to utilize VLMs to perform object manipulation with task-oriented, high-level input. In this dataset, the spatial relationships between objects are manually described as captions. Additionally, each object is labeled with multiple attributes, such as fragility, mass, material, and transparency, derived from a fine-tuned vision language model. The embedded object information from captions is automatically extracted and transformed into a data structure (in this case, a tree, for demonstration purposes) that captures the spatial relationships among the objects within each image. The tree structures, along with the object attributes, are then fed into a language model to transform into a new tree structure that determines how these objects should be organized in order to accomplish a specific (high-level) task. We demonstrate that our method not only improves the comprehension of spatial relationships among objects in the visual environment but also enables robots to interact with these objects more effectively. As a result, this approach significantly enhances spatial reasoning in robotic manipulation tasks. To our knowledge, this is the first method of its kind in the literature, offering a novel solution that allows robots to more efficiently organize and utilize objects in their surroundings.
Long-distance Geomagnetic Navigation in GNSS-denied Environments with Deep Reinforcement Learning
Geomagnetic navigation has drawn increasing attention for its capacity to navigate through complex environments and its independence from external navigation services like global navigation satellite systems (GNSS). Existing studies on geomagnetic navigation, i.e., matching navigation and bionic navigation, rely on pre-stored maps or extensive searches, leading to limited applicability or reduced navigation efficiency in unexplored areas. To address the issues with geomagnetic navigation in areas where GNSS is unavailable, this paper develops a deep reinforcement learning (DRL)-based mechanism, especially for long-distance geomagnetic navigation. The designed mechanism trains an agent to learn and gain the magnetoreception capacity for geomagnetic navigation, rather than using any pre-stored map or extensive and expensive searching approaches. Particularly, we integrate the geomagnetic gradient-based parallel approach into geomagnetic navigation. This integration mitigates the over-exploration of the learning agent by adjusting the geomagnetic gradient, such that the obtained gradient is aligned towards the destination. We explore the effectiveness of the proposed approach via detailed numerical simulations, where we implement twin delayed deep deterministic policy gradient (TD3) in realizing the proposed approach. The results demonstrate that our approach outperforms existing metaheuristic and bionic navigation methods in long-distance missions under diverse navigation conditions.
Assisted Physical Interaction: Autonomous Aerial Robots with Neural Network Detection, Navigation, and Safety Layers
The paper introduces a novel framework for safe and autonomous aerial physical interaction in industrial settings. It comprises two main components: a neural network-based target detection system enhanced with edge computing for reduced onboard computational load, and a control barrier function (CBF)-based controller for safe and precise maneuvering. The target detection system is trained on a dataset under challenging visual conditions and evaluated for accuracy across various unseen data with changing lighting conditions. Depth features are utilized for target pose estimation, with the entire detection framework offloaded into low-latency edge computing. The CBF-based controller enables the UAV to converge safely to the target for precise contact. Simulated evaluations of both the controller and target detection are presented, alongside an analysis of real-world detection performance.
comment: 8 pages, 14 figures, ICUAS 2024
Flying through Moving Gates without Full State Estimation
Autonomous drone racing requires powerful perception, planning, and control and has become a benchmark and test field for autonomous, agile flight. Existing work usually assumes static race tracks with known maps, which enables offline planning of time-optimal trajectories, performing localization to the gates to reduce the drift in visual-inertial odometry (VIO) for state estimation or training learning-based methods for the particular race track and operating environment. In contrast, many real-world tasks like disaster response or delivery need to be performed in unknown and dynamic environments. To close this gap and make drone racing more robust against unseen environments and moving gates, we propose a control algorithm that does not require a race track map or VIO and uses only monocular measurements of the line of sight (LOS) to the gates. For this purpose, we adopt the law of proportional navigation (PN) to accurately fly through the gates despite gate motions or wind. We formulate the PN-informed vision-based control problem for drone racing as a constrained optimization problem and derive a closed-form optimal solution. We demonstrate through extensive simulations and real-world experiments that our method can navigate through moving gates at high speeds while being robust to different gate movements, model errors, wind, and delays.
comment: 7 pages, 6 figures
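The moving-gates abstract above builds on the law of proportional navigation (PN). As background, the sketch below implements the classical planar PN guidance law, in which the commanded lateral acceleration is nav_gain * closing_speed * LOS_rate, applied perpendicular to the line of sight. This is the textbook form and not the paper's vision-based controller, which uses only monocular LOS measurements; the function name, the gain value, and the use of full relative position and velocity are assumptions made to keep the example self-contained.

```python
import numpy as np

def pn_acceleration(rel_pos, rel_vel, nav_gain=3.0):
    """Classical 2D proportional navigation command.

    rel_pos, rel_vel: gate position and velocity relative to the drone.
    Returns an acceleration vector perpendicular to the line of sight (LOS).
    """
    r = np.asarray(rel_pos, dtype=float)
    v = np.asarray(rel_vel, dtype=float)
    dist = np.linalg.norm(r)
    los_rate = (r[0] * v[1] - r[1] * v[0]) / dist**2   # d/dt of the LOS angle
    closing_speed = -np.dot(r, v) / dist               # positive when approaching
    normal = np.array([-r[1], r[0]]) / dist            # unit vector normal to the LOS
    return nav_gain * closing_speed * los_rate * normal

# Gate 10 m ahead, closing at 2 m/s while drifting sideways at 1 m/s.
cmd = pn_acceleration([10.0, 0.0], [-2.0, 1.0])
```

When the LOS angle is not rotating, the command is zero, which reflects the PN idea that a constant bearing with decreasing range already leads to interception.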
Design of a Flexible Robot Arm for Safe Aerial Physical Interaction
This paper introduces a novel compliant mechanism combining light weight and energy dissipation for aerial physical interaction. Weighing 400 g at take-off, the mechanism is actuated in the forward body direction, enabling precise position control for force interaction and various other aerial manipulation tasks. The robotic arm, structured as a closed-loop kinematic chain, employs two deported servomotors. Each joint is actuated with a single tendon for active motion control in compression of the arm at the end-effector. Its elasto-mechanical design reduces weight and provides flexibility, allowing passive-compliant interactions without impacting the motors' integrity. Notably, the arm's damping can be adjusted based on the proposed inner frictional bulges. Experimental applications showcase the aerial system performance in both free-flight and physical interaction. The presented work may open safer applications for micro aerial vehicles (MAVs) in real environments subject to perturbations during interaction.
comment: 6 pages, 7 figures, ROBOSOFT 2024
WildOcc: A Benchmark for Off-Road 3D Semantic Occupancy Prediction
3D semantic occupancy prediction is an essential part of autonomous driving, focusing on capturing the geometric details of scenes. Off-road environments are rich in geometric information, so 3D semantic occupancy prediction is well suited to reconstructing such scenes. However, most research concentrates on on-road environments, and few methods are designed for off-road 3D semantic occupancy prediction due to the lack of relevant datasets and benchmarks. In response to this gap, we introduce WildOcc, to our knowledge the first benchmark to provide dense occupancy annotations for off-road 3D semantic occupancy prediction tasks. A ground truth generation pipeline is proposed in this paper, which employs a coarse-to-fine reconstruction to achieve a more realistic result. Moreover, we introduce a multi-modal 3D semantic occupancy prediction framework, which fuses spatio-temporal information from multi-frame images and point clouds at voxel level. In addition, a cross-modality distillation function is introduced, which transfers geometric knowledge from point clouds to image features.
Generalizing Motion Planners with Mixture of Experts for Autonomous Driving
Large real-world driving datasets have sparked significant research into various aspects of data-driven motion planners for autonomous driving. These include data augmentation, model architecture, reward design, training strategies, and planner pipelines. These planners promise better generalization on complicated and few-shot cases than previous methods. However, experimental results show that many of these approaches produce limited generalization abilities in planning performance due to overly complex designs or training paradigms. In this paper, we review and benchmark previous methods focusing on generalization. The experimental results indicate that as models are appropriately scaled, many design elements become redundant. We introduce StateTransformer-2 (STR2), a scalable, decoder-only motion planner that uses a Vision Transformer (ViT) encoder and a mixture-of-experts (MoE) causal Transformer architecture. The MoE backbone addresses modality collapse and reward balancing by expert routing during training. Extensive experiments on the NuPlan dataset show that our method generalizes better than previous approaches across different test sets and closed-loop simulations. Furthermore, we assess its scalability on billions of real-world urban driving scenarios, demonstrating consistent accuracy improvements as both data and model size grow.
comment: 7 pages, 3 figures
MSGField: A Unified Scene Representation Integrating Motion, Semantics, and Geometry for Robotic Manipulation
Combining accurate geometry with rich semantics has been proven to be highly effective for language-guided robotic manipulation. Existing methods for dynamic scenes either fail to update in real-time or rely on additional depth sensors for simple scene editing, limiting their applicability in the real world. In this paper, we introduce MSGField, a representation that uses a collection of 2D Gaussians for high-quality reconstruction, further enhanced with attributes to encode semantic and motion information. Specifically, we represent the motion field compactly by decomposing each primitive's motion into a combination of a limited set of motion bases. Leveraging the differentiable real-time rendering of Gaussian splatting, we can quickly optimize object motion, even for complex non-rigid motions, with image supervision from only two camera views. Additionally, we designed a pipeline that utilizes object priors to efficiently obtain well-defined semantics. In our challenging dataset, which includes flexible and extremely small objects, our method achieves a success rate of 79.2% in static and 63.3% in dynamic environments for language-guided manipulation. For specified object grasping, we achieve a success rate of 90%, on par with point cloud-based methods. Code and dataset will be released at: https://shengyu724.github.io/MSGField.github.io.
Efficient Non-Myopic Layered Bayesian Optimization For Large-Scale Bathymetric Informative Path Planning ICRA
Informative path planning (IPP) applied to bathymetric mapping allows AUVs to focus on feature-rich areas to quickly reduce uncertainty and increase mapping efficiency. Existing methods based on Bayesian optimization (BO) over Gaussian Process (GP) maps work well on small scenarios, but they are short-sighted and computationally heavy when mapping larger areas, hindering deployment in real applications. To overcome this, we present a 2-layered BO IPP method that performs non-myopic, real-time planning in a tree search fashion over large Stochastic Variational GP maps, while respecting the AUV motion constraints and accounting for localization uncertainty. Our framework outperforms the standard industrial lawn-mowing pattern and a myopic baseline in a set of hardware-in-the-loop (HIL) experiments on an embedded platform over real bathymetry.
comment: 6 pages + 1 page of references, 4 figures, submitted to International Conference on Robotics and Automation (ICRA)
Hierarchical Search-Based Cooperative Motion Planning
Cooperative path planning, a crucial aspect of multi-agent systems research, serves a variety of sectors, including military, agriculture, and industry. Many existing algorithms, however, come with certain limitations, such as simplified kinematic models and inadequate support for multiple group scenarios. Focusing on the planning problem associated with a nonholonomic Ackermann model for Unmanned Ground Vehicles (UGV), we propose a leaderless, hierarchical Search-Based Cooperative Motion Planning (SCMP) method. The high level uses a binary conflict search tree to minimize runtime, while the low level generates kinematically feasible, collision-free paths that are shape-constrained. Our algorithm can adapt to scenarios featuring multiple groups with different shapes, outlier agents, and elaborate obstacles. We conduct algorithm comparisons, performance testing, simulation, and real-world testing, verifying the effectiveness and applicability of our algorithm. The implementation of our method will be open-sourced at https://github.com/WYCUniverStar/SCMP.
PALMS: Plane-based Accessible Indoor Localization Using Mobile Smartphones
In this paper, we present PALMS, an innovative indoor global localization and relocalization system for mobile smartphones that utilizes publicly available floor plans. Unlike most vision-based methods that require constant visual input, our system adopts a dynamic form of localization that considers a single instantaneous observation and odometry data. The core contribution of this work is the introduction of a particle filter initialization method that leverages the Certainly Empty Space (CES) constraint along with principal orientation matching. This approach creates a spatial probability distribution of the device's location, significantly improving localization accuracy and reducing particle filter convergence time. Our experimental evaluations demonstrate that PALMS outperforms traditional methods with uniformly initialized particle filters, providing a more efficient and accessible approach to indoor wayfinding. By eliminating the need for prior environmental fingerprinting, PALMS provides a scalable and practical approach to indoor navigation.
comment: 7 pages, 3 figures, accepted to the 14th International Conference on Indoor Positioning and Indoor Navigation (IPIN) 2024, Best Presentation Award
RANSAC Back to SOTA: A Two-stage Consensus Filtering for Real-time 3D Registration
Correspondence-based point cloud registration (PCR) plays a key role in robotics and computer vision. However, challenges like sensor noise, object occlusions, and descriptor limitations inevitably result in numerous outliers. The RANSAC family is the most popular outlier removal solution. However, the requisite iterations escalate exponentially with the outlier ratio, rendering it far inferior to existing methods (SC2PCR [1], MAC [2], etc.) in terms of accuracy or speed. Thus, we propose a two-stage consensus filtering (TCF) that elevates RANSAC to state-of-the-art (SOTA) speed and accuracy. Firstly, one-point RANSAC obtains a consensus set based on length consistency. Subsequently, two-point RANSAC refines the set via angle consistency. Then, three-point RANSAC computes a coarse pose and removes outliers based on the distances of transformed correspondences. Drawing on optimizations from one-point and two-point RANSAC, three-point RANSAC requires only a few iterations. Eventually, iteratively reweighted least squares (IRLS) is applied to yield the optimal pose. Experiments on the large-scale KITTI and ETH datasets demonstrate our method achieves up to three-orders-of-magnitude speedup compared to MAC while maintaining registration accuracy and recall. Our code is available at https://github.com/ShiPC-AI/TCF.
comment: 8 pages, 8 figures
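The length-consistency stage in the TCF abstract above relies on the fact that a rigid transformation preserves distances between points. The sketch below shows one simple way such a one-point consensus check could look: pick a seed correspondence and keep every correspondence whose distance to the seed agrees in the source and target clouds. This is an illustrative guess at the idea, not the authors' implementation (their repository is the reference); the function names, tolerance, and seed-sampling scheme are assumptions.

```python
import numpy as np

def length_consistent_set(src, dst, seed, tol=0.05):
    """Indices of correspondences whose distance to the seed matches in both clouds."""
    d_src = np.linalg.norm(src - src[seed], axis=1)
    d_dst = np.linalg.norm(dst - dst[seed], axis=1)
    return np.flatnonzero(np.abs(d_src - d_dst) < tol)

def one_point_consensus(src, dst, num_seeds=20, tol=0.05, rng=None):
    """Try several random seed correspondences and keep the largest consistent set."""
    rng = np.random.default_rng(rng)
    n = len(src)
    best = np.array([], dtype=int)
    for seed in rng.choice(n, size=min(num_seeds, n), replace=False):
        inliers = length_consistent_set(src, dst, seed, tol)
        if len(inliers) > len(best):
            best = inliers
    return best
```

A hypothetical usage: with `src` and `dst` as (N, 3) arrays of matched points, `one_point_consensus(src, dst)` returns the indices that survive the check. Length consistency is necessary but not sufficient for a correct match, which is presumably why the abstract follows it with angle-consistency and pose-based stages.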
Reinforced Imitative Trajectory Planning for Urban Automated Driving
Reinforcement learning (RL) faces challenges in trajectory planning for urban automated driving due to the poor convergence of RL and the difficulty in designing reward functions. The convergence problem is alleviated by combining RL with supervised learning. However, most existing approaches only reason one step ahead and lack the capability to plan for multiple future steps. Besides, although inverse reinforcement learning holds promise for solving the reward function design issue, existing methods for automated driving impose a linear structure assumption on reward functions, making them difficult to apply to urban automated driving. In light of these challenges, this paper proposes a novel RL-based trajectory planning method that integrates RL with imitation learning to enable multi-step planning. Furthermore, a transformer-based Bayesian reward function is developed, providing effective reward signals for RL in urban scenarios. Moreover, a hybrid-driven trajectory planning framework is proposed to enhance safety and interpretability. The proposed methods were validated on the large-scale real-world urban automated driving nuPlan dataset. The results demonstrated the significant superiority of the proposed methods over the baselines in terms of the closed-loop metrics. The code is available at https://github.com/Zigned/nuplan_zigned.
comment: 19 pages, 9 figures
Patrol Security Game: Defending Against Adversary with Freedom in Attack Timing, Location, and Duration
We explored the Patrol Security Game (PSG), a robotic patrolling problem modeled as an extensive-form Stackelberg game, where the attacker determines the timing, location, and duration of their attack. Our objective is to devise a patrolling schedule with an infinite time horizon that minimizes the attacker's payoff. We demonstrated that PSG can be transformed into a combinatorial minimax problem with a closed-form objective function. By constraining the defender's strategy to a time-homogeneous first-order Markov chain (i.e., the patroller's next move depends solely on their current location), we proved that the optimal solution in cases of zero penalty involves either minimizing the expected hitting time or return time, depending on the attacker model, and that these solutions can be computed efficiently. Additionally, we observed that increasing the randomness in the patrol schedule reduces the attacker's expected payoff in high-penalty cases. However, the minimax problem becomes non-convex in other scenarios. To address this, we formulated a bi-criteria optimization problem incorporating two objectives: expected maximum reward and entropy. We proposed three graph-based algorithms and one deep reinforcement learning model, designed to efficiently balance the trade-off between these two objectives. Notably, the third algorithm can identify the optimal deterministic patrol schedule, though its runtime grows exponentially with the number of patrol spots. Experimental results validate the effectiveness and scalability of our solutions, demonstrating that our approaches outperform state-of-the-art baselines on both synthetic and real-world crime datasets.
comment: Under review of TCPS
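The PSG abstract above refers to the expected hitting time of a Markov-chain patrol strategy. As a minimal sketch, assuming a row-stochastic transition matrix `P` and a single target spot, the hitting times satisfy h[target] = 0 and h[i] = 1 + sum over j of P[i][j] * h[j], which can be solved by fixed-point iteration. The function name and the sweep count are hypothetical. The sketch only evaluates a given chain and does not perform the optimization described in the abstract.

```python
def expected_hitting_times(P, target, sweeps=200):
    """Expected number of steps to first reach `target` from each state.

    Assumption: `target` is reachable from every state, so the iteration converges.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(sweeps):
        h = [
            0.0 if i == target
            else 1.0 + sum(P[i][j] * h[j] for j in range(n) if j != target)
            for i in range(n)
        ]
    return h

# Uniform random patrol on three fully connected spots.
P = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
h = expected_hitting_times(P, target=2)
print(h)  # approximately [2.0, 2.0, 0.0]
```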
Development of Minimal Biorobotic Stealth Distance and Its Application in the Design of Direct-Drive Dragonfly-Inspired Aircraft
This paper introduces the Minimal Biorobotic Stealth Distance (MBSD), a novel quantitative metric to evaluate the bionic resemblance of biorobotic aircraft. Current technological limitations prevent dragonfly-inspired aircraft from achieving optimal performance at biological scales. To address these challenges, we use the DDD-1 dragonfly-inspired aircraft, a hover-capable direct-drive aircraft, to explore the impact of the MBSD on aircraft design. Key contributions of this research include: (1) the establishment of the MBSD as a quantifiable and operable evaluation metric that influences aircraft design, integrating seamlessly with the overall design process and providing a new dimension for optimizing bionic aircraft, balancing mechanical attributes and bionic characteristics; (2) the creation and analysis of a typical aircraft in four directions: essential characteristics of the MBSD, its coupling relationship with existing performance metrics (Longest Hover Duration and Maximum Instantaneous Forward Flight Speed), multi-objective optimization, and application in a typical mission scenario; (3) the construction and validation of a full-system model for the direct-drive dragonfly-inspired aircraft, demonstrating the design model's effectiveness against existing aircraft data. Detailed calculations of the MBSD consider appearance similarity, dynamic similarity, and environmental similarity.
comment: 61 pages, 32 figures
A Plug-and-Play Fully On-the-Job Real-Time Reinforcement Learning Algorithm for a Direct-Drive Tandem-Wing Experiment Platforms Under Multiple Random Operating Conditions
The nonlinear and unstable aerodynamic interference generated by the tandem wings of such biomimetic systems poses substantial challenges for motion control, especially under multiple random operating conditions. To address these challenges, the Concerto Reinforcement Learning Extension (CRL2E) algorithm has been developed. This plug-and-play, fully on-the-job, real-time reinforcement learning algorithm incorporates a novel Physics-Inspired Rule-Based Policy Composer Strategy with a Perturbation Module alongside a lightweight network optimized for real-time control. To validate the performance and the rationality of the module design, experiments were conducted under six challenging operating conditions, comparing seven different algorithms. The results demonstrate that the CRL2E algorithm achieves safe and stable training within the first 500 steps, improving tracking accuracy by 14 to 66 times compared to the Soft Actor-Critic, Proximal Policy Optimization, and Twin Delayed Deep Deterministic Policy Gradient algorithms. Additionally, CRL2E significantly enhances performance under various random operating conditions, with improvements in tracking accuracy ranging from 8.3% to 60.4% compared to the Concerto Reinforcement Learning (CRL) algorithm. The convergence speed of CRL2E is 36.11% to 57.64% faster than the CRL algorithm with only the Composer Perturbation and 43.52% to 65.85% faster than the CRL algorithm when both the Composer Perturbation and Time-Interleaved Capability Perturbation are introduced, especially in conditions where the standard CRL struggles to converge. Hardware tests indicate that the optimized lightweight network structure excels in weight loading and average inference time, meeting real-time control requirements.
comment: 63 pages, 32 figures
A Dual Process VLA: Efficient Robotic Manipulation Leveraging VLM
Vision-Language-Action (VLA) models are receiving increasing attention for their ability to enable robots to perform complex tasks by integrating visual context with linguistic commands. However, achieving efficient real-time performance remains challenging due to the high computational demands of existing models. To overcome this, we propose Dual Process VLA (DP-VLA), a hierarchical framework inspired by dual-process theory. DP-VLA utilizes a Large System 2 Model (L-Sys2) for complex reasoning and decision-making, while a Small System 1 Model (S-Sys1) handles real-time motor control and sensory processing. By leveraging Vision-Language Models (VLMs), the L-Sys2 operates at low frequencies, reducing computational overhead, while the S-Sys1 ensures fast and accurate task execution. Experimental results on the RoboCasa dataset demonstrate that DP-VLA achieves faster inference and higher task success rates, providing a scalable solution for advanced robotic applications.
comment: 10 pages
Implicit Contact Diffuser: Sequential Contact Reasoning with Latent Point Cloud Diffusion
Long-horizon contact-rich manipulation has long been a challenging problem, as it requires reasoning over both discrete contact modes and continuous object motion. We introduce Implicit Contact Diffuser (ICD), a diffusion-based model that generates a sequence of neural descriptors that specify a series of contact relationships between the object and the environment. This sequence is then used as guidance for an MPC method to accomplish a given task. The key advantage of this approach is that the latent descriptors provide more task-relevant guidance to MPC, helping to avoid local minima for contact-rich manipulation tasks. Our experiments demonstrate that ICD outperforms baselines on complex, long-horizon, contact-rich manipulation tasks, such as cable routing and notebook folding. Our experiments also indicate that ICD can generalize a target contact relationship to a different environment. More visualizations can be found on our website: https://implicit-contact-diffuser.github.io/
comment: In submission
Caging in Time: A Framework for Robust Object Manipulation under Uncertainties and Limited Robot Perception
Real-world object manipulation has been commonly challenged by physical uncertainties and perception limitations. Caging configuration-based manipulation frameworks are an effective strategy that has provided robust solutions, but they are not broadly applicable due to their strict requirements on the availability of multiple robots, widely distributed contacts, or specific geometries of the robots or the objects. To this end, this work proposes a novel concept, termed Caging in Time, to allow caging configurations to be formed even if there is just one robot engaged in a task. This novel concept can be explained by an insight that even if a caging configuration is needed to constrain the motion of an object, only a small portion of the cage is actively manipulating at a time. As such, we can switch the configuration of the robot strategically so that by collapsing its configuration in time, we will see a cage formed and its necessary portion active whenever needed. We instantiate our Caging in Time theory on challenging quasistatic and dynamic manipulation tasks, showing that Caging in Time can be achieved in general state spaces including geometry-based and energy-based spaces. With extensive experiments, we show robust and accurate manipulation, in an open-loop manner, without requiring detailed knowledge of the object geometry or physical properties, or real-time accurate feedback on the manipulation states. In addition to being an effective and robust open-loop manipulation solution, the proposed theory can be a supplementary strategy to other manipulation systems affected by uncertain or limited robot perception.
comment: 24 pages, 25 figures, video available at: www.youtube.com/watch?v=Ag_jTzazuSM
Automated Planning Domain Inference for Task and Motion Planning
Task and motion planning (TAMP) frameworks address long and complex planning problems by integrating high-level task planners with low-level motion planners. However, existing TAMP methods rely heavily on the manual design of planning domains that specify the preconditions and postconditions of all high-level actions. This paper proposes a method to automate planning domain inference from a handful of test-time trajectory demonstrations, reducing the reliance on human design. Our approach incorporates a deep learning-based estimator that predicts the appropriate components of a domain for a new task and a search algorithm that refines this prediction, reducing the size and ensuring the utility of the inferred domain. Our method is able to generate new domains from minimal demonstrations at test time, enabling robots to handle complex tasks more efficiently. We demonstrate that our approach outperforms behavior cloning baselines, which directly imitate planner behavior, in terms of planning performance and generalization across a variety of tasks. Additionally, our method reduces computational costs and data amount requirements at test time for inferring new planning domains.
comment: 8 pages, 7 figures
Agent-Based Emulation for Deploying Robot Swarm Behaviors ICRA 2025
Despite significant research, robotic swarms have yet to be useful in solving real-world problems, largely due to the difficulty of creating and controlling swarming behaviors in multi-agent systems. Traditional top-down approaches in which a desired emergent behavior is produced often require complex, resource-heavy robots, limiting their practicality. This paper introduces a bottom-up approach by employing an Embodied Agent-Based Modeling and Simulation approach, emphasizing the use of simple robots and identifying conditions that naturally lead to self-organized collective behaviors. Using the Reality-to-Simulation-to-Reality for Swarms (RSRS) process, we tightly integrate real-world experiments with simulations to reproduce known swarm behaviors and to discover a novel emergent behavior without aiming to eliminate or even reduce the sim2real gap. This paper presents the development of an Agent-Based Embodiment and Emulation process that balances the importance of running physical swarming experiments and the prohibitively time-consuming process of even setting up and running a single experiment with 20+ robots by leveraging low-fidelity lightweight simulations to enable hypothesis-formation to guide physical experiments. We demonstrate the usefulness of our methods by emulating two known behaviors from the literature and show a third behavior 'discovered' by accident.
comment: 8 pages, 6 figures, submitted to ICRA 2025
Policies with Sparse Inter-Agent Dependencies in Dynamic Games: A Dynamic Programming Approach
Common feedback strategies in multi-agent dynamic games require all players' state information to compute control strategies. However, in real-world scenarios, sensing and communication limitations between agents make full state feedback expensive or impractical, and such strategies can become fragile when state information from other agents is inaccurate. To this end, we propose a regularized dynamic programming approach for finding sparse feedback policies that selectively depend on the states of a subset of agents in dynamic games. The proposed approach solves convex adaptive group Lasso problems to compute sparse policies approximating Nash equilibrium solutions. We prove the regularized solutions' asymptotic convergence to a neighborhood of Nash equilibrium policies in linear-quadratic (LQ) games. We extend the proposed approach to general non-LQ games via an iterative algorithm. Empirical results in multi-robot interaction scenarios show that the proposed approach effectively computes feedback policies with varying sparsity levels. When agents have noisy observations of other agents' states, simulation results indicate that the proposed regularized policies consistently achieve lower costs than standard Nash equilibrium policies by up to 77% for all interacting agents whose costs are coupled with other agents' states.
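Group Lasso penalties such as the one named in the abstract above are commonly handled with a block soft-thresholding step, which is the proximal operator of the penalty lam * ||block||_2. The sketch below shows only that standard operator. It is not the paper's regularized dynamic programming solver, and the idea that each block holds one agent's policy-gain entries is an assumption made for illustration.

```python
import math

def group_soft_threshold(block, lam):
    """Proximal operator of lam * ||block||_2: shrink the block, or zero it entirely."""
    norm = math.sqrt(sum(x * x for x in block))
    if norm <= lam:
        # The whole group is removed, which is what produces group-level sparsity.
        return [0.0] * len(block)
    scale = 1.0 - lam / norm
    return [scale * x for x in block]

# Hypothetical gain blocks: one strong dependency and one weak dependency.
strong = group_soft_threshold([3.0, 4.0], lam=1.0)   # norm 5, scaled by 0.8
weak = group_soft_threshold([0.3, 0.4], lam=1.0)     # norm 0.5, zeroed out
print(strong, weak)
```

Because a weak block is set exactly to zero rather than merely reduced, the resulting policy can ignore that agent's state altogether.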
Online Optimization of Central Pattern Generators for Quadruped Locomotion IROS2024
Typical legged locomotion controllers are designed or trained offline. This is in contrast to many animals, which are able to locomote at birth, and rapidly improve their locomotion skills with few real-world interactions. Such motor control is possible through oscillatory neural networks located in the spinal cord of vertebrates, known as Central Pattern Generators (CPGs). Models of the CPG have been widely used to generate locomotion skills in robotics, but can require extensive hand-tuning or offline optimization of inter-connected parameters with genetic algorithms. In this paper, we present a framework for the online optimization of the CPG parameters through Bayesian Optimization. We show that our framework can rapidly optimize and adapt to varying velocity commands and changes in the terrain, for example to varying coefficients of friction, terrain slope angles, and added mass payloads placed on the robot. We study the effects of sensory feedback on the CPG, and find that force feedback in the phase equations and posture control (Virtual Model Control) are both beneficial for robot stability and energy efficiency. In hardware experiments on the Unitree Go1, we show rapid optimization (in under 3 minutes) and adaptation of energy-efficient gaits to varying target velocities in a variety of scenarios: varying coefficients of friction, added payloads up to 15 kg, and variable slopes up to 10 degrees. See demo at: https://youtu.be/4qq5leCI2AI
comment: Accepted by IROS2024
Integrating Reinforcement Learning with Foundation Models for Autonomous Robotics: Methods and Perspectives
Foundation models (FMs), large deep learning models pre-trained on vast, unlabeled datasets, exhibit powerful capabilities in understanding complex patterns and generating sophisticated outputs. However, they often struggle to adapt to specific tasks. Reinforcement learning (RL), which allows agents to learn through interaction and feedback, offers a compelling solution. Integrating RL with FMs enables these models to achieve desired outcomes and excel at particular tasks. Additionally, RL can be enhanced by leveraging the reasoning and generalization capabilities of FMs. This synergy is revolutionizing various fields, including robotics. FMs, rich in knowledge and generalization, provide robots with valuable information, while RL facilitates learning and adaptation through real-world interactions. This survey paper comprehensively explores this exciting intersection, examining how these paradigms can be integrated to advance robotic intelligence. We analyze the use of foundation models as action planners, the development of robotics-specific foundation models, and the mutual benefits of combining FMs with RL. Furthermore, we present a taxonomy of integration approaches, including large language models, vision-language models, diffusion models, and transformer-based RL models. We also explore how RL can utilize world representations learned from FMs to enhance robotic task execution. Our survey aims to synthesize current research and highlight key challenges in robotic reasoning and control, particularly in the context of integrating FMs and RL--two rapidly evolving technologies. By doing so, we seek to spark future research and emphasize critical areas that require further investigation to enhance robotics. We provide an updated collection of papers based on our taxonomy, accessible on our open-source project website at: https://github.com/clmoro/Robotics-RL-FMs-Integration.
comment: Submitted for publication to the Special Issue on Foundation Models and Neural-Symbolic AI for Robotics in The International Journal of Robotics Research (IJRR)
Magnetic Ball Chain Robots for Cardiac Arrhythmia Treatment
This paper introduces a novel magnetic navigation system for cardiac ablation. The system is formed from two key elements: a magnetic ablation catheter consisting of a chain of spherical permanent magnets; and an actuation system comprised of two cart-mounted permanent magnets undergoing pure rotation. The catheter design enables a large magnetic content with the goal of minimizing the footprint of the actuation system for easier integration with the clinical workflow. We present a quasi-static model of the catheter, the design of the actuation units, and their control modalities. Experimental validation shows that we can use small rotating magnets (119 mm diameter) to reach cardiac ablation targets while generating clinically-relevant forces. Catheter control using a joystick is compared with manual catheter control. While total task completion time is similar, smoother navigation is observed using the proposed robotic system. We also demonstrate that the ball chain can ablate heart tissue and generate lesions comparable to the current clinical ablation catheters.
comment: in IEEE Transactions on Medical Robotics and Bionics, 2024
Knowledge Transfer from Simple to Complex: A Safe and Efficient Reinforcement Learning Framework for Autonomous Driving Decision-Making
A safe and efficient decision-making system is crucial for autonomous vehicles. However, the complexity and variability of driving environments limit the effectiveness of many rule-based and machine learning-based decision-making approaches. Reinforcement Learning in autonomous driving offers a promising solution to these challenges. Nevertheless, concerns regarding safety and efficiency during training remain major obstacles to its widespread application. To address these concerns, we propose a novel RL framework named Simple to Complex Collaborative Decision (S2CD). First, we rapidly train the teacher model using the Proximal Policy Optimization algorithm in a lightweight simulation environment. In the more intricate simulation environment, the teacher model intervenes when the student agent exhibits suboptimal behavior by assessing the value of actions to avert dangerous situations. We also introduce an innovative RL algorithm called Adaptive Clipping PPO, which is trained using a combination of samples generated by both teacher and student policies, and employs dynamic clipping strategies based on sample importance. Additionally, we employ the KL divergence as a constraint on policy optimization, transforming it into an unconstrained problem to accelerate the student's learning of the teacher's policy. Finally, a gradual weaning strategy is employed to ensure that, over time, the student agent learns to explore independently. Simulation experiments in highway lane-change scenarios demonstrate that the S2CD framework enhances learning efficiency, reduces training costs, and significantly improves safety during training when compared with state-of-the-art baseline algorithms. This approach also ensures effective knowledge transfer between teacher and student models, even when the teacher model is suboptimal.
Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling
Predicting and executing a sequence of actions without intermediate replanning, known as action chunking, is increasingly used in robot learning from human demonstrations. Yet, its reported effects on the learned policy are inconsistent: some studies find it crucial for achieving strong results, while others observe decreased performance. In this paper, we first dissect how action chunking impacts the divergence between a learner and a demonstrator. We find that action chunking allows the learner to better capture the temporal dependencies in demonstrations but at the cost of reduced reactivity in stochastic environments. To address this tradeoff, we propose Bidirectional Decoding (BID), a test-time inference algorithm that bridges action chunking with closed-loop operations. BID samples multiple predictions at each time step and searches for the optimal one based on two criteria: (i) backward coherence, which favors samples that align with previous decisions; (ii) forward contrast, which seeks samples of high likelihood for future plans. By coupling decisions within and across action chunks, BID promotes consistency over time while maintaining reactivity to unexpected changes. Experimental results show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks. Code and videos are available at https://bid-robot.github.io.
comment: Project website: https://bid-robot.github.io/
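The backward coherence criterion mentioned in the BID abstract above can be sketched as choosing, among several sampled action chunks, the one that best agrees with the chunk predicted at the previous time step. This is a simplified illustration only. The squared-distance measure, the decay factor `rho`, and the function names are assumptions, and the forward contrast criterion is left out entirely.

```python
def backward_coherence_cost(candidate, previous, rho=0.5):
    """Discounted squared distance over the time steps both chunks cover.

    `previous` was predicted one step earlier, so previous[1:] lines up with
    the start of `candidate`.
    """
    cost = 0.0
    for k, (cand_action, prev_action) in enumerate(zip(candidate, previous[1:])):
        step = sum((a - b) ** 2 for a, b in zip(cand_action, prev_action))
        cost += (rho ** k) * step
    return cost

def pick_most_coherent(candidates, previous, rho=0.5):
    """Return the sampled chunk with the lowest backward coherence cost."""
    return min(candidates, key=lambda c: backward_coherence_cost(c, previous, rho))

# One-dimensional actions; the previous chunk planned 0 -> 1 -> 2.
previous = [(0.0,), (1.0,), (2.0,)]
candidates = [
    [(1.1,), (2.1,), (3.0,)],   # continues the earlier plan
    [(-1.0,), (0.0,), (1.0,)],  # contradicts the earlier plan
]
print(pick_most_coherent(candidates, previous))  # [(1.1,), (2.1,), (3.0,)]
```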
Data Efficient Behavior Cloning for Fine Manipulation via Continuity-based Corrective Labels IROS 2024
We consider imitation learning with access only to expert demonstrations, whose real-world application is often limited by covariate shift due to compounding errors during execution. We investigate the effectiveness of the Continuity-based Corrective Labels for Imitation Learning (CCIL) framework in mitigating this issue for real-world fine manipulation tasks. CCIL generates corrective labels by learning a locally continuous dynamics model from demonstrations to guide the agent back toward expert states. Through extensive experiments on peg insertion and fine grasping, we provide the first empirical validation that CCIL can significantly improve imitation learning performance despite discontinuities present in contact-rich manipulation. We find that: (1) real-world manipulation exhibits sufficient local smoothness to apply CCIL, (2) generated corrective labels are most beneficial in low-data regimes, and (3) label filtering based on estimated dynamics model error enables performance gains. To effectively apply CCIL to robotic domains, we offer a practical instantiation of the framework and insights into design choices and hyperparameter selection. Our work demonstrates CCIL's practicality for alleviating compounding errors in imitation learning on physical robots.
comment: Presented at IROS 2024
A Lyapunov-Based Switching Scheme for Selecting the Stable Closed-Loop Fixed Attitude-Error Quaternion During Flight
We present a switching scheme, which uses both the attitude-error quaternion (AEQ) and the angular-velocity error, for controlling the rotational degrees of freedom of an uncrewed aerial vehicle (UAV) during flight. In this approach, the proposed controller continually selects the stable closed-loop (CL) equilibrium AEQ corresponding to the smallest cost between those computed with two energy-based Lyapunov functions. To analyze and enforce the stability of the CL switching dynamics, we use basic nonlinear theory. This research problem is relevant because the selection of the stable CL equilibrium AEQ directly determines the power and energy requirements of the controlled UAV during flight. To test and demonstrate the implementation, suitability, functionality, and performance of the proposed approach, we present experimental results obtained using a 31-gram quadrotor, which was controlled to execute high-speed yaw maneuvers in flight. These flight tests show that the proposed switching controller can respectively reduce the control effort and rotational power by as much as 49.75 % and 28.14 %, on average, compared to those corresponding to an often-used benchmark controller.
comment: 8 pages, 5 figures, 2024 7th Iberian Robotics Conference (ROBOT)
Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition
Employing a teleoperation system for gathering demonstrations offers the potential for more efficient learning of robot manipulation. However, teleoperating a robot arm equipped with a dexterous hand or gripper presents inherent challenges due to the task's high dimensionality, complexity of motion, and differences between physiological structures. In this study, we introduce a novel system for joint learning between human operators and robots, that enables human operators to share control of a robot end-effector with a learned assistive agent, simplifies the data collection process, and facilitates simultaneous human demonstration collection and robot manipulation training. As data accumulates, the assistive agent gradually learns. Consequently, less human effort and attention are required, enhancing the efficiency of the data collection process. It also allows the human operator to adjust the control ratio to achieve a trade-off between manual and automated control. We conducted experiments in both simulated environments and physical real-world settings. Through user studies and quantitative evaluations, it is evident that the proposed system could enhance data collection efficiency and reduce the need for human adaptation while ensuring the collected data is of sufficient quality for downstream tasks. For more details, please refer to our webpage: https://norweig1an.github.io/HAJL.github.io/.
comment: 8 pages, 6 figures
Collaborative Goal Tracking of Multiple Mobile Robots Based on Geometric Graph Neural Network
Multiple mobile robots play a significant role in various spatially distributed tasks, highlighting the importance of collaborative path planning to enhance operational efficiency. In unfamiliar and non-repetitive scenarios, reconstructing the global map can be time-inefficient and sometimes unrealistic. Therefore, research has focused on achieving real-time collaborative planning by utilizing sensor data from multiple robots located at different positions, without relying on a global map. This paper introduces a Multi-Robot Collaborative Path Planning method based on a Geometric Graph Neural Network (MRPP-GeoGNN). First, the features of each neighboring robot's sensory data are extracted, and the relative positions of neighboring robots are integrated into each interaction layer to incorporate obstacle information along with location details. Subsequently, GeoGNN maps the amalgamated local environment features to multiple forward directions for the robot's actual movement. An expert data generation method is devised for the robot to advance step by step in the physical environment, generating different expert data in ROS to train the network. We conducted both simulations and physical experiments to validate the effectiveness of the proposed method. Simulation results demonstrate approximately a 5% improvement in accuracy compared to the model based solely on CNN using expert datasets. In the ROS simulation test, the success rate is enhanced by about 4% compared to CNN, and the flow time increase is reduced by approximately 8%, surpassing other GNN models. The physical experimental results indicate that the proposed method enables the robot to navigate successfully in the actual environment and achieve the shortest average path length compared to the benchmark method.
Optimizing BioTac Simulation for Realistic Tactile Perception IJCNN
Tactile sensing presents a promising opportunity for enhancing the interaction capabilities of today's robots. BioTac is a commonly used tactile sensor that enables robots to perceive and respond to physical tactile stimuli. However, the sensor's non-linearity poses challenges in simulating its behavior. In this paper, we first investigate a BioTac simulation that uses temperature, force, and contact point positions to predict the sensor outputs. We show that training with BioTac temperature readings does not yield accurate sensor output predictions during deployment. Consequently, we tested three alternative models, i.e., an XGBoost regressor, a neural network, and a transformer encoder. We train these models without temperature readings and provide a detailed investigation of the window size of the input vectors. We demonstrate that we achieve statistically significant improvements over the baseline network. Furthermore, our results reveal that the XGBoost regressor and transformer outperform traditional feed-forward neural networks in this task. We make all our code and results available online on https://github.com/wzaielamri/Optimizing_BioTac_Simulation.
comment: 12 pages (including appendix), Accepted at the International Joint Conference on Neural Networks (IJCNN) 2024, Yokohama, Japan. © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media... (We refer to IEEE Copyrights)
OmniRace: 6D Hand Pose Estimation for Intuitive Guidance of Racing Drone
This paper presents the OmniRace approach to controlling a racing drone with 6-degree-of-freedom (DoF) hand pose estimation and gesture recognition. To our knowledge, it is the first-ever technology that allows for low-level control of high-speed drones using gestures. OmniRace employs a gesture interface based on computer vision and a deep neural network to estimate a 6-DoF hand pose. The advanced machine learning algorithm robustly interprets human gestures, allowing users to control drone motion intuitively. Real-time control of a racing drone demonstrates the effectiveness of the system, validating its potential to revolutionize drone racing and other applications. Experimental results conducted in the Gazebo simulation environment revealed that OmniRace allows users to complete the UAV race track significantly faster (by 25.1%) and to decrease the length of the test drone path (from 102.9 to 83.7 m). Users preferred the gesture interface for attractiveness (1.57 UEQ score), hedonic quality (1.56 UEQ score), and lower perceived temporal demand (32.0 score in NASA-TLX), while noting the high efficiency (0.75 UEQ score) and low physical demand (19.0 score in NASA-TLX) of the baseline remote controller. The deep neural network attains an average accuracy of 99.75% when applied to both normalized datasets and raw datasets. OmniRace can potentially change the way humans interact with and navigate racing drones in dynamic and complex environments. The source code is available at https://github.com/SerValera/OmniRace.git.
SLR: Learning Quadruped Locomotion without Privileged Information
The recent mainstream reinforcement learning control for quadruped robots often relies on privileged information, demanding meticulous selection and precise estimation, thereby imposing constraints on the development process. This work proposes a Self-learning Latent Representation (SLR) method, which achieves high-performance control policy learning without the need for privileged information. To enhance the credibility of the proposed method's evaluation, SLR was directly compared with state-of-the-art algorithms using their open-source code repositories and original configuration parameters. Remarkably, SLR surpasses the performance of previous methods using only limited proprioceptive data, demonstrating significant potential for future applications. Ultimately, the trained policy and encoder empower the quadruped robot to traverse various challenging terrains. Videos of our results can be found on our website: https://11chens.github.io/SLR/
Bimanual Deformable Bag Manipulation Using a Structure-of-Interest Based Neural Dynamics Model
The manipulation of deformable objects by robotic systems presents a significant challenge due to their complex and infinite-dimensional configuration spaces. This paper introduces a novel approach to Deformable Object Manipulation (DOM) by emphasizing the identification and manipulation of Structures of Interest (SOIs) in deformable fabric bags. We propose a bimanual manipulation framework that leverages a Graph Neural Network (GNN)-based latent dynamics model to succinctly represent and predict the behavior of these SOIs. Our approach involves constructing a graph representation from partial point cloud data of the object and learning the latent dynamics model that effectively captures the essential deformations of the fabric bag within a reduced computational space. By integrating this latent dynamics model with Model Predictive Control (MPC), we empower robotic manipulators to perform precise and stable manipulation tasks focused on the SOIs. We have validated our framework through various empirical experiments demonstrating its efficacy in bimanual manipulation of fabric bags. Our contributions not only address the complexities inherent in DOM but also provide new perspectives and methodologies for enhancing robotic interactions with deformable objects by concentrating on their critical structural elements. Experimental videos can be obtained from https://sites.google.com/view/bagbot.
UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps
In this study, we address a gap in existing unsupervised domain adaptation approaches on LiDAR-based 3D object detection, which have predominantly concentrated on adapting between established, high-density autonomous driving datasets. We focus on sparser point clouds, capturing scenarios from different perspectives: not just from vehicles on the road but also from mobile robots on sidewalks, which encounter significantly different environmental conditions and sensor configurations. We introduce Unsupervised Adversarial Domain Adaptation for 3D Object Detection (UADA3D). UADA3D does not depend on pre-trained source models or teacher-student architectures. Instead, it uses an adversarial approach to directly learn domain-invariant features. We demonstrate its efficacy in various adaptation scenarios, showing significant improvements in both self-driving car and mobile robot domains. Our code is open-source and will be available soon.
comment: Accepted for IEEE RA-L 2024
The Art of Imitation: Learning Long-Horizon Manipulation Tasks from Few Demonstrations
Task Parametrized Gaussian Mixture Models (TP-GMM) are a sample-efficient method for learning object-centric robot manipulation tasks. However, there are several open challenges to applying TP-GMMs in the wild. In this work, we tackle three crucial challenges synergistically. First, end-effector velocities are non-Euclidean and thus hard to model using standard GMMs. We thus propose to factorize the robot's end-effector velocity into its direction and magnitude, and model them using Riemannian GMMs. Second, we leverage the factorized velocities to segment and sequence skills from complex demonstration trajectories. Through the segmentation, we further align skill trajectories and hence leverage time as a powerful inductive bias. Third, we present a method to automatically detect relevant task parameters per skill from visual observations. Our approach enables learning complex manipulation tasks from just five demonstrations while using only RGB-D observations. Extensive experimental evaluations on RLBench demonstrate that our approach achieves state-of-the-art performance with 20-fold improved sample efficiency. Our policies generalize across different environments, object instances, and object positions, while the learned skills are reusable.
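The velocity factorization mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `factorize_velocity` and `recompose_velocity` and the `eps` threshold are hypothetical, and the Riemannian GMM that would model the resulting direction (a point on the unit sphere) and magnitude is not shown.

```python
import numpy as np

def factorize_velocity(v, eps=1e-9):
    """Split a velocity vector into (unit direction, magnitude).

    Hypothetical helper for illustration; eps is an assumed threshold
    below which the direction is treated as undefined.
    """
    v = np.asarray(v, dtype=float)
    magnitude = float(np.linalg.norm(v))
    if magnitude < eps:
        # Direction is undefined at rest; return a zero vector as a placeholder.
        return np.zeros_like(v), 0.0
    return v / magnitude, magnitude

def recompose_velocity(direction, magnitude):
    """Inverse of factorize_velocity for non-zero velocities."""
    return magnitude * np.asarray(direction, dtype=float)

# Example: a 3D end-effector velocity.
direction, speed = factorize_velocity([0.3, 0.0, 0.4])
```

The direction lives on the unit sphere, which is why a standard Euclidean GMM is a poor fit for it, while the magnitude is a non-negative scalar that can be modeled separately.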
Flow Matching Imitation Learning for Multi-Support Manipulation
Humanoid robots could benefit from using their upper bodies for support contacts, enhancing their workspace, stability, and ability to perform contact-rich and pushing tasks. In this paper, we propose a unified approach that combines an optimization-based multi-contact whole-body controller with Flow Matching, a recently introduced method capable of generating multi-modal trajectory distributions for imitation learning. In simulation, we show that Flow Matching is more appropriate for robotics than Diffusion and traditional behavior cloning. On a real full-size humanoid robot (Talos), we demonstrate that our approach can learn a whole-body non-prehensile box-pushing task and that the robot can close dishwasher drawers by adding contacts with its free hand when needed for balance. We also introduce a shared autonomy mode for assisted teleoperation, providing automatic contact placement for tasks not covered in the demonstrations. Full experimental videos are available at: https://hucebot.github.io/flow_multisupport_website/
comment: 2024 IEEE-RAS 23rd International Conference on Humanoid Robots (Humanoids), Nov 2024, Nancy, France
ATI-CTLO: Adaptive Temporal Interval-based Continuous-Time LiDAR-Only Odometry
The motion distortion in LiDAR scans caused by aggressive robot motion and varying terrain features significantly impacts the positioning and mapping performance of 3D LiDAR odometry. Existing distortion correction solutions often struggle to balance computational complexity and accuracy. In this work, we propose an Adaptive Temporal Interval-based Continuous-Time LiDAR-only Odometry, utilizing straightforward and efficient linear interpolation. Our method flexibly adjusts the temporal intervals between control nodes according to the dynamics of motion and environmental characteristics. This adaptability enhances performance across various motion states and improves robustness in challenging, feature-sparse environments. We validate the effectiveness of our method on multiple datasets across different platforms, achieving accuracy comparable to state-of-the-art LiDAR-only odometry methods. Notably, in scenarios involving aggressive motion and sparse features, our method outperforms existing solutions.
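As a rough illustration of continuous-time pose interpolation between two control nodes, the sketch below interpolates translation linearly and rotation by spherical linear interpolation. It is a generic sketch under stated assumptions, not the paper's method: the function names, the (w, x, y, z) quaternion convention, and the choice of slerp for rotation are assumptions, and the adaptive selection of the temporal interval between control nodes is not shown.

```python
import numpy as np

def slerp(q0, q1, alpha):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    q0 = np.asarray(q0, dtype=float) / np.linalg.norm(q0)
    q1 = np.asarray(q1, dtype=float) / np.linalg.norm(q1)
    dot = float(np.dot(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip to follow the shorter arc.
        q1, dot = -q1, -dot
    if dot > 0.9995:
        # Nearly identical rotations: fall back to normalized linear interpolation.
        q = q0 + alpha * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1.0 - alpha) * theta) * q0 + np.sin(alpha * theta) * q1) / np.sin(theta)

def interpolate_pose(t, t0, pose0, t1, pose1):
    """Pose at time t between control nodes (t0, pose0) and (t1, pose1).

    Each pose is a (position, quaternion) pair; requires t0 < t1.
    """
    alpha = (t - t0) / (t1 - t0)
    p0, q0 = pose0
    p1, q1 = pose1
    p = (1.0 - alpha) * np.asarray(p0, dtype=float) + alpha * np.asarray(p1, dtype=float)
    return p, slerp(q0, q1, alpha)

# Example: halfway between the identity pose and a pose moved 1 m along x
# and rotated 90 degrees about z.
node0 = ([0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0])
node1 = ([1.0, 0.0, 0.0], [np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
p_mid, q_mid = interpolate_pose(0.05, 0.0, node0, 0.1, node1)
```

Evaluating such an interpolated pose at each point's timestamp is one common way to undistort a LiDAR scan; shorter intervals between control nodes track aggressive motion more closely at higher computational cost.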
HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers
Large Vision-Language-Action (VLA) models, leveraging powerful pre-trained Vision-Language Model (VLM) backends, have shown promise in robotic control due to their impressive generalization ability. However, the success comes at a cost. Their reliance on VLM backends with billions of parameters leads to high computational costs and inference latency, limiting the testing scenarios to mainly quasi-static tasks and hindering performance in dynamic tasks requiring rapid interactions. To address these limitations, this paper proposes HiRT, a Hierarchical Robot Transformer framework that enables a flexible trade-off between frequency and performance. HiRT keeps VLMs running at low frequencies to capture temporarily invariant features while enabling real-time interaction through a high-frequency vision-based policy guided by the slowly updated features. Experimental results in both simulation and real-world settings demonstrate significant improvements over baseline methods. Empirically, in static tasks, we double the control frequency and achieve comparable success rates. Additionally, on novel real-world dynamic manipulation tasks, which are challenging for previous VLA models, HiRT improves the success rate from 48% to 75%.
A New Framework for Nonlinear Kalman Filters
The Kalman filter (KF) is a state estimation algorithm that optimally combines system knowledge and measurements to minimize the mean squared error of the estimated states. While the KF was initially designed for linear systems, numerous extensions, such as the extended Kalman filter (EKF), unscented Kalman filter (UKF), and cubature Kalman filter (CKF), have been proposed for nonlinear systems. Although different types of nonlinear KFs have different pros and cons, they all use the same framework as the linear KF, which, according to the findings of this paper, tends to give overconfident and less accurate state estimates when the measurement functions are nonlinear. Therefore, in this study, we designed a new framework for nonlinear KFs and showed theoretically and empirically that the new framework estimates the states and covariance matrix more accurately than the old one. The new framework was tested on four different nonlinear KFs and five different tasks, showcasing its ability to reduce the estimation errors by several orders of magnitude in low-measurement-noise conditions, with only about a 10 to 90% increase in computational time. All types of nonlinear KFs can benefit from the new framework, and the benefit will increase as sensors become more accurate in the future. As an example, the EKF, the simplest nonlinear KF that was previously believed to work poorly for strongly nonlinear systems, can now provide fast and fairly accurate state estimates with the help of the new framework. The code is available at https://github.com/Shida-Jiang/A-new-framework-for-nonlinear-Kalman-filters.
comment: Some typo fixed
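For context on the nonlinear Kalman filter entry above, the following is a minimal sketch of the conventional EKF measurement update, i.e. the linear-KF-style framework that the abstract says existing nonlinear KFs share. It does not implement the paper's new framework, whose details are not given in the abstract. The function name `ekf_update` and the range-measurement example are illustrative assumptions.

```python
import numpy as np

def ekf_update(x, P, z, h, H_jac, R):
    """Conventional EKF measurement update (textbook form).

    x, P  : prior state mean and covariance
    z     : measurement vector
    h     : measurement function, h(x) -> predicted measurement
    H_jac : function returning the Jacobian of h evaluated at x
    R     : measurement noise covariance
    """
    H = H_jac(x)
    y = z - h(x)                        # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

# Illustrative example: a 2D position observed through a nonlinear range sensor.
h = lambda x: np.array([np.linalg.norm(x)])
H_jac = lambda x: (x / np.linalg.norm(x)).reshape(1, -1)

x_prior = np.array([3.0, 4.0])
P_prior = np.eye(2)
R = np.array([[0.01]])
z = np.array([5.5])

x_post, P_post = ekf_update(x_prior, P_prior, z, h, H_jac, R)
```

Note that the posterior covariance here depends only on the Jacobian evaluated at the prior mean, which is one place where a linearized update can become overconfident when the measurement function is strongly nonlinear.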
LiteVLoc: Map-Lite Visual Localization for Image Goal Navigation
This paper presents LiteVLoc, a hierarchical visual localization framework that uses a lightweight topo-metric map to represent the environment. The method consists of three sequential modules that estimate camera poses in a coarse-to-fine manner. Unlike mainstream approaches relying on detailed 3D representations, LiteVLoc reduces storage overhead by leveraging learning-based feature matching and geometric solvers for metric pose estimation. A novel dataset for the map-free relocalization task is also introduced. Extensive experiments, including localization and navigation in both simulated and real-world scenarios, have validated the system's performance and demonstrated its precision and efficiency for large-scale deployment. Code and data will be made publicly available.
comment: 9 pages, 4 figures
Cardiac Copilot: Automatic Probe Guidance for Echocardiography with World Model MICCAI2024
Echocardiography is the only technique capable of real-time imaging of the heart and is vital for diagnosing the majority of cardiac diseases. However, there is a severe shortage of experienced cardiac sonographers, due to the heart's complex structure and significant operational challenges. To mitigate this situation, we present a Cardiac Copilot system capable of providing real-time probe movement guidance to assist less experienced sonographers in conducting freehand echocardiography. This system can enable non-experts, especially in primary departments and medically underserved areas, to perform cardiac ultrasound examinations, potentially improving global healthcare delivery. The core innovation lies in proposing a data-driven world model, named Cardiac Dreamer, for representing cardiac spatial structures. This world model can provide structure features of any cardiac planes around the current probe position in the latent space, serving as a precise navigation map for autonomous plane localization. We train our model with real-world ultrasound data and corresponding probe motion from 110 routine clinical scans with 151K sample pairs by three certified sonographers. Evaluations on three standard planes with 37K sample pairs demonstrate that the world model can reduce navigation errors by up to 33% and exhibit more stable performance.
comment: Accepted by MICCAI2024
Data-Driven Dynamics Modeling of Miniature Robotic Blimps Using Neural ODEs With Parameter Auto-Tuning
Miniature robotic blimps, as one type of lighter-than-air aerial vehicle, have attracted increasing attention in the science and engineering community for their enhanced safety, extended endurance, and quieter operation compared to quadrotors. Accurately modeling the dynamics of these robotic blimps poses a significant challenge due to the complex aerodynamics stemming from their large lifting bodies. Traditional first-principle models have difficulty obtaining accurate aerodynamic parameters and often overlook high-order nonlinearities, and thus reach their limits in modeling the motion dynamics of miniature robotic blimps. To tackle this challenge, this letter proposes the Auto-tuning Blimp-oriented Neural Ordinary Differential Equation method (ABNODE), a data-driven approach that integrates first-principle and neural network modeling. Spiraling motion experiments of robotic blimps are conducted, comparing ABNODE with first-principle and other data-driven benchmark models, the results of which demonstrate the effectiveness of the proposed method.
comment: 8 pages, 8 figures
Trust or Bust: Ensuring Trustworthiness in Autonomous Weapon Systems
The integration of Autonomous Weapon Systems (AWS) into military operations presents both significant opportunities and challenges. This paper explores the multifaceted nature of trust in AWS, emphasising the necessity of establishing reliable and transparent systems to mitigate risks associated with bias, operational failures, and accountability. Despite advancements in Artificial Intelligence (AI), the trustworthiness of these systems, especially in high-stakes military applications, remains a critical issue. Through a systematic review of existing literature, this research identifies gaps in the understanding of trust dynamics during the development and deployment phases of AWS. It advocates for a collaborative approach that includes technologists, ethicists, and military strategists to address these ongoing challenges. The findings underscore the importance of Human-Machine teaming and enhancing system intelligibility to ensure accountability and adherence to International Humanitarian Law. Ultimately, this paper aims to contribute to the ongoing discourse on the ethical implications of AWS and the imperative for trustworthy AI in defense contexts.
comment: Accepted as a workshop paper at MILCOM 2024, 8 pages
Uncovering the Secrets of Human-Like Movement: A Fresh Perspective on Motion Planning
This article explores human-like movement from a fresh perspective on motion planning. We analyze the coordinated and compliant movement mechanisms of the human body from the perspective of biomechanics. Based on these mechanisms, we propose an optimal control framework that integrates compliant control dynamics, optimizing robotic arm motion through a response time matrix. This matrix sets the timing parameters for joint movements, turning the system into a time-parameterized optimal control problem. The model focuses on the interaction between active and passive joints under external disturbances, improving adaptability and compliance. This method achieves optimal trajectory generation and balances precision and compliance. Experimental results on both a manipulator and a humanoid robot validate the approach.
comment: 7 pages
MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation ICRA 2024
Depth perception is crucial for a wide range of robotic applications. Multi-frame self-supervised depth estimation methods have gained research interest due to their ability to leverage large-scale, unlabeled real-world data. However, the self-supervised methods often rely on the assumption of a static scene and their performance tends to degrade in dynamic environments. To address this issue, we present Motion-Aware Loss, which leverages the temporal relation among consecutive input frames and a novel distillation scheme between the teacher and student networks in the multi-frame self-supervised depth estimation methods. Specifically, we associate the spatial locations of moving objects with the temporal order of input frames to eliminate errors induced by object motion. Meanwhile, we enhance the original distillation scheme in multi-frame methods to better exploit the knowledge from a teacher network. MAL is a novel, plug-and-play module designed for seamless integration into multi-frame self-supervised monocular depth estimation methods. Adding MAL into previous state-of-the-art methods leads to a reduction in depth estimation errors by up to 4.2% and 10.8% on KITTI and CityScapes benchmarks, respectively.
comment: Accepted by ICRA 2024; Project homepage: https://yuejiangdong.github.io/MotionAwareLoss/
Behavior-Inspired Neural Networks for Relational Inference
From pedestrians to Kuramoto oscillators, interactions between agents govern how a multitude of dynamical systems evolve in space and time. Discovering how these agents relate to each other can improve our understanding of the often complex dynamics that underlie these systems. Recent works learn to categorize relationships between agents based on observations of their physical behavior. These approaches are limited in that the relationship categories are modelled as outcomes of a categorical distribution, when in real-world systems categories often intermingle and interact. In this work, we introduce a level of abstraction between the observable behavior of agents and the latent categories that determine their behavior. To do this, we learn a mapping from agent behavior to agent preferences for each latent category in a graph neural network. We integrate the physical proximity of agents and their preferences in a nonlinear opinion dynamics model, which provides a mechanism to identify mutually exclusive latent categories, predict an agent's evolution in time, and control an agent's physical behavior. We demonstrate the utility of our model for learning interpretable categories, and its efficacy on long-horizon prediction across several benchmarks where we outperform existing methods.
Tactile Displays Driven by Projected Light
Tactile displays that lend tangible form to digital content could transform computing interactions. However, achieving the resolution, speed, and dynamic range needed for perceptual fidelity remains challenging. We present a tactile display that directly converts projected light into visible tactile patterns via a photomechanical surface populated with millimeter-scale optotactile pixels. The pixels transduce incident light into mechanical displacements through photostimulated thermal gas expansion, yielding millimeter-scale displacements with response times of 2 to 100 milliseconds. Employing projected light for power transmission and addressing renders these displays highly scalable. We demonstrate devices with up to 1511 addressable pixels. Perceptual studies confirm that they can reproduce diverse spatiotemporal tactile patterns with high fidelity. This research establishes a foundation for practical, versatile high-resolution tactile displays driven by light.
Using Fiber Optic Bundles to Miniaturize Vision-Based Tactile Sensors
Vision-based tactile sensors have recently become popular due to their combination of low cost, very high spatial resolution, and ease of integration using widely available miniature cameras. The associated field of view and focal length, however, are difficult to package in a human-sized finger. In this paper, we employ optical fiber bundles to achieve a form factor that, at 15 mm diameter, is smaller than an average human fingertip. The electronics and camera are also located remotely, further reducing package size. The sensor achieves a spatial resolution of 0.22 mm and a minimum force resolution of 5 mN for normal and shear contact forces. With these attributes, the DIGIT Pinki sensor is suitable for applications such as robotic and teleoperated digital palpation. We demonstrate its utility for palpation of the prostate gland and show that it can achieve clinically relevant discrimination of prostate stiffness for phantom and ex vivo tissue.
comment: This work has been submitted to the IEEE for possible publication. The CAD design files of DIGIT Pinki are available at https://github.com/facebookresearch/digit-design
Competency-Aware Planning for Probabilistically Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can reliably distinguish among correctly classified, misclassified, and out-of-distribution (OOD) samples. We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
Multiagent Systems
IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems
As large language model (LLM) agents increasingly integrate into our infrastructure, their robust coordination and message synchronization become vital. The Byzantine Generals Problem (BGP) is a critical model for constructing resilient multi-agent systems (MAS) under adversarial attacks. It describes a scenario where malicious agents with unknown identities exist in the system; in our context, such situations could result from LLM agents' hallucinations or external attacks. In BGP, the objective of the entire system is to reach a consensus on the action to be taken. Traditional BGP requires global consensus among all agents; however, in practical scenarios, global consensus is not always necessary and can even be inefficient. Therefore, there is a pressing need to explore a refined version of BGP that aligns with the local coordination patterns observed in MAS. We refer to this refined version as Imperfect BGP (IBGP) in our research, aiming to address this discrepancy. To tackle this issue, we propose a framework that leverages consensus protocols within general MAS settings, providing provable resilience against communication attacks and adaptability to changing environments, as validated by empirical results. Additionally, we present a case study in a sensor network environment to illustrate the practical application of our protocol.
LASER: Script Execution by Autonomous Agents for On-demand Traffic Simulation
Autonomous Driving Systems (ADS) require diverse and safety-critical traffic scenarios for effective training and testing, but existing data generation methods struggle to provide flexibility and scalability. We propose LASER, a novel framework that leverages large language models (LLMs) to conduct traffic simulations based on natural language inputs. The framework operates in two stages: it first generates scripts from user-provided descriptions and then executes them using autonomous agents in real time. Validated in the CARLA simulator, LASER successfully generates complex, on-demand driving scenarios, significantly improving ADS training and testing data generation.
Spiking Neural Networks as a Controller for Emergent Swarm Agents
Drones which can swarm and loiter in a certain area cost hundreds of dollars, but mosquitos can do the same and are essentially worthless. To control swarms of low-cost robots, researchers may end up spending countless hours brainstorming robot configurations and policies to "organically" create behaviors which do not need expensive sensors and perception. Existing research explores the possible emergent behaviors in swarms of robots with only a binary sensor and a simple but hand-picked controller structure. Even agents in this highly limited sensing, actuation, and computational capability class can exhibit relatively complex global behaviors such as aggregation, milling, and dispersal, but finding the local interaction rules that enable more collective behaviors remains a significant challenge. This paper investigates the feasibility of training spiking neural networks to find those local interaction rules that result in particular emergent behaviors. In this paper, we focus on simulating a specific milling behavior already known to be producible using very simple binary sensing and acting agents. To do this, we use evolutionary algorithms to evolve not only the parameters (the weights, biases, and delays) of a spiking neural network, but also its structure. To create a baseline, we also show an evolutionary search strategy over the parameters for the incumbent hand-picked binary controller structure. Our simulations show that spiking neural networks can be evolved in binary sensing agents to form a mill.
comment: 8 pages, 7 figures, presented at the 2024 International Conference on Neuromorphic Systems
Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning
Training LLMs presents significant memory challenges due to the growing size of data, weights, and optimizer states. Techniques such as data and model parallelism, gradient checkpointing, and offloading strategies address this issue but are often infeasible due to hardware constraints. To mitigate memory usage, alternative methods like Parameter-Efficient Fine-Tuning (PEFT) and GaLore approximate weights or optimizer states. PEFT methods, such as LoRA, have gained popularity for fine-tuning LLMs, though they require a full-rank warm start. In contrast, GaLore allows full-parameter learning while being more memory-efficient. This work introduces Natural GaLore, a simple drop-in replacement for AdamW, which efficiently applies the inverse Empirical Fisher Information Matrix to low-rank gradients using Woodbury's Identity. We demonstrate that incorporating second-order information speeds up optimization significantly, especially when the iteration budget is limited. Empirical pretraining on 60M, 130M, 350M, and 1.1B parameter Llama models on C4 data demonstrates significantly lower perplexity than GaLore without additional memory overhead. By fine-tuning RoBERTa on the GLUE benchmark using Natural GaLore, we demonstrate a significant reduction in the gap to full fine-tuning (86.05% vs. 86.28%). Furthermore, fine-tuning the TinyLlama 1.1B model for function calling using the TinyAgent framework shows that Natural GaLore achieves 83.09% accuracy on the TinyAgent dataset, significantly outperforming 16-bit LoRA at 80.06% and even surpassing GPT4-Turbo by 4%, all while using 30% less memory. All code to reproduce the results is available at: https://github.com/selfsupervised-ai/Natural-GaLore.git
comment: 10 pages, 3 tables, 3 figures
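The Woodbury identity mentioned in the Natural GaLore abstract above can be illustrated with a small numerical sketch. The setup is an assumption for illustration only: the matrix G stacks r recent gradient vectors as columns, the damped matrix lam * I + G G^T stands in for an empirical Fisher estimate, and the function name `woodbury_solve` is hypothetical. The abstract does not specify how the paper forms its estimate.

```python
import numpy as np

def woodbury_solve(G, g, lam):
    """Compute (lam * I_n + G @ G.T)^{-1} @ g without forming an n x n matrix.

    Uses the Woodbury identity:
        (lam*I_n + G G^T)^{-1} = (1/lam) * (I_n - G (lam*I_r + G^T G)^{-1} G^T)
    G has shape (n, r) with r << n, so only an r x r system is solved.
    """
    r = G.shape[1]
    small = lam * np.eye(r) + G.T @ G          # r x r system
    return (g - G @ np.linalg.solve(small, G.T @ g)) / lam

# Check against the direct n x n solve on random data.
rng = np.random.default_rng(0)
n, r, lam = 50, 4, 0.1
G = rng.standard_normal((n, r))
g = rng.standard_normal(n)

fast = woodbury_solve(G, g, lam)
direct = np.linalg.solve(lam * np.eye(n) + G @ G.T, g)
```

The benefit is that the cost scales with the small rank r rather than with the full dimension n, which is what makes a second-order correction affordable in a memory-constrained optimizer.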
Analyzing Closed-loop Training Techniques for Realistic Traffic Agent Models in Autonomous Highway Driving Simulations
Simulation plays a crucial role in the rapid development and safe deployment of autonomous vehicles. Realistic traffic agent models are indispensable for bridging the gap between simulation and the real world. Many existing approaches for imitating human behavior are based on learning from demonstration. However, these approaches are often constrained by focusing on individual training strategies. Therefore, to foster a broader understanding of realistic traffic agent modeling, in this paper, we provide an extensive comparative analysis of different training principles, with a focus on closed-loop methods for highway driving simulation. We experimentally compare (i) open-loop vs. closed-loop multi-agent training, (ii) adversarial vs. deterministic supervised training, (iii) the impact of reinforcement losses, and (iv) the impact of training alongside log-replayed agents to identify suitable training techniques for realistic agent modeling. Furthermore, we identify promising combinations of different closed-loop training methods.
comment: 15 pages, 6 figures, 4 tables
FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL NeurIPS '24
Multi-agent reinforcement learning has demonstrated significant potential in addressing complex cooperative tasks across various real-world applications. However, existing MARL approaches often rely on the restrictive assumption that the number of entities (e.g., agents, obstacles) remains constant between training and inference. This overlooks scenarios where entities are dynamically removed or added during the inference trajectory, a common occurrence in real-world environments like search and rescue missions and dynamic combat situations. In this paper, we tackle the challenge of intra-trajectory dynamic entity composition under zero-shot out-of-domain (OOD) generalization, where such dynamic changes cannot be anticipated beforehand. Our empirical studies reveal that existing MARL methods suffer significant performance degradation and increased uncertainty in these scenarios. In response, we propose FlickerFusion, a novel OOD generalization method that acts as a universally applicable augmentation technique for MARL backbone methods. Our results show that FlickerFusion not only achieves superior inference rewards but also uniquely reduces uncertainty vis-à-vis the backbone, compared to existing methods. For standardized evaluation, we introduce MPEv2, an enhanced version of Multi Particle Environments (MPE), consisting of 12 benchmarks. Benchmarks, implementations, and trained models are organized and open-sourced at flickerfusion305.github.io, accompanied by ample demo video renderings.
comment: NeurIPS '24 Open-World Agents Workshop
Towards Efficient Collaboration via Graph Modeling in Reinforcement Learning
In multi-agent reinforcement learning, a commonly considered paradigm is centralized training with decentralized execution. However, in this framework, decentralized execution restricts the development of coordinated policies due to the local observation limitation. In this paper, we consider the cooperation among neighboring agents during execution and formulate their interactions as a graph. Thus, we introduce a novel encoder-decoder architecture named Factor-based Multi-Agent Transformer ($f$-MAT) that utilizes a transformer to enable the communication between neighboring agents during both training and execution. By dividing agents into different overlapping groups and representing each group with a factor, $f$-MAT fulfills efficient message passing among agents through factor-based attention layers. Empirical results on networked systems such as traffic scheduling and power control demonstrate that $f$-MAT achieves superior performance compared to strong baselines, thereby paving the way for handling complex collaborative problems.
NetSafe: Exploring the Topological Safety of Multi-agent Networks
Large language models (LLMs) have empowered nodes within multi-agent networks with intelligence, showing growing applications in both academia and industry. However, how to prevent these networks from generating malicious information remains unexplored, and previous research on the safety of a single LLM is challenging to transfer. In this paper, we focus on the safety of multi-agent networks from a topological perspective, investigating which topological properties contribute to safer networks. To this end, we propose a general framework, NetSafe, along with an iterative RelCom interaction to unify existing diverse LLM-based agent frameworks, laying the foundation for generalized topological safety research. We identify several critical phenomena when multi-agent networks are exposed to attacks involving misinformation, bias, and harmful information, termed Agent Hallucination and Aggregation Safety. Furthermore, we find that highly connected networks are more susceptible to the spread of adversarial attacks, with task performance in a Star Graph Topology decreasing by 29.7%. Besides, our proposed static metrics aligned more closely with real-world dynamic evaluations than traditional graph-theoretic metrics, indicating that networks with greater average distances from attackers exhibit enhanced safety. In conclusion, our work introduces a new topological perspective on the safety of LLM-based multi-agent networks and discovers several unreported phenomena, paving the way for future research to explore the safety of such networks.
Distributed Online Life-Long Learning (DOL3) for Multi-agent Trust and Reputation Assessment in E-commerce
Trust and Reputation Assessment of service providers in citizen-focused environments like e-commerce is vital to maintain the integrity of the interactions among agents. The goals and objectives of both the service provider and service consumer agents are relevant to the goals of the respective citizens (end users). The provider agents often pursue selfish goals that can make the service quality highly volatile, contributing towards the non-stationary nature of the environment. The number of active service providers tends to change over time resulting in an open environment. This necessitates a rapid and continual assessment of the Trust and Reputation. A large number of service providers in the environment require a distributed multi-agent Trust and Reputation assessment. This paper addresses the problem of multi-agent Trust and Reputation Assessment in a non-stationary environment involving transactions between providers and consumers. In this setting, the observer agents carry out the assessment and communicate their assessed trust scores with each other over a network. We propose a novel Distributed Online Life-Long Learning (DOL3) algorithm that involves real-time rapid learning of trust and reputation scores of providers. Each observer carries out an adaptive learning and weighted fusion process combining their own assessment along with that of their neighbour in the communication network. Simulation studies reveal that the state-of-the-art methods, which usually involve training a model to assess an agent's trust and reputation, do not work well in such an environment. The simulation results show that the proposed DOL3 algorithm outperforms these methods and effectively handles the volatility in such environments. From the statistical evaluation, it is evident that DOL3 performs better compared to other models in 90% of the cases.
Beyond Browsing: API-Based Web Agents
Web browsers are a portal to the internet, where much of human activity is undertaken. Thus, there has been significant research work in AI agents that interact with the internet through web browsing. However, there is also another interface designed specifically for machine interaction with online content: application programming interfaces (APIs). In this paper we ask -- what if we were to take tasks traditionally tackled by browsing agents, and give AI agents access to APIs? To do so, we propose two varieties of agents: (1) an API-calling agent that attempts to perform online tasks through APIs only, similar to traditional coding agents, and (2) a Hybrid Agent that can interact with online data through both web browsing and APIs. In experiments on WebArena, a widely-used and realistic benchmark for web navigation tasks, we find that API-based agents outperform web browsing agents. Hybrid Agents outperform both others nearly uniformly across tasks, yielding an absolute improvement of more than 20.0% over web browsing alone and a success rate of 35.8%, the SOTA performance among task-agnostic agents. These results strongly suggest that when APIs are available, they present an attractive alternative to relying on web browsing alone.
comment: 24 pages, 6 figures
Policies with Sparse Inter-Agent Dependencies in Dynamic Games: A Dynamic Programming Approach
Common feedback strategies in multi-agent dynamic games require all players' state information to compute control strategies. However, in real-world scenarios, sensing and communication limitations between agents make full state feedback expensive or impractical, and such strategies can become fragile when state information from other agents is inaccurate. To this end, we propose a regularized dynamic programming approach for finding sparse feedback policies that selectively depend on the states of a subset of agents in dynamic games. The proposed approach solves convex adaptive group Lasso problems to compute sparse policies approximating Nash equilibrium solutions. We prove the regularized solutions' asymptotic convergence to a neighborhood of Nash equilibrium policies in linear-quadratic (LQ) games. We extend the proposed approach to general non-LQ games via an iterative algorithm. Empirical results in multi-robot interaction scenarios show that the proposed approach effectively computes feedback policies with varying sparsity levels. When agents have noisy observations of other agents' states, simulation results indicate that the proposed regularized policies consistently achieve lower costs than standard Nash equilibrium policies by up to 77% for all interacting agents whose costs are coupled with other agents' states.
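The sparsity mechanism behind group Lasso regularization can be illustrated with a minimal sketch. It is not the paper's algorithm: it only applies the standard group soft-thresholding operator (the proximal operator of a group Lasso penalty) to a hypothetical feedback gain matrix `K`, whose column blocks are assumed to correspond to the states of different agents. The matrix values, the block layout, and the weight `lam` are all illustrative assumptions.

```python
import numpy as np

def group_soft_threshold(K, blocks, lam):
    """Proximal operator of lam * sum_j ||K[:, block_j]||_F.

    Each column block is shrunk toward zero as a whole; blocks whose
    Frobenius norm is at most lam are set exactly to zero.
    """
    K_sparse = np.zeros_like(K, dtype=float)
    for cols in blocks:
        block = K[:, cols]
        norm = np.linalg.norm(block)
        if norm > lam:
            K_sparse[:, cols] = (1.0 - lam / norm) * block
    return K_sparse

# Hypothetical gain of one player: columns 0-1 act on its own state,
# columns 2-3 and 4-5 act on the states of two other agents.
K = np.array([[1.0, 0.5, 0.05, 0.02, 0.8, 0.1],
              [0.2, 1.5, 0.01, 0.03, 0.3, 0.9]])
blocks = [slice(0, 2), slice(2, 4), slice(4, 6)]

K_sparse = group_soft_threshold(K, blocks, lam=0.2)
# Indices of the agents whose state the thresholded policy still uses.
active_blocks = [j for j, cols in enumerate(blocks)
                 if np.any(K_sparse[:, cols] != 0.0)]
```

In this toy case the weakly coupled middle block is removed entirely, so the resulting policy no longer needs that agent's state, while the other blocks are only shrunk.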
Collaborative Goal Tracking of Multiple Mobile Robots Based on Geometric Graph Neural Network
Multiple mobile robots play a significant role in various spatially distributed tasks, highlighting the importance of collaborative path planning to enhance operational efficiency. In unfamiliar and non-repetitive scenarios, reconstructing the global map can be time-inefficient and sometimes unrealistic. Therefore, research has focused on achieving real-time collaborative planning by utilizing sensor data from multiple robots located at different positions, without relying on a global map. This paper introduces a Multi-Robot Collaborative Path Planning method based on a Geometric Graph Neural Network (MRPP-GeoGNN). First, the features of each neighboring robot's sensory data are extracted, and the relative positions of neighboring robots are integrated into each interaction layer to incorporate obstacle information along with location details. Subsequently, GeoGNN maps the amalgamated local environment features to multiple forward directions for the robot's actual movement. An expert data generation method is devised for the robot to advance step by step in the physical environment, generating different expert data in ROS to train the network. We conducted both simulations and physical experiments to validate the effectiveness of the proposed method. Simulation results demonstrate approximately a 5% improvement in accuracy compared to the model based solely on CNN using expert datasets. In the ROS simulation test, the success rate is enhanced by about 4% compared to CNN, and the flow time increase is reduced by approximately 8%, surpassing other GNN models. The physical experimental results indicate that the proposed method enables the robot to navigate successfully in the actual environment and achieve the shortest average path length compared to the benchmark method.
Dynamics of Moral Behavior in Heterogeneous Populations of Learning Agents AAAI
Growing concerns about safety and alignment of AI systems highlight the importance of embedding moral capabilities in artificial agents: a promising solution is the use of learning from experience, i.e., Reinforcement Learning. In multi-agent (social) environments, complex population-level phenomena may emerge from interactions between individual learning agents. Many of the existing studies rely on simulated social dilemma environments to study the interactions of independent learning agents; however, they tend to ignore the moral heterogeneity that is likely to be present in societies of agents in practice. For example, at different points in time a single learning agent may face opponents who are consequentialist (i.e., focused on maximizing outcomes over time), norm-based (i.e., conforming to specific norms), or virtue-based (i.e., considering a combination of different virtues). The extent to which agents' co-development may be impacted by such moral heterogeneity in populations is not well understood. In this paper, we present a study of the learning dynamics of morally heterogeneous populations interacting in a social dilemma setting. Using an Iterated Prisoner's Dilemma environment with a partner selection mechanism, we investigate the extent to which the prevalence of diverse moral agents in populations affects individual agents' learning behaviors and emergent population-level outcomes. We observe several types of non-trivial interactions between pro-social and anti-social agents, and find that certain types of moral agents are able to steer selfish agents towards more cooperative behavior.
comment: Presented at AIES 2024 (7th AAAI/ACM Conference on AI, Ethics, and Society - San Jose, CA, USA) https://ojs.aaai.org/index.php/AIES/article/view/31736
TrafficGamer: Reliable and Flexible Traffic Simulation for Safety-Critical Scenarios with Game-Theoretic Oracles
While modern Autonomous Vehicle (AV) systems can develop reliable driving policies under regular traffic conditions, they frequently struggle with safety-critical traffic scenarios. This difficulty primarily arises from the rarity of such scenarios in driving datasets and the complexities associated with predictive modeling among multiple vehicles. To support the testing and refinement of AV policies, simulating safety-critical traffic events is an essential challenge to be addressed. In this work, we introduce TrafficGamer, which facilitates game-theoretic traffic simulation by viewing common road driving as a multi-agent game. In evaluating the empirical performance across various real-world datasets, TrafficGamer ensures both fidelity and exploitability of the simulated scenarios, guaranteeing that they not only statistically align with real-world traffic distribution but also efficiently capture equilibriums for representing safety-critical scenarios involving multiple agents. Additionally, the results demonstrate that TrafficGamer exhibits highly flexible simulation across various contexts. Specifically, we demonstrate that the generated scenarios can dynamically adapt to equilibriums of varying tightness by configuring risk-sensitive constraints during optimization. To the best of our knowledge, TrafficGamer is the first simulator capable of generating diverse traffic scenarios involving multiple agents. We have provided a demo webpage for the project at https://qiaoguanren.github.io/trafficgamer-demo/.
A Simulation Environment for the Neuroevolution of Ant Colony Dynamics
We introduce a simulation environment to facilitate research into emergent collective behaviour, with a focus on replicating the dynamics of ant colonies. By leveraging real-world data, the environment simulates a target ant trail that a controllable agent must learn to replicate, using sensory data observed by the target ant. This work aims to contribute to the neuroevolution of models for collective behaviour, focusing on evolving neural architectures that encode domain-specific behaviours in the network topology. By evolving models that can be modified and studied in a controlled environment, we can uncover the necessary conditions required for collective behaviours to emerge. We hope this environment will be useful to those studying the role of interactions in emergent behaviour within collective systems.
comment: Accepted for publication at The 2024 Conference on Artificial Life. 2 page extended abstract
Simulating the Economic Impact of Rationality through Reinforcement Learning and Agent-Based Modelling
Agent-based models (ABMs) are simulation models used in economics to overcome some of the limitations of traditional frameworks based on general equilibrium assumptions. However, agents within an ABM follow predetermined 'bounded rational' behavioural rules which can be cumbersome to design and difficult to justify. Here we leverage multi-agent reinforcement learning (RL) to expand the capabilities of ABMs with the introduction of 'fully rational' agents that learn their policy by interacting with the environment and maximising a reward function. Specifically, we propose a 'Rational macro ABM' (R-MABM) framework by extending a paradigmatic macro ABM from the economic literature. We show that gradually substituting ABM firms in the model with RL agents, trained to maximise profits, allows for studying the impact of rationality on the economy. We find that RL agents spontaneously learn three distinct strategies for maximising profits, with the optimal strategy depending on the level of market competition and rationality. We also find that RL agents with independent policies, and without the ability to communicate with each other, spontaneously learn to segregate into different strategic groups, thus increasing market power and overall profits. Finally, we find that a higher number of rational (RL) agents in the economy always improves the macroeconomic environment as measured by total output. Depending on the specific rational policy, this can come at the cost of higher instability. Our R-MABM framework allows for stable multi-agent learning, is available in open source, and represents a principled and robust direction to extend economic simulators.
comment: 9 pages, 4 figures
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
Remarkable progress has been made on automated problem solving through societies of agents based on large language models (LLMs). Existing LLM-based multi-agent systems can already solve simple dialogue tasks. Solutions to more complex tasks, however, are complicated through logic inconsistencies due to cascading hallucinations caused by naively chaining LLMs. Here we introduce MetaGPT, an innovative meta-programming framework incorporating efficient human workflows into LLM-based multi-agent collaborations. MetaGPT encodes Standardized Operating Procedures (SOPs) into prompt sequences for more streamlined workflows, thus allowing agents with human-like domain expertise to verify intermediate results and reduce errors. MetaGPT utilizes an assembly line paradigm to assign diverse roles to various agents, efficiently breaking down complex tasks into subtasks involving many agents working together. On collaborative software engineering benchmarks, MetaGPT generates more coherent solutions than previous chat-based multi-agent systems. Our project can be found at https://github.com/geekan/MetaGPT
Systems and Control (CS)
Nonlinear Magnetics Model for Permanent Magnet Synchronous Machines Capturing Saturation and Temperature Effects
This paper proposes a nonlinear magnetics model for Permanent Magnet Synchronous Machines (PMSMs) that accurately captures the effects of magnetic saturation in the machine iron and variations in rotor temperature on the permanent magnet excitation. The proposed model considers the permanent magnet as a current source rather than the more commonly used flux-linkage source. A comparison of the two modelling approaches is conducted using Finite Element Analysis (FEA) for different machine designs as well as experimental validation, where it is shown that the proposed model has substantially better accuracy. The proposed model decouples magnetic saturation and rotor temperature effects in the current/flux-linkage relationship, allowing for adaptive estimation of the PM excitation.
Managing Bandwidth: The Key to Cloud-Assisted Autonomous Driving
Prevailing wisdom asserts that one cannot rely on the cloud for critical real-time control systems like self-driving cars. We argue that we can, and must. Following the trends of increasing model sizes, improvements in hardware, and evolving mobile networks, we identify an opportunity to offload parts of time-sensitive and latency-critical compute to the cloud. Doing so requires carefully allocating bandwidth to meet strict latency SLOs, while maximizing benefit to the car.
comment: 6 pages
Spiking Neural Networks as a Controller for Emergent Swarm Agents
Drones which can swarm and loiter in a certain area cost hundreds of dollars, but mosquitos can do the same and are essentially worthless. To control swarms of low-cost robots, researchers may end up spending countless hours brainstorming robot configurations and policies to "organically" create behaviors which do not need expensive sensors and perception. Existing research explores the possible emergent behaviors in swarms of robots with only a binary sensor and a simple but hand-picked controller structure. Even agents in this highly limited sensing, actuation, and computational capability class can exhibit relatively complex global behaviors such as aggregation, milling, and dispersal, but finding the local interaction rules that enable more collective behaviors remains a significant challenge. This paper investigates the feasibility of training spiking neural networks to find those local interaction rules that result in particular emergent behaviors. In this paper, we focus on simulating a specific milling behavior already known to be producible using very simple binary sensing and acting agents. To do this, we use evolutionary algorithms to evolve not only the parameters (the weights, biases, and delays) of a spiking neural network, but also its structure. To create a baseline, we also show an evolutionary search strategy over the parameters for the incumbent hand-picked binary controller structure. Our simulations show that spiking neural networks can be evolved in binary sensing agents to form a mill.
comment: 8 pages, 7 figures, presented at the 2024 International Conference on Neuromorphic Systems
Fast Physics-Informed Model Predictive Control Approximation for Lyapunov Stability
At the forefront of control techniques is Model Predictive Control (MPC). While MPCs are effective, their need to recompute an optimal control for each new state results in sparse control responses and may make their implementation infeasible in small systems with low computational resources. To address these limitations in stability control, this research presents a small deterministic Physics-Informed MPC Surrogate model (PI-MPCS). PI-MPCS was developed to approximate the control computed by an MPC while encouraging stability and robustness through the integration of the system dynamics and the formation of a Lyapunov stability profile. Empirical results are presented on the task of 2D quadcopter landing. They demonstrate a rapid and precise MPC approximation on a non-linear system, along with an estimated twofold speed-up in computation when compared against an MPC. In addition, PI-MPCS displays a level of stable control for in- and out-of-distribution states, as encouraged by the discrete dynamics residual and Lyapunov stability loss functions. PI-MPCS is meant to serve as a surrogate for MPC in situations in which computational resources are limited.
Final Report for CHESS: Cloud, High-Performance Computing, and Edge for Science and Security
Automating the theory-experiment cycle requires effective distributed workflows that utilize a computing continuum spanning lab instruments, edge sensors, computing resources at multiple facilities, data sets distributed across multiple information sources, and potentially cloud. Unfortunately, the obvious methods for constructing continuum platforms, orchestrating workflow tasks, and curating datasets over time fail to achieve scientific requirements for performance, energy, security, and reliability. Furthermore, achieving the best use of continuum resources depends upon the efficient composition and execution of workflow tasks, i.e., combinations of numerical solvers, data analytics, and machine learning. Pacific Northwest National Laboratory's LDRD "Cloud, High-Performance Computing (HPC), and Edge for Science and Security" (CHESS) has developed a set of interrelated capabilities for enabling distributed scientific workflows and curating datasets. This report describes the results and successes of CHESS from the perspective of open science.
Continuum Robot Shape Estimation Using Magnetic Ball Chains
Shape sensing of medical continuum robots is important both for closed-loop control and for enabling the clinician to visualize the robot inside the body. There is a need for inexpensive but accurate shape sensing technologies. This paper proposes the use of magnetic ball chains as a means of generating shape-specific magnetic fields that can be detected by an external array of Hall effect sensors. Such a ball chain, encased in a flexible polymer sleeve, could be inserted inside the lumen of any continuum robot to provide real-time shape feedback. The sleeve could be removed, as needed, during the procedure to enable use of the entire lumen. To investigate this approach, a shape-sensing model for a steerable catheter tip is derived, and observability and sensitivity analyses are presented. Experiments show a maximum tip position estimation error of 7.1% and a mean error of 2.9%, both with respect to total length.
Lossless optimal transient control for rigid bodies in 3D space
In this letter, we propose a control scheme for rigid bodies designed to optimise transient behaviors. The search space for the optimal control input is parameterized to yield a passive, specifically lossless, nonlinear feedback controller. As a result, it can be combined with other stabilizing controllers without compromising the stability of the closed-loop system. The controller commands torques generating fictitious gyroscopic effects characteristic of 3D rotational rigid body motions, and as such neither injects kinetic energy into the system nor extracts it. We validate the controller in simulation using a model predictive control (MPC) scheme, successfully combining stability and performance in a stabilization task with obstacle avoidance constraints.
Neural Predictor for Flight Control with Payload
Aerial robots that transport suspended payloads, acting as a form of freely floating manipulator, have attracted growing interest in recent years. However, prior information about the payload, such as its mass, is hard to obtain accurately in practice. The force/torque caused by the payload and residual dynamics introduces unmodeled perturbations to the system, which negatively affect the closed-loop performance. Different from estimation-like methods, this paper proposes Neural Predictor, a learning-based approach that models the force/torque caused by the payload and residual dynamics as a dynamical system. This results in a hybrid model including both the first-principles dynamics and the learned dynamics. This hybrid model is then integrated into an MPC framework to improve closed-loop performance. The effectiveness of the proposed framework is verified extensively in both numerical simulations and real-world flight experiments. The results indicate that our approach can accurately capture the force/torque caused by the payload and residual dynamics, respond quickly to their changes, and improve the closed-loop performance significantly. In particular, Neural Predictor outperforms a state-of-the-art learning-based estimator and reduces the force and torque estimation errors by up to 66.15% and 33.33%, respectively, while using fewer samples.
comment: 8 pages
Fully distributed and resilient source seeking for robot swarms
We propose a self-contained, resilient and fully distributed solution for locating the maximum of an unknown 3D scalar field using a swarm of robots that travel at constant speeds. Unlike conventional reactive methods relying on gradient information, our methodology enables the swarm to determine an ascending direction so that it approaches the source with arbitrary precision. Our source-seeking solution consists of three algorithms. The first two algorithms run sequentially and distributively at a high frequency providing barycentric coordinates and the ascending direction respectively to the individual robots. The third algorithm is the individual control law for a robot to track the estimated ascending direction. We show that the two algorithms with higher frequency have an exponential convergence to their eventual values since they are based on the standard consensus protocol for first-order dynamical systems; their high frequency depends on how fast the robots travel through the scalar field. The robots are not constrained to any particular geometric formation, and we study both discrete and continuous distributions of robots within swarm shapes. The shape analysis reveals the resiliency of our approach as expected in robot swarms, i.e., by amassing robots we ensure the source-seeking functionality in the event of missing or misplaced individuals or even if the robot network splits into two or more disconnected subnetworks. In addition, we also enhance the robustness of the algorithm by presenting conditions for \emph{optimal} swarm shapes, in the sense that the ascending directions can be closely parallel to the field's gradient. We exploit such an analysis so that the swarm can adapt to unknown environments by morphing its shape and maneuvering while still following an ascending direction.
comment: 15 pages, submitted version to T-RO. This version does not contain the field experiments. arXiv admin note: text overlap with arXiv:2309.02937
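The standard consensus protocol for first-order systems that the abstract above builds on can be sketched in a few lines. This is a generic discrete-time illustration, not the authors' algorithm: the ring network `A`, the initial values, and the step size `eps` are hypothetical choices. Each node repeatedly moves toward its neighbours' values, and on a connected undirected graph with a small enough step all values converge exponentially to the initial average.

```python
import numpy as np

def consensus_step(x, A, eps):
    """One step of x_i <- x_i + eps * sum_j a_ij * (x_j - x_i)."""
    laplacian = np.diag(A.sum(axis=1)) - A
    return x - eps * (laplacian @ x)

# Hypothetical undirected ring of four robots (symmetric adjacency matrix).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

x = np.array([1.0, 4.0, -2.0, 5.0])   # local initial estimates
initial_mean = x.mean()

# eps must be below 1 / (maximum degree) = 0.5 for this graph.
for _ in range(200):
    x = consensus_step(x, A, eps=0.25)
```

Because the update only uses neighbour differences, the sum of the values is preserved at every step, which is why the common limit is the average of the initial estimates.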
Cryogenic Control and Readout Integrated Circuits for Solid-State Quantum Computing
In the pursuit of quantum computing, solid-state quantum systems, particularly superconducting ones, have made remarkable advancements over the past two decades. However, achieving fault-tolerant quantum computing for next-generation applications necessitates the integration of several million qubits, which presents significant challenges in terms of interconnection complexity and latency that are currently unsolvable with state-of-the-art room-temperature control and readout electronics. Recently, cryogenic integrated circuits (ICs), including CMOS radio-frequency ICs and rapid-single-flux-quantum-logic ICs, have emerged as potential alternatives to room-temperature electronics. Unlike their room-temperature counterparts, these ICs are deployed within cryostats to enhance scalability by reducing the number and length of transmission lines. Additionally, operating at cryogenic temperatures can suppress electronic noise and improve qubit control fidelity. However, for CMOS ICs specifically, circuit design uncertainties arise due to a lack of reliable models for cryogenic field effect transistors as well as issues related to severe flicker noise and power dissipation at cryogenic temperatures. This paper provides a comprehensive review of recent research on both types of cryogenic control and readout ICs but primarily focuses on the more mature CMOS technology. The discussion encompasses principles underlying control and readout techniques employed in cryogenic CMOS ICs along with their architectural designs; characterization and modeling approaches for field effect transistors under cryogenic conditions; as well as fundamental concepts pertaining to rapid single flux quantum circuits.
Robust Loop Closure by Textual Cues in Challenging Environments
Loop closure is an important task in robot navigation. However, existing methods mostly rely on some implicit or heuristic features of the environment, which can still fail to work in common environments such as corridors, tunnels, and warehouses. Indeed, navigating in such featureless, degenerative, and repetitive (FDR) environments would also pose a significant challenge even for humans, but explicit text cues in the surroundings often provide the best assistance. This inspires us to propose a multi-modal loop closure method based on explicit human-readable textual cues in FDR environments. Specifically, our approach first extracts scene text entities based on Optical Character Recognition (OCR), then creates a local map of text cues based on accurate LiDAR odometry and finally identifies loop closure events by a graph-theoretic scheme. Experiment results demonstrate that this approach has superior performance over existing methods that rely solely on visual and LiDAR sensors. To benefit the community, we release the source code and datasets at \url{https://github.com/TongxingJin/TXTLCD}.
Integration of Cobalt Ferromagnetic Control Gates for Electrical and Magnetic Manipulation of Semiconductor Quantum Dots
The rise of electron spin qubit architectures for quantum computing processors has led to a strong interest in designing and integrating ferromagnets to induce stray magnetic fields for electric dipole spin resonance (EDSR). The integration of nanomagnets, however, imposes strict layout and processing constraints, challenging the arrangement of different gating layers and the control of neighboring qubit frequencies. This work reports a successful integration of nano-sized cobalt control gates into a multi-gate FD-SOI nanowire with nanometer-scale dot-to-magnet pitch, simultaneously exploiting electrical and ferromagnetic properties of the gate stack at nanoscale. The electrical characterization of the multi-gate nanowire exhibits full field effect functionality of all ferromagnetic gates from room temperature to 10 mK, proving quantum dot formation when ferromagnets are operated as barrier gates. The front-end-of-line (FEOL) compatible gate-first integration of cobalt is examined by energy dispersive X-ray spectroscopy and high/low frequency capacitance characterization, confirming the quality of interfaces and control over material diffusion. Insights into the magnetic properties of thin films and patterned control gates are provided by vibrating sample magnetometry and electron holography measurements. Micromagnetic simulations anticipate that this structure fulfills the requirements for EDSR driving for magnetic fields higher than 1 T, where a homogeneous magnetization along the hard magnetic axis of the Co gates is expected. The FD-SOI architecture showcased in this study provides a scalable alternative to micromagnets deposited in the back-end-of-line (BEOL) and middle-of-line (MOL) processes, while bringing technological insights for the FEOL-compatible integration of Co nanostructures in spin qubit devices.
comment: 15 pages, 7 figures
A New Method For Flushing of Subsea Production Systems Prior to Decommissioning or Component Disconnection
This paper outlines a novel subsea flushing system which uses a subsea tool to improve the performance of the flushing operation. The new method outlined in this paper uses a small-diameter, high-pressure supply line and a subsea deployed tool containing a pump which recirculates the cleaning fluid through the component or system to be retrieved. The main benefit of this method when compared against conventional practices is that it allows achieving higher fluid speeds inside the subsea equipment being flushed, while injecting smaller flow rates from the surface vessel. The high fluid speeds are achieved with the recirculation pump. The higher fluid speeds ensure efficient sweeping of hydrocarbons from complex paths. A reduced flow rate from the surface vessel also allows a small diameter high pressure supply line to be used, which allows for reduced weight and storage. The study is a numerical simulation of the method applied to a subsea jumper geometry. The injection flow rates required to achieve an efficient flushing were determined from previous experimental work. Calculations were made to estimate the pressure and power requirements for performing the flushing operation as well as the design requirements for the supply line concerning dimensions, material properties and the storage space needed on the support vessel. The performance of the proposed novel system was compared to that of conventional flushing systems. As environmental concerns increase, the presented method has the potential to make the flushing process more efficient while reducing costs associated with support vessels and the materials needed. The novel system may also be deployed using a low-cost Inspection Maintenance and Repair (IMR) vessel. The subsea tool is connected to the subsea production system, either through dedicated connection ports or using pipe clamp connectors with pipe wall penetrators.
Nonlinear Bayesian Filtering with Natural Gradient Gaussian Approximation
Practical Bayes filters often assume the state distribution of each time step to be Gaussian for computational tractability, resulting in the so-called Gaussian filters. When facing nonlinear systems, Gaussian filters such as extended Kalman filter (EKF) or unscented Kalman filter (UKF) typically rely on certain linearization techniques, which can introduce large estimation errors. To address this issue, this paper reconstructs the prediction and update steps of Gaussian filtering as solutions to two distinct optimization problems, whose optimal conditions are found to have analytical forms from Stein's lemma. It is observed that the stationary point for the prediction step requires calculating the first two moments of the prior distribution, which is equivalent to that step in existing moment-matching filters. In the update step, instead of linearizing the model to approximate the stationary points, we propose an iterative approach to directly minimize the update step's objective to avoid linearization errors. For the purpose of performing the steepest descent on the Gaussian manifold, we derive its natural gradient that leverages Fisher information matrix to adjust the gradient direction, accounting for the curvature of the parameter space. Combining this update step with moment matching in the prediction step, we introduce a new iterative filter for nonlinear systems called Natural Gradient Gaussian Approximation filter, or NANO filter for short. We prove that NANO filter locally converges to the optimal Gaussian approximation at each time step. The estimation error is proven exponentially bounded for nearly linear measurement equation and low noise levels through constructing a supermartingale-like inequality across consecutive time steps.
Assisted Physical Interaction: Autonomous Aerial Robots with Neural Network Detection, Navigation, and Safety Layers
The paper introduces a novel framework for safe and autonomous aerial physical interaction in industrial settings. It comprises two main components: a neural network-based target detection system enhanced with edge computing for reduced onboard computational load, and a control barrier function (CBF)-based controller for safe and precise maneuvering. The target detection system is trained on a dataset under challenging visual conditions and evaluated for accuracy across various unseen data with changing lighting conditions. Depth features are utilized for target pose estimation, with the entire detection framework offloaded into low-latency edge computing. The CBF-based controller enables the UAV to converge safely to the target for precise contact. Simulated evaluations of both the controller and target detection are presented, alongside an analysis of real-world detection performance.
comment: 8 pages, 14 figures, ICUAS 2024
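As a rough illustration of the kind of CBF-based safety layer the abstract above mentions, the sketch below filters a nominal velocity command for a single-integrator model (x' = u). The model, the function name `cbf_filter`, and the parameter `alpha` are assumptions made for this example and are not taken from the paper. It uses the closed-form solution of the one-constraint quadratic program: minimize ||u - u_nom||^2 subject to grad_h . u + alpha * h >= 0.

```python
def cbf_filter(u_nom, grad_h, h, alpha=1.0):
    """Return the input closest to u_nom that satisfies the CBF condition
    grad_h . u + alpha * h >= 0 (single-integrator assumption, x' = u)."""
    # Value of the CBF constraint at the nominal input.
    slack = sum(g * u for g, u in zip(grad_h, u_nom)) + alpha * h
    if slack >= 0.0:
        # Nominal input is already safe: leave it unchanged.
        return list(u_nom)
    norm_sq = sum(g * g for g in grad_h)
    if norm_sq == 0.0:
        raise ValueError("constraint is infeasible: zero gradient and h < 0")
    # Project u_nom onto the boundary of the half-space defined by the constraint.
    scale = slack / norm_sq
    return [u - scale * g for u, g in zip(u_nom, grad_h)]


# Hypothetical example: keep a point at x = (2, 0) outside a unit disc at the origin,
# with h(x) = ||x||^2 - 1 = 3 and grad_h = 2x = (4, 0).
safe_u = cbf_filter(u_nom=[-5.0, 0.0], grad_h=[4.0, 0.0], h=3.0, alpha=1.0)
print(safe_u)
```

The filter only modifies the command when the nominal input would violate the barrier condition, which is why such a layer can sit behind an arbitrary nominal controller.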
Design of a Flexible Robot Arm for Safe Aerial Physical Interaction
This paper introduces a novel compliant mechanism combining low weight and energy dissipation for aerial physical interaction. Weighing 400 g at take-off, the mechanism is actuated in the forward body direction, enabling precise position control for force interaction and various other aerial manipulation tasks. The robotic arm, structured as a closed-loop kinematic chain, employs two deported servomotors. Each joint is actuated with a single tendon for active motion control in compression of the arm at the end-effector. Its elasto-mechanical design reduces weight and provides flexibility, allowing passive-compliant interactions without impacting the motors' integrity. Notably, the arm's damping can be adjusted based on the proposed inner frictional bulges. Experimental applications showcase the aerial system performance in both free-flight and physical interaction. The presented work may open safer applications for micro aerial vehicles (MAVs) in real environments subject to perturbations during interaction.
comment: 6 pages, 7 figures, ROBOSOFT 2024
SPARC: Prediction-Based Safe Control for Coupled Controllable and Uncontrollable Agents with Conformal Predictions
We investigate the problem of safe control synthesis for systems operating in environments with uncontrollable agents whose dynamics are unknown but coupled with those of the controlled system. This scenario naturally arises in various applications, such as autonomous driving and human-robot collaboration, where the behavior of uncontrollable agents, like pedestrians, cannot be directly controlled but is influenced by the actions of the autonomous vehicle or robot. In this paper, we present SPARC (Safe Prediction-Based Robust Controller for Coupled Agents), a novel framework designed to ensure safe control in the presence of coupled uncontrollable agents. SPARC leverages conformal prediction to quantify uncertainty in data-driven prediction of agent behavior. Particularly, we introduce a joint distribution-based approach to account for the coupled dynamics of the controlled system and uncontrollable agents. By integrating the control barrier function (CBF) technique, SPARC provides provable safety guarantees at a high confidence level. We illustrate our framework with a case study involving an autonomous driving scenario with walking pedestrians.
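To make the conformal prediction step above concrete, here is a minimal sketch of standard split conformal calibration, which turns held-out prediction errors into an uncertainty radius. It assumes exchangeable calibration scores and does not reproduce the joint-distribution approach for coupled agents that the abstract describes. The function name `conformal_radius` and the sample scores are hypothetical.

```python
import math


def conformal_radius(scores, alpha):
    """Split conformal quantile: a radius that covers a new score with
    probability at least 1 - alpha, assuming exchangeable scores."""
    n = len(scores)
    # Rank of the calibration score that serves as the threshold.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration samples for this confidence level.
        return math.inf
    return sorted(scores)[k - 1]


# Hypothetical calibration errors (e.g. pedestrian position prediction error in metres).
errors = [0.1, 0.4, 0.2, 0.3, 0.5, 0.25, 0.35]
radius = conformal_radius(errors, alpha=0.25)
print(radius)
```

A downstream safety constraint could then be tightened by this radius, so that it holds for every agent position within the predicted region.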
Design and Optimization of a Metamaterial Absorber for Solar Energy Harvesting in the THz Frequency Range
This paper introduces the design and comprehensive characterization of a novel three-layer metamaterial absorber, engineered to exploit the unique optical properties of gold, vanadium dioxide, and silicon dioxide. At the core of this design, silicon dioxide serves as a robust substrate that supports an intricately structured layer of gold and a top layer of vanadium dioxide. This configuration is optimized to harness and enhance absorption capabilities effectively across a broadband terahertz (THz) spectrum. The absorber demonstrates an extensive absorption bandwidth of 3.00 THz, spanning frequencies from 2.414 THz to 5.417 THz. Remarkably, throughout this range, the device maintains a consistently high absorption efficiency, exceeding 90%. This efficiency is characterized by two sharp absorption peaks located at 2.638 THz and 5.158 THz, which signify the precise tuning of the metamaterial structure to interact optimally with specific THz frequencies. The absorbance of the proposed model is almost equal to 99%. This absorber is polarization insensitive. The development of this absorber involved a series of theoretical simulations backed by experimental validations, which helped refine the metamaterial's geometry and material composition. This process illuminated the critical role of the dielectric properties of silicon dioxide and the plasmonic effects induced by gold and vanadium dioxide layers, which collectively contribute to the high-performance metrics observed.
Distributed Thompson sampling under constrained communication
In Bayesian optimization, a black-box function is maximized via the use of a surrogate model. We apply distributed Thompson sampling, using a Gaussian process as a surrogate model, to approach the multi-agent Bayesian optimization problem. In our distributed Thompson sampling implementation, each agent receives sampled points from neighbors, where the communication network is encoded in a graph; each agent utilizes a Gaussian process to model the objective function. We demonstrate a theoretical bound on Bayesian Simple Regret, where the bound depends on the size of the largest complete subgraph of the communication graph. Unlike in batch Bayesian optimization, this bound is applicable in cases where the communication graph amongst agents is constrained. When compared to sequential Thompson sampling, our bound guarantees faster convergence with respect to time as long as there is a fully connected subgraph of at least two agents. We confirm the efficacy of our algorithm with numerical simulations on traditional optimization test functions, illustrating the significance of graph connectivity on improving regret convergence.
comment: 9 pages
PEtra: A Flexible and Open-Source PE Loop Tracer for Polymer Thin-Film Transducers
Accurate characterization of ferroelectric properties in polymer piezoelectrics is critical for optimizing the performance of flexible and wearable ultrasound transducers, such as screen-printed PVDF devices. Standard charge measurement techniques, like the Sawyer-Tower circuit, often fall short when applied to ferroelectric polymers due to low-frequency leakage. In this work, we present PEtra, an open-source and versatile piezoelectric loop tracer. PEtra employs a transimpedance amplifier (LMP7721, TI) to convert picoampere-level currents into measurable voltages, covering a frequency range of 0.1 Hz to 5 Hz for a gain setting of 10^7 V/A, and 0.1 Hz to 200 Hz for gain settings from 10^3 V/A to 10^6 V/A (10-fold increments). We demonstrate through simulations and experimental validations that PEtra achieves a sensitivity down to 2 pA, effectively addressing the limitations of traditional charge measurement methods. Compared to the Sawyer-Tower circuit, PEtra directly amplifies currents without the need for a reference capacitor. As a result, it is less susceptible to leakage and can operate at lower frequencies, improving measurement accuracy and reliability. PEtra's design is fully open source, offering researchers and engineers a versatile tool to drive advancements in flexible PVDF transducer technology.
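The sketch below illustrates the arithmetic behind a current-based measurement like the one described above. A transimpedance stage outputs V = I * G, so a 2 pA current at a gain of 10^7 V/A corresponds to about 20 microvolts. It also assumes, as is common for current-based loop tracers but not stated in the abstract, that charge is recovered by numerically integrating the measured current. All function names are hypothetical.

```python
def voltage_to_current(v_out, gain_v_per_a):
    """Invert the transimpedance relation V = I * G."""
    return v_out / gain_v_per_a


def integrate_charge(times, currents):
    """Cumulative charge from sampled current, using the trapezoidal rule."""
    charge = 0.0
    trace = [0.0]
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        charge += 0.5 * (currents[k] + currents[k - 1]) * dt
        trace.append(charge)
    return trace


# Hypothetical samples: a constant 1 mV output at a gain of 1e6 V/A, i.e. 1 nA.
times = [0.0, 0.5, 1.0]
volts = [1e-3, 1e-3, 1e-3]
currents = [voltage_to_current(v, 1e6) for v in volts]
print(integrate_charge(times, currents))
```

Dividing the resulting charge by the electrode area would give a polarization estimate. That step is omitted here because the abstract does not specify the device geometry.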
Can Transformers In-Context Learn Behavior of a Linear Dynamical System?
We investigate whether transformers can learn to track a random process when given observations of a related process and parameters of the dynamical system that relates them as context. More specifically, we consider a finite-dimensional state-space model described by the state transition matrix $F$, measurement matrices $h_1, \dots, h_N$, and the process and measurement noise covariance matrices $Q$ and $R$, respectively; these parameters, randomly sampled, are provided to the transformer along with the observations $y_1,\dots,y_N$ generated by the corresponding linear dynamical system. We argue that in such settings transformers learn to approximate the celebrated Kalman filter, and empirically verify this both for the task of estimating hidden states $\hat{x}_{N|1,2,3,...,N}$ as well as for one-step prediction of the $(N+1)^{st}$ observation, $\hat{y}_{N+1|1,2,3,...,N}$. A further study of the transformer's robustness reveals that its performance is retained even if the model's parameters are partially withheld. In particular, we demonstrate that the transformer remains accurate at the considered task even in the absence of state transition and noise covariance matrices, effectively emulating operations of the Dual-Kalman filter.
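For reference, the Kalman filter that the transformer is argued to approximate can be sketched as follows. To stay dependency-free, this sketch assumes a scalar state, so F, Q, R and each measurement vector h_i reduce to numbers. The paper's setting is a general finite-dimensional state-space model. The function names and the prior (x0, p0) are assumptions for illustration.

```python
def kalman_scalar(ys, hs, f, q, r, x0=0.0, p0=1.0):
    """Filter observations y_i = h_i * x_i + v_i of the scalar system
    x_i = f * x_{i-1} + w_i, with Var(w_i) = q and Var(v_i) = r.
    Returns the filtered state estimate and its variance after the last y."""
    x, p = x0, p0
    for y, h in zip(ys, hs):
        # Prediction step: propagate mean and variance through the dynamics.
        x = f * x
        p = f * p * f + q
        # Update step: correct with the innovation y - h * x.
        s = h * p * h + r          # innovation variance
        k = p * h / s              # Kalman gain
        x = x + k * (y - h * x)
        p = (1.0 - k * h) * p
    return x, p


def predict_next_observation(x, f, h_next):
    """One-step prediction of the next observation from the filtered state."""
    return h_next * f * x


x_hat, p_hat = kalman_scalar(ys=[2.0], hs=[1.0], f=1.0, q=0.0, r=1.0)
print(x_hat, p_hat)  # 1.0 0.5
```

With equal prior and measurement variances, the estimate lands halfway between the prior mean (0) and the observation (2), and the variance halves.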
Residues in Partial Fraction Decomposition Applied to Pole Sensitivity Analysis and Root Locus Construction
The applications of the partial fraction decomposition in control and systems engineering are several. In this letter, we propose a new interpretation of residues in the partial fraction decomposition, which is employed for the following purposes: to address the pole sensitivity problem, namely to study the speed of variation of the system poles when the control parameter changes and when the system is subject to parameters variations, as well as to propose a new algorithm for the construction of the root locus. The new algorithm is proven to be more efficient in terms of execution time than the dedicated MATLAB function, while providing the same output results.
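One standard way to see how residues relate to pole sensitivity is the following sketch, which is not necessarily the formulation used in the letter. For a closed-loop characteristic equation D(s) + K N(s) = 0, implicit differentiation gives dp/dK = -N(p) / (D'(p) + K N'(p)), which is the negative of the residue of N(s) / (D(s) + K N(s)) at a simple pole p. The helper names below are hypothetical.

```python
def polyval(coeffs, s):
    """Evaluate a polynomial (highest power first) with Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * s + c
    return result


def polyder(coeffs):
    """Coefficients of the derivative polynomial (highest power first)."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])] or [0]


def pole_sensitivity(num, den, gain, pole):
    """dp/dK at a simple closed-loop pole of den(s) + gain * num(s) = 0."""
    residue = polyval(num, pole) / (
        polyval(polyder(den), pole) + gain * polyval(polyder(num), pole)
    )
    return -residue


# Example: G(s) = 1 / (s^2 + 2s) with K = 5 has closed-loop poles at -1 +/- 2j.
sens = pole_sensitivity(num=[1], den=[1, 2, 0], gain=5, pole=complex(-1, 2))
print(sens)
```

For this example the poles are p(K) = -1 + j*sqrt(K - 1), so the analytic derivative at K = 5 is j/4. This gives a quick check on the formula.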
Agent-Based Emulation for Deploying Robot Swarm Behaviors ICRA 2025
Despite significant research, robotic swarms have yet to be useful in solving real-world problems, largely due to the difficulty of creating and controlling swarming behaviors in multi-agent systems. Traditional top-down approaches in which a desired emergent behavior is produced often require complex, resource-heavy robots, limiting their practicality. This paper introduces a bottom-up approach by employing an Embodied Agent-Based Modeling and Simulation approach, emphasizing the use of simple robots and identifying conditions that naturally lead to self-organized collective behaviors. Using the Reality-to-Simulation-to-Reality for Swarms (RSRS) process, we tightly integrate real-world experiments with simulations to reproduce known swarm behaviors and to discover a novel emergent behavior, without aiming to eliminate or even reduce the sim2real gap. This paper presents the development of an Agent-Based Embodiment and Emulation process that balances the importance of running physical swarming experiments against the prohibitively time-consuming process of setting up and running even a single experiment with 20+ robots, by leveraging low-fidelity lightweight simulations to enable hypothesis formation that guides physical experiments. We demonstrate the usefulness of our methods by emulating two known behaviors from the literature and show a third behavior 'discovered' by accident.
comment: 8 pages, 6 figures, submitted to ICRA 2025
Policies with Sparse Inter-Agent Dependencies in Dynamic Games: A Dynamic Programming Approach
Common feedback strategies in multi-agent dynamic games require all players' state information to compute control strategies. However, in real-world scenarios, sensing and communication limitations between agents make full state feedback expensive or impractical, and such strategies can become fragile when state information from other agents is inaccurate. To this end, we propose a regularized dynamic programming approach for finding sparse feedback policies that selectively depend on the states of a subset of agents in dynamic games. The proposed approach solves convex adaptive group Lasso problems to compute sparse policies approximating Nash equilibrium solutions. We prove the regularized solutions' asymptotic convergence to a neighborhood of Nash equilibrium policies in linear-quadratic (LQ) games. We extend the proposed approach to general non-LQ games via an iterative algorithm. Empirical results in multi-robot interaction scenarios show that the proposed approach effectively computes feedback policies with varying sparsity levels. When agents have noisy observations of other agents' states, simulation results indicate that the proposed regularized policies consistently achieve lower costs than standard Nash equilibrium policies by up to 77% for all interacting agents whose costs are coupled with other agents' states.
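The sparsity mechanism behind group Lasso regularization can be illustrated with its proximal operator, block soft-thresholding, which shrinks each group of coefficients and sets a whole group to zero when its norm falls below the threshold. The sketch assumes each group holds the feedback gains one agent applies to another agent's state. It is not the paper's adaptive, dynamic-programming-based algorithm, and the names are hypothetical.

```python
import math


def group_soft_threshold(groups, lam):
    """Proximal operator of lam * sum_g ||v_g||_2 applied to a list of groups."""
    result = []
    for v in groups:
        norm = math.sqrt(sum(x * x for x in v))
        if norm <= lam:
            # The whole group is zeroed: the dependency on that agent is dropped.
            result.append([0.0] * len(v))
        else:
            scale = 1.0 - lam / norm
            result.append([scale * x for x in v])
    return result


# Hypothetical gain blocks: a strong dependency and a weak one.
gains = [[3.0, 4.0], [0.1, 0.2]]
print(group_soft_threshold(gains, lam=1.0))
```

Here the strong block is only shrunk, while the weak block is removed entirely. This is the effect that produces policies depending on a subset of agents.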
Advancements in Electric Vehicle Charging Optimization: A Survey of Reinforcement Learning Approaches
In response to global warming and energy shortages, there has been a significant shift towards integrating renewable energy sources, energy storage systems, and electric vehicles. Deploying electric vehicles within smart grids offers a promising solution to reduce carbon emissions. However, managing the charging and discharging of electric vehicles as distributed power supplies presents significant challenges. Additionally, the intermittent nature of renewable energy, uncertainties in electric vehicle-related parameters, fluctuating energy prices, and varying loads make maintaining stable power system operations more complex. Effective management systems for electric vehicle battery charging are crucial to coordinating these processes and ensuring a secure, efficient, and reliable power system. Reinforcement learning, enhanced by deep learning, has gained substantial interest for its model-free approach and real-time optimization, effectively managing electric vehicle charging by maximizing cumulative rewards. This review synthesizes existing literature on reinforcement learning-based frameworks, objectives, and architectures for electric vehicle charging coordination strategies in power systems, classifying methods into centralized and decentralized categories. Additionally, the article offers suggestions for future research directions to further enhance reinforcement learning-based electric vehicle charging optimization.
comment: 6 pages, 1 Figure
Magnetic Ball Chain Robots for Cardiac Arrhythmia Treatment
This paper introduces a novel magnetic navigation system for cardiac ablation. The system is formed from two key elements: a magnetic ablation catheter consisting of a chain of spherical permanent magnets; and an actuation system comprised of two cart-mounted permanent magnets undergoing pure rotation. The catheter design enables a large magnetic content with the goal of minimizing the footprint of the actuation system for easier integration with the clinical workflow. We present a quasi-static model of the catheter, the design of the actuation units, and their control modalities. Experimental validation shows that we can use small rotating magnets (119 mm diameter) to reach cardiac ablation targets while generating clinically-relevant forces. Catheter control using a joystick is compared with manual catheter control. While total task completion time is similar, smoother navigation is observed using the proposed robotic system. We also demonstrate that the ball chain can ablate heart tissue and generate lesions comparable to the current clinical ablation catheters.
comment: in IEEE Transactions on Medical Robotics and Bionics, 2024
A Lyapunov-Based Switching Scheme for Selecting the Stable Closed-Loop Fixed Attitude-Error Quaternion During Flight
We present a switching scheme, which uses both the attitude-error quaternion (AEQ) and the angular-velocity error, for controlling the rotational degrees of freedom of an uncrewed aerial vehicle (UAV) during flight. In this approach, the proposed controller continually selects the stable closed-loop (CL) equilibrium AEQ corresponding to the smallest cost between those computed with two energy-based Lyapunov functions. To analyze and enforce the stability of the CL switching dynamics, we use basic nonlinear theory. This research problem is relevant because the selection of the stable CL equilibrium AEQ directly determines the power and energy requirements of the controlled UAV during flight. To test and demonstrate the implementation, suitability, functionality, and performance of the proposed approach, we present experimental results obtained using a 31-gram quadrotor, which was controlled to execute high-speed yaw maneuvers in flight. These flight tests show that the proposed switching controller can respectively reduce the control effort and rotational power by as much as 49.75 % and 28.14 %, on average, compared to those corresponding to an often-used benchmark controller.
comment: 8 pages, 5 figures, 2024 7th Iberian Robotics Conference (ROBOT)
Wireless Resource Optimization in Hybrid Semantic/Bit Communication Networks
Recently, semantic communication (SemCom) has shown great potential in significant resource savings and efficient information exchanges, thus naturally introducing a novel and practical cellular network paradigm where two modes of SemCom and conventional bit communication (BitCom) coexist. Nevertheless, the involved wireless resource management becomes rather complicated and challenging, given the unique background knowledge matching and time-consuming semantic coding requirements in SemCom. To this end, this paper jointly investigates user association (UA), mode selection (MS), and bandwidth allocation (BA) problems in a hybrid semantic/bit communication network (HSB-Net). Concretely, we first identify a unified performance metric of message throughput for both SemCom and BitCom links. Next, we specially develop a knowledge matching-aware two-stage tandem packet queuing model and theoretically derive the average packet loss ratio and queuing latency. Combined with practical constraints, we then formulate a joint optimization problem for UA, MS, and BA to maximize the overall message throughput of HSB-Net. Afterward, we propose an optimal resource management strategy by utilizing a Lagrange primal-dual transformation method and a preference list-based heuristic algorithm with polynomial-time complexity. Numerical results not only demonstrate the accuracy of our analytical queuing model, but also validate the performance superiority of our proposed strategy compared with different benchmarks.
comment: This paper has been accepted for publication by the IEEE Transactions on Communications
Efficient MPC for Emergency Evasive Maneuvers, Part II: Comparative Assessment for Hybrid Control
Optimization-based approaches such as Model Predictive Control (MPC) are promising for proactive control in safety-critical applications with changing environments, such as automated driving systems. However, the computational complexity of the MPC optimization problem, coupled with the need for real-time control in hazardous scenarios, is the main bottleneck in realizing automation levels four and five for driving systems. In this paper, we construct hybrid formulations of the nonlinear MPC problem for tracking control during emergency evasive maneuvers and assess their computational efficiency in terms of accuracy and solution time. To hybridize the MPC problem, we combine three hybrid approximations of the prediction model and four approximations of the nonlinear stability and tire saturation constraints and simulate the closed-loop behavior of the resulting controllers during five emergency maneuvers for different prediction horizons. Further, we compare the robustness of the controllers in the presence of friction uncertainty to assess the accuracy-time trade-off in cases where the friction of the road is either unknown or has an offset error with respect to the prediction model. This robustness is studied for different levels of friction uncertainty and investigated with respect to the proximity to the vehicle handling limits. We show that the hybridization of the MPC problem is an efficient approach for real-time implementation of MPC during emergency evasive maneuvers, paving the way for implementation of high levels of automation.
comment: 13 pages, 7 figures, submitted to Journal
Efficient MPC for Emergency Evasive Maneuvers, Part I: Hybridization of the Nonlinear Problem
Despite the extensive application of nonlinear Model Predictive Control (MPC) in automated driving, balancing its computational efficiency against control performance and constraint satisfaction remains a challenge in emergency scenarios: in such situations, sub-optimal but computationally fast responses are more valuable than optimal responses obtained after long computations. In this paper, we introduce a hybridization approach for efficient approximation of nonlinear vehicle dynamics and non-convex constraints using a hybrid systems modeling framework. Hybridization allows the nonlinear MPC problem during emergency evasive maneuvers to be reformulated as a hybrid MPC problem. In this regard, Max-Min-Plus-Scaling (MMPS) hybrid modeling is used to approximate the nonlinear vehicle dynamics. Meanwhile, different formulations for constraint approximation are presented, and various grid-generation methods are compared to solve these approximation problems. Among these, two novel grid types are introduced to structurally include the influence of the system dynamics on the grid point distributions in the state domain. Overall, the work presents and compares three hybrid models and four hybrid constraints for efficient MPC synthesis and offers guidelines for implementation of the presented hybridization framework in other applications.
comment: 13 pages, 7 figures, submitted to journal
Wireless Human-Machine Collaboration in Industry 5.0
Wireless Human-Machine Collaboration (WHMC) represents a critical advancement for Industry 5.0, enabling seamless interaction between humans and machines across geographically distributed systems. As the WHMC systems become increasingly important for achieving complex collaborative control tasks, ensuring their stability is essential for practical deployment and long-term operation. Stability analysis certifies how the closed-loop system will behave under model randomness, which is essential for systems operating with wireless communications. However, the fundamental stability analysis of the WHMC systems remains an unexplored challenge due to the intricate interplay between the stochastic nature of wireless communications, dynamic human operations, and the inherent complexities of control system dynamics. This paper establishes a fundamental WHMC model incorporating dual wireless loops for machine and human control. Our framework accounts for practical factors such as short-packet transmissions, fading channels, and advanced HARQ schemes. We model human control lag as a Markov process, which is crucial for capturing the stochastic nature of human interactions. Building on this model, we propose a stochastic cycle-cost-based approach to derive a stability condition for the WHMC system, expressed in terms of wireless channel statistics, human dynamics, and control parameters. Our findings are validated through extensive numerical simulations and a proof-of-concept experiment, where we developed and tested a novel wireless collaborative cart-pole control system. The results confirm the effectiveness of our approach and provide a robust framework for future research on WHMC systems in more complex environments.
comment: This work has been submitted to the IEEE for possible publication
A New Framework for Nonlinear Kalman Filters
The Kalman filter (KF) is a state estimation algorithm that optimally combines system knowledge and measurements to minimize the mean squared error of the estimated states. While KF was initially designed for linear systems, numerous extensions of it, such as extended Kalman filter (EKF), unscented Kalman filter (UKF), cubature Kalman filter (CKF), etc., have been proposed for nonlinear systems. Although different types of nonlinear KFs have different pros and cons, they all use the same framework of linear KF, which, according to what we found in this paper, tends to give overconfident and less accurate state estimations when the measurement functions are nonlinear. Therefore, in this study, we designed a new framework for nonlinear KFs and showed theoretically and empirically that the new framework estimates the states and covariance matrix more accurately than the old one. The new framework was tested on four different nonlinear KFs and five different tasks, showcasing its ability to reduce the estimation errors by several orders of magnitude in low-measurement-noise conditions, with only about a 10 to 90% increase in computational time. All types of nonlinear KFs can benefit from the new framework, and the benefit will increase as the sensors become more and more accurate in the future. As an example, EKF, the simplest nonlinear KF that was previously believed to work poorly for strongly nonlinear systems, can now provide fast and fairly accurate state estimations with the help of the new framework. The codes are available at https://github.com/Shida-Jiang/A-new-framework-for-nonlinear-Kalman-filters.
comment: Some typo fixed
Data-Driven Dynamics Modeling of Miniature Robotic Blimps Using Neural ODEs With Parameter Auto-Tuning
Miniature robotic blimps, as one type of lighter-than-air aerial vehicle, have attracted increasing attention in the science and engineering community for their enhanced safety, extended endurance, and quieter operation compared to quadrotors. Accurately modeling the dynamics of these robotic blimps poses a significant challenge due to the complex aerodynamics stemming from their large lifting bodies. Traditional first-principle models have difficulty obtaining accurate aerodynamic parameters and often overlook high-order nonlinearities, thus reaching their limits in modeling the motion dynamics of miniature robotic blimps. To tackle this challenge, this letter proposes the Auto-tuning Blimp-oriented Neural Ordinary Differential Equation method (ABNODE), a data-driven approach that integrates first-principle and neural network modeling. Spiraling motion experiments of robotic blimps are conducted, comparing the ABNODE with first-principle and other data-driven benchmark models, the results of which demonstrate the effectiveness of the proposed method.
comment: 8 pages, 8 figures
Data-informed modeling of the formation, persistence, and evolution of social norms and conventions
Social norms and conventions are commonly accepted and adopted behaviors and practices within a social group that guide interactions -- e.g., how to spell a word or how to greet people -- and are central to a group's culture and identity. Understanding the key mechanisms that govern the formation, persistence, and evolution of social norms and conventions in social communities is a problem of paramount importance for a broad range of real-world applications, spanning from preparedness for future emergencies to promotion of sustainable practices. In the past decades, mathematical modeling has emerged as a powerful tool to reproduce and study the complex dynamics of norm and convention change, gaining insights into their mechanisms, and ultimately deriving tools to predict their evolution. The first goal of this chapter is to introduce some of the main mathematical approaches for modeling social norms and conventions, including population models and agent-based models relying on the theories of dynamical systems, evolutionary dynamics, and game theory. The second goal of the chapter is to illustrate how quantitative observations and empirical data can be incorporated into these mathematical models in a systematic manner, establishing a data-based approach to mathematical modeling of formation, persistence, and evolution of social norms and conventions. Finally, current challenges and future opportunities in this growing field of research are discussed.
comment: This is an author's (preprint) version of a book chapter that is part of the Handbook of Visual, Experimental and Computational Mathematics - Bridges through Data
Mathematical Optimization of Resolution Improvement in Structured Light data by Periodic Scanning Motion: Application for Feedback during Lunar Landing
This research explores the enhancement of lunar landing precision through an advanced structured light system, integrating machine learning, Iterative Learning Control (ILC) and Structured Illumination Microscopy (SIM) techniques. By employing Moiré fringe patterns for high-precision scanning maneuvers, the study addresses the limitations of conventional structured light systems. A nonlinear mathematical optimization model is developed to refine the world model, optimizing oscillation frequency and amplitude to improve resolution. The findings suggest that this approach can double the conventional resolution, promising significant advancements in the accuracy of lunar landings, with potential real-time application.
comment: 5 pages, 1 figure
Revisiting the Optimal PMU Placement Problem in Multi-Machine Power Networks
To provide real-time visibility of physics-based states, phasor measurement units (PMUs) are deployed throughout power networks. PMU data enable real-time grid monitoring and control -- and are essential in transitioning to smarter grids. Various considerations are taken into account when determining the geographic, optimal PMU placements (OPP). This paper focuses on the control-theoretic, observability aspect of OPP. A myriad of studies have investigated observability-based formulations to determine the OPP within a transmission network. However, they have mostly adopted a simplified representation of system dynamics, ignored basic algebraic equations that model power flows, disregarded including renewables such as solar and wind, and did not model their uncertainty. Consequently, this paper revisits the observability-based OPP problem by addressing the literature's limitations. A nonlinear differential algebraic representation (NDAE) of the power system is considered. The system is discretized using various discretization approaches while explicitly accounting for uncertainty. A moving horizon estimation approach is explored to reconstruct the joint differential and algebraic initial states of the system, as a gateway to the OPP problem which is then formulated as a computationally tractable integer program (IP). Comprehensive numerical simulations on standard power networks are conducted to validate the different aspects of this approach and test its robustness to various dynamical conditions.
Competency-Aware Planning for Probabilistically Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can accurately predict whether a sample is correctly classified, misclassified, or out-of-distribution (OOD). We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
Experimenting under Stochastic Congestion
We study randomized experiments in a service system when stochastic congestion can arise from temporarily limited supply or excess demand. Such congestion gives rise to cross-unit interference between the waiting customers, and analytic strategies that do not account for this interference may be biased. In current practice, one of the most widely used ways to address stochastic congestion is to use switchback experiments that alternately turn a target intervention on and off for the whole system. We find, however, that under a queueing model for stochastic congestion, the standard way of analyzing switchbacks is inefficient, and that estimators that leverage the queueing model can be materially more accurate. Additionally, we show how the queueing model enables estimation of total policy gradients from unit-level randomized experiments, thus giving practitioners an alternative experimental approach they can use without needing to pre-commit to a fixed switchback length before data collection.
Systems and Control (EESS)
Nonlinear Magnetics Model for Permanent Magnet Synchronous Machines Capturing Saturation and Temperature Effects
This paper proposes a nonlinear magnetics model for Permanent Magnet Synchronous Machines (PMSMs) that accurately captures the effects of magnetic saturation in the machine iron and variations in rotor temperature on the permanent magnet excitation. The proposed model considers the permanent magnet as a current source rather than the more commonly used flux-linkage source. A comparison of the two modelling approaches is conducted using Finite Element Analysis (FEA) for different machine designs as well as experimental validation, where it is shown that the proposed model has substantially better accuracy. The proposed model decouples magnetic saturation and rotor temperature effects in the current/flux-linkage relationship, allowing for adaptive estimation of the PM excitation.
Managing Bandwidth: The Key to Cloud-Assisted Autonomous Driving
Prevailing wisdom asserts that one cannot rely on the cloud for critical real-time control systems like self-driving cars. We argue that we can, and must. Following the trends of increasing model sizes, improvements in hardware, and evolving mobile networks, we identify an opportunity to offload parts of time-sensitive and latency-critical compute to the cloud. Doing so requires carefully allocating bandwidth to meet strict latency SLOs, while maximizing benefit to the car.
comment: 6 pages
Spiking Neural Networks as a Controller for Emergent Swarm Agents
Drones which can swarm and loiter in a certain area cost hundreds of dollars, but mosquitos can do the same and are essentially worthless. To control swarms of low-cost robots, researchers may end up spending countless hours brainstorming robot configurations and policies to "organically" create behaviors which do not need expensive sensors and perception. Existing research explores the possible emergent behaviors in swarms of robots with only a binary sensor and a simple but hand-picked controller structure. Even agents in this highly limited sensing, actuation, and computational capability class can exhibit relatively complex global behaviors such as aggregation, milling, and dispersal, but finding the local interaction rules that enable more collective behaviors remains a significant challenge. This paper investigates the feasibility of training spiking neural networks to find those local interaction rules that result in particular emergent behaviors. In this paper, we focus on simulating a specific milling behavior already known to be producible using very simple binary sensing and acting agents. To do this, we use evolutionary algorithms to evolve not only the parameters (the weights, biases, and delays) of a spiking neural network, but also its structure. To create a baseline, we also show an evolutionary search strategy over the parameters for the incumbent hand-picked binary controller structure. Our simulations show that spiking neural networks can be evolved in binary sensing agents to form a mill.
comment: 8 pages, 7 figures, presented at the 2024 International Conference on Neuromorphic Systems
Fast Physics-Informed Model Predictive Control Approximation for Lyapunov Stability
At the forefront of control techniques is Model Predictive Control (MPC). While MPCs are effective, their need to recompute an optimal control for each new state leads to sparse responses to the system and may make their implementation infeasible in small systems with limited computational resources. To address these limitations in stability control, this research presents a small deterministic Physics-Informed MPC Surrogate model (PI-MPCS). PI-MPCS was developed to approximate the control by an MPC while encouraging stability and robustness through the integration of the system dynamics and the formation of a Lyapunov stability profile. Empirical results are presented on the task of 2D quadcopter landing. They demonstrate a rapid and precise MPC approximation on a non-linear system along with an estimated twofold speed-up in computation compared with an MPC. In addition, PI-MPCS displays a level of stable control for in- and out-of-distribution states, as encouraged by the discrete dynamics residual and Lyapunov stability loss functions. PI-MPCS is meant to serve as a surrogate for MPC in situations where computational resources are limited.
Final Report for CHESS: Cloud, High-Performance Computing, and Edge for Science and Security
Automating the theory-experiment cycle requires effective distributed workflows that utilize a computing continuum spanning lab instruments, edge sensors, computing resources at multiple facilities, data sets distributed across multiple information sources, and potentially cloud. Unfortunately, the obvious methods for constructing continuum platforms, orchestrating workflow tasks, and curating datasets over time fail to achieve scientific requirements for performance, energy, security, and reliability. Furthermore, achieving the best use of continuum resources depends upon the efficient composition and execution of workflow tasks, i.e., combinations of numerical solvers, data analytics, and machine learning. Pacific Northwest National Laboratory's LDRD "Cloud, High-Performance Computing (HPC), and Edge for Science and Security" (CHESS) has developed a set of interrelated capabilities for enabling distributed scientific workflows and curating datasets. This report describes the results and successes of CHESS from the perspective of open science.
Continuum Robot Shape Estimation Using Magnetic Ball Chains
Shape sensing of medical continuum robots is important both for closed-loop control and for enabling the clinician to visualize the robot inside the body. There is a need for inexpensive but accurate shape sensing technologies. This paper proposes the use of magnetic ball chains as a means of generating shape-specific magnetic fields that can be detected by an external array of Hall effect sensors. Such a ball chain, encased in a flexible polymer sleeve, could be inserted inside the lumen of any continuum robot to provide real-time shape feedback. The sleeve could be removed, as needed, during the procedure to enable use of the entire lumen. To investigate this approach, a shape-sensing model for a steerable catheter tip is derived and observability and sensitivity analyses are presented. Experiments show a maximum estimation error of 7.1% and a mean error of 2.9% of the tip position with respect to total length.
Lossless optimal transient control for rigid bodies in 3D space
In this letter, we propose a control scheme for rigid bodies designed to optimise transient behaviors. The search space for the optimal control input is parameterized to yield a passive, specifically lossless, nonlinear feedback controller. As a result, it can be combined with other stabilizing controllers without compromising the stability of the closed-loop system. The controller commands torques generating fictitious gyroscopic effects characteristic of 3D rotational rigid body motions, and as such neither injects kinetic energy into the system nor extracts it. We validate the controller in simulation using a model predictive control (MPC) scheme, successfully combining stability and performance in a stabilization task with obstacle avoidance constraints.
Neural Predictor for Flight Control with Payload
Aerial robots that transport suspended payloads, acting as a form of freely floating manipulator, have attracted growing interest in recent years. However, prior information about the payload, such as its mass, is hard to obtain accurately in practice. The force/torque caused by the payload and residual dynamics introduces unmodeled perturbations to the system, which negatively affect closed-loop performance. Unlike estimation-like methods, this paper proposes Neural Predictor, a learning-based approach that models the force/torque caused by the payload and residual dynamics as a dynamical system. This results in a hybrid model that includes both the first-principles dynamics and the learned dynamics. This hybrid model is then integrated into an MPC framework to improve closed-loop performance. The effectiveness of the proposed framework is verified extensively in both numerical simulations and real-world flight experiments. The results indicate that our approach can accurately capture the force/torque caused by the payload and residual dynamics, respond quickly to changes in them, and significantly improve closed-loop performance. In particular, Neural Predictor outperforms a state-of-the-art learning-based estimator and reduces the force and torque estimation errors by up to 66.15% and 33.33%, respectively, while using fewer samples.
comment: 8 pages
Fully distributed and resilient source seeking for robot swarms
We propose a self-contained, resilient and fully distributed solution for locating the maximum of an unknown 3D scalar field using a swarm of robots that travel at constant speeds. Unlike conventional reactive methods relying on gradient information, our methodology enables the swarm to determine an ascending direction so that it approaches the source with arbitrary precision. Our source-seeking solution consists of three algorithms. The first two algorithms run sequentially and distributively at a high frequency, providing barycentric coordinates and the ascending direction respectively to the individual robots. The third algorithm is the individual control law for a robot to track the estimated ascending direction. We show that the two algorithms with higher frequency have an exponential convergence to their eventual values since they are based on the standard consensus protocol for first-order dynamical systems; their high frequency depends on how fast the robots travel through the scalar field. The robots are not constrained to any particular geometric formation, and we study both discrete and continuous distributions of robots within swarm shapes. The shape analysis reveals the resiliency of our approach as expected in robot swarms, i.e., by amassing robots we ensure the source-seeking functionality in the event of missing or misplaced individuals or even if the robot network splits into two or more disconnected subnetworks. In addition, we enhance the robustness of the algorithm by presenting conditions for optimal swarm shapes, in the sense that the ascending directions can be closely parallel to the field's gradient. We exploit such an analysis so that the swarm can adapt to unknown environments by morphing its shape and maneuvering while still following an ascending direction.
comment: 15 pages, submitted version to T-RO. This version does not contain the field experiments. arXiv admin note: text overlap with arXiv:2309.02937
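The exponential convergence mentioned in the abstract above rests on the standard first-order consensus protocol. The following is a minimal sketch of that protocol only, not of the authors' barycentric-coordinate or ascending-direction algorithms. It assumes an undirected, connected communication graph; the function names, the step size `eps`, and the 4-robot line graph are hypothetical choices made for illustration.

```python
import numpy as np

def consensus_step(x, laplacian, eps):
    """One Euler step of the consensus dynamics x_dot = -L x."""
    return x - eps * (laplacian @ x)

def run_consensus(x0, adjacency, eps=0.1, steps=200):
    """Iterate the consensus protocol on an undirected graph.

    eps is an assumed step size; it must be smaller than
    2 / (largest Laplacian eigenvalue) for the iteration to converge.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = consensus_step(x, laplacian, eps)
    return x

# Hypothetical 4-robot line graph: 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
x_final = run_consensus([1.0, 3.0, 5.0, 7.0], A)
```

On a connected undirected graph every entry of `x_final` approaches the average of the initial values, and the disagreement shrinks geometrically at a rate set by the second-smallest Laplacian eigenvalue.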
Cryogenic Control and Readout Integrated Circuits for Solid-State Quantum Computing
In the pursuit of quantum computing, solid-state quantum systems, particularly superconducting ones, have made remarkable advancements over the past two decades. However, achieving fault-tolerant quantum computing for next-generation applications necessitates the integration of several million qubits, which presents significant challenges in terms of interconnection complexity and latency that are currently unsolvable with state-of-the-art room-temperature control and readout electronics. Recently, cryogenic integrated circuits (ICs), including CMOS radio-frequency ICs and rapid-single-flux-quantum-logic ICs, have emerged as potential alternatives to room-temperature electronics. Unlike their room-temperature counterparts, these ICs are deployed within cryostats to enhance scalability by reducing the number and length of transmission lines. Additionally, operating at cryogenic temperatures can suppress electronic noise and improve qubit control fidelity. However, for CMOS ICs specifically, circuit design uncertainties arise due to a lack of reliable models for cryogenic field effect transistors as well as issues related to severe flicker noise and power dissipation at cryogenic temperatures. This paper provides a comprehensive review of recent research on both types of cryogenic control and readout ICs but primarily focuses on the more mature CMOS technology. The discussion encompasses principles underlying control and readout techniques employed in cryogenic CMOS ICs along with their architectural designs; characterization and modeling approaches for field effect transistors under cryogenic conditions; as well as fundamental concepts pertaining to rapid single flux quantum circuits.
Robust Loop Closure by Textual Cues in Challenging Environments
Loop closure is an important task in robot navigation. However, existing methods mostly rely on some implicit or heuristic features of the environment, which can still fail to work in common environments such as corridors, tunnels, and warehouses. Indeed, navigating in such featureless, degenerative, and repetitive (FDR) environments would also pose a significant challenge even for humans, but explicit text cues in the surroundings often provide the best assistance. This inspires us to propose a multi-modal loop closure method based on explicit human-readable textual cues in FDR environments. Specifically, our approach first extracts scene text entities based on Optical Character Recognition (OCR), then creates a local map of text cues based on accurate LiDAR odometry and finally identifies loop closure events by a graph-theoretic scheme. Experimental results demonstrate that this approach has superior performance over existing methods that rely solely on visual and LiDAR sensors. To benefit the community, we release the source code and datasets at https://github.com/TongxingJin/TXTLCD.
Integration of Cobalt Ferromagnetic Control Gates for Electrical and Magnetic Manipulation of Semiconductor Quantum Dots
The rise of electron spin qubit architectures for quantum computing processors has led to a strong interest in designing and integrating ferromagnets to induce stray magnetic fields for electron dipole spin resonance (EDSR). The integration of nanomagnets, however, imposes strict layout and processing constraints, challenging the arrangement of different gating layers and the control of neighboring qubit frequencies. This work reports a successful integration of nano-sized cobalt control gates into a multi-gate FD-SOI nanowire with nanometer-scale dot-to-magnet pitch, simultaneously exploiting electrical and ferromagnetic properties of the gate stack at nanoscale. The electrical characterization of the multi-gate nanowire exhibits full field effect functionality of all ferromagnetic gates from room temperature to 10 mK, proving quantum dot formation when ferromagnets are operated as barrier gates. The front-end-of-line (FEOL) compatible gate-first integration of cobalt is examined by energy dispersive X-ray spectroscopy and high/low frequency capacitance characterization, confirming the quality of interfaces and control over material diffusion. Insights into the magnetic properties of thin films and patterned control gates are provided by vibrating sample magnetometry and electron holography measurements. Micromagnetic simulations anticipate that this structure fulfills the requirements for EDSR driving for magnetic fields higher than 1 T, where a homogeneous magnetization along the hard magnetic axis of the Co gates is expected. The FD-SOI architecture showcased in this study provides a scalable alternative to micromagnets deposited in the back-end-of-line (BEOL) and middle-of-line (MOL) processes, while bringing technological insights for the FEOL-compatible integration of Co nanostructures in spin qubit devices.
comment: 15 pages, 7 figures
A New Method For Flushing of Subsea Production Systems Prior to Decommissioning or Component Disconnection
This paper outlines a novel subsea flushing system which uses a subsea tool to improve the performance of the flushing operation. The new method outlined in this paper uses a small-diameter, high-pressure supply line and a subsea deployed tool containing a pump which recirculates the cleaning fluid through the component or system to be retrieved. The main benefit of this method when compared against conventional practices is that it allows achieving higher fluid speeds inside the subsea equipment being flushed, while injecting smaller flow rates from the surface vessel. The high fluid speeds are achieved with the recirculation pump. The higher fluid speeds ensure efficient sweeping of hydrocarbons from complex paths. A reduced flow rate from the surface vessel also allows a small diameter high pressure supply line to be used, which allows for reduced weight and storage. The study is a numerical simulation of the method applied to a subsea jumper geometry. The injection flow rates required to achieve an efficient flushing were determined from previous experimental work. Calculations were made to estimate the pressure and power requirements for performing the flushing operation as well as the design requirements for the supply line concerning dimensions, material properties and the storage space needed on the support vessel. The performance of the proposed novel system was compared to that of conventional flushing systems. As environmental concerns increase, the presented method has the potential to make the flushing process more efficient while reducing costs associated with support vessels and the materials needed. The novel system may also be deployed using a low-cost Inspection Maintenance and Repair (IMR) vessel. The subsea tool is connected to the subsea production system, either through dedicated connection ports or using pipe clamp connectors with pipe wall penetrators.
Nonlinear Bayesian Filtering with Natural Gradient Gaussian Approximation
Practical Bayes filters often assume the state distribution of each time step to be Gaussian for computational tractability, resulting in the so-called Gaussian filters. When facing nonlinear systems, Gaussian filters such as the extended Kalman filter (EKF) or unscented Kalman filter (UKF) typically rely on certain linearization techniques, which can introduce large estimation errors. To address this issue, this paper reconstructs the prediction and update steps of Gaussian filtering as solutions to two distinct optimization problems, whose optimality conditions are found to have analytical forms from Stein's lemma. It is observed that the stationary point for the prediction step requires calculating the first two moments of the prior distribution, which is equivalent to that step in existing moment-matching filters. In the update step, instead of linearizing the model to approximate the stationary points, we propose an iterative approach to directly minimize the update step's objective to avoid linearization errors. For the purpose of performing the steepest descent on the Gaussian manifold, we derive its natural gradient, which leverages the Fisher information matrix to adjust the gradient direction, accounting for the curvature of the parameter space. Combining this update step with moment matching in the prediction step, we introduce a new iterative filter for nonlinear systems called the Natural Gradient Gaussian Approximation filter, or NANO filter for short. We prove that the NANO filter locally converges to the optimal Gaussian approximation at each time step. The estimation error is proven exponentially bounded for nearly linear measurement equations and low noise levels through constructing a supermartingale-like inequality across consecutive time steps.
Assisted Physical Interaction: Autonomous Aerial Robots with Neural Network Detection, Navigation, and Safety Layers
The paper introduces a novel framework for safe and autonomous aerial physical interaction in industrial settings. It comprises two main components: a neural network-based target detection system enhanced with edge computing for reduced onboard computational load, and a control barrier function (CBF)-based controller for safe and precise maneuvering. The target detection system is trained on a dataset under challenging visual conditions and evaluated for accuracy across various unseen data with changing lighting conditions. Depth features are utilized for target pose estimation, with the entire detection framework offloaded into low-latency edge computing. The CBF-based controller enables the UAV to converge safely to the target for precise contact. Simulated evaluations of both the controller and target detection are presented, alongside an analysis of real-world detection performance.
comment: 8 pages, 14 figures, ICUAS 2024
Design of a Flexible Robot Arm for Safe Aerial Physical Interaction
This paper introduces a novel compliant mechanism combining light weight and energy dissipation for aerial physical interaction. Weighing 400 g at take-off, the mechanism is actuated in the forward body direction, enabling precise position control for force interaction and various other aerial manipulation tasks. The robotic arm, structured as a closed-loop kinematic chain, employs two deported servomotors. Each joint is actuated with a single tendon for active motion control in compression of the arm at the end-effector. Its elasto-mechanical design reduces weight and provides flexibility, allowing passive-compliant interactions without impacting the motors' integrity. Notably, the arm's damping can be adjusted based on the proposed inner frictional bulges. Experimental applications showcase the aerial system performance in both free-flight and physical interaction. The presented work may open safer applications for micro aerial vehicles (MAVs) in real environments subject to perturbations during interaction.
comment: 6 pages, 7 figures, ROBOSOFT 2024
SPARC: Prediction-Based Safe Control for Coupled Controllable and Uncontrollable Agents with Conformal Predictions
We investigate the problem of safe control synthesis for systems operating in environments with uncontrollable agents whose dynamics are unknown but coupled with those of the controlled system. This scenario naturally arises in various applications, such as autonomous driving and human-robot collaboration, where the behavior of uncontrollable agents, like pedestrians, cannot be directly controlled but is influenced by the actions of the autonomous vehicle or robot. In this paper, we present SPARC (Safe Prediction-Based Robust Controller for Coupled Agents), a novel framework designed to ensure safe control in the presence of coupled uncontrollable agents. SPARC leverages conformal prediction to quantify uncertainty in data-driven prediction of agent behavior. Particularly, we introduce a joint distribution-based approach to account for the coupled dynamics of the controlled system and uncontrollable agents. By integrating the control barrier function (CBF) technique, SPARC provides provable safety guarantees at a high confidence level. We illustrate our framework with a case study involving an autonomous driving scenario with walking pedestrians.
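The conformal prediction step mentioned in the abstract above can be illustrated with the standard split-conformal quantile. This is a sketch of the generic technique only; SPARC's joint distribution-based treatment of coupled agents is not reproduced here. The function name, the calibration errors, and the choice of `alpha` are hypothetical.

```python
import math
import numpy as np

def conformal_radius(cal_errors, alpha):
    """Split-conformal bound on a prediction error.

    Given n calibration errors (e.g. distances between predicted and
    observed pedestrian positions), return the ceil((n + 1)(1 - alpha))-th
    smallest one. If the calibration errors and a new error are
    exchangeable, the new error is at most this value with probability
    at least 1 - alpha.
    """
    scores = np.sort(np.asarray(cal_errors, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration samples for this confidence level.
        return math.inf
    return float(scores[k - 1])

# Hypothetical calibration errors in metres
radius = conformal_radius([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7], alpha=0.25)
```

A safety filter could then treat a ball of this radius around each predicted agent position as the region to avoid; how such a region enters the CBF constraint is specific to the paper and is not shown here.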
Design and Optimization of a Metamaterial Absorber for Solar Energy Harvesting in the THz Frequency Range
This paper introduces the design and comprehensive characterization of a novel three-layer metamaterial absorber, engineered to exploit the unique optical properties of gold, vanadium dioxide, and silicon dioxide. At the core of this design, silicon dioxide serves as a robust substrate that supports an intricately structured layer of gold and a top layer of vanadium dioxide. This configuration is optimized to harness and enhance absorption capabilities effectively across a broadband terahertz (THz) spectrum. The absorber demonstrates an extensive absorption bandwidth of 3.00 THz, spanning frequencies from 2.414 THz to 5.417 THz. Remarkably, throughout this range, the device maintains a consistently high absorption efficiency, exceeding 90%. This efficiency is characterized by two sharp absorption peaks located at 2.638 THz and 5.158 THz, which signify the precise tuning of the metamaterial structure to interact optimally with specific THz frequencies. The absorbance of the proposed model is almost equal to 99%. This absorber is polarization insensitive. The development of this absorber involved a series of theoretical simulations backed by experimental validations, which helped refine the metamaterial's geometry and material composition. This process illuminated the critical role of the dielectric properties of silicon dioxide and the plasmonic effects induced by gold and vanadium dioxide layers, which collectively contribute to the high-performance metrics observed.
Distributed Thompson sampling under constrained communication
In Bayesian optimization, a black-box function is maximized via the use of a surrogate model. We apply distributed Thompson sampling, using a Gaussian process as a surrogate model, to approach the multi-agent Bayesian optimization problem. In our distributed Thompson sampling implementation, each agent receives sampled points from neighbors, where the communication network is encoded in a graph; each agent utilizes a Gaussian process to model the objective function. We demonstrate a theoretical bound on Bayesian Simple Regret, where the bound depends on the size of the largest complete subgraph of the communication graph. Unlike in batch Bayesian optimization, this bound is applicable in cases where the communication graph amongst agents is constrained. When compared to sequential Thompson sampling, our bound guarantees faster convergence with respect to time as long as there is a fully connected subgraph of at least two agents. We confirm the efficacy of our algorithm with numerical simulations on traditional optimization test functions, illustrating the significance of graph connectivity on improving regret convergence.
comment: 9 pages
PEtra: A Flexible and Open-Source PE Loop Tracer for Polymer Thin-Film Transducers
Accurate characterization of ferroelectric properties in polymer piezoelectrics is critical for optimizing the performance of flexible and wearable ultrasound transducers, such as screen-printed PVDF devices. Standard charge measurement techniques, like the Sawyer-Tower circuit, often fall short when applied to ferroelectric polymers due to low-frequency leakage. In this work, we present PEtra, an open-source and versatile piezoelectric loop tracer. PEtra employs a transimpedance amplifier (LMP7721, TI) to convert picoampere-level currents into measurable voltages, covering a frequency range of 0.1 Hz to 5 Hz for a gain setting of 10^7 V/A, and 0.1 Hz to 200 Hz for gain settings from 10^3 V/A to 10^6 V/A (10-fold increments). We demonstrate through simulations and experimental validations that PEtra achieves a sensitivity down to 2 pA, effectively addressing the limitations of traditional charge measurement methods. Compared to the Sawyer-Tower circuit, PEtra directly amplifies currents without the need for a reference capacitor. As a result, it is less susceptible to leakage and can operate at lower frequencies, improving measurement accuracy and reliability. PEtra's design is fully open source, offering researchers and engineers a versatile tool to drive advancements in flexible PVDF transducer technology.
Can Transformers In-Context Learn Behavior of a Linear Dynamical System?
We investigate whether transformers can learn to track a random process when given observations of a related process and parameters of the dynamical system that relates them as context. More specifically, we consider a finite-dimensional state-space model described by the state transition matrix $F$, measurement matrices $h_1, \dots, h_N$, and the process and measurement noise covariance matrices $Q$ and $R$, respectively; these parameters, randomly sampled, are provided to the transformer along with the observations $y_1,\dots,y_N$ generated by the corresponding linear dynamical system. We argue that in such settings transformers learn to approximate the celebrated Kalman filter, and empirically verify this both for the task of estimating hidden states $\hat{x}_{N|1,2,3,...,N}$ as well as for one-step prediction of the $(N+1)^{st}$ observation, $\hat{y}_{N+1|1,2,3,...,N}$. A further study of the transformer's robustness reveals that its performance is retained even if the model's parameters are partially withheld. In particular, we demonstrate that the transformer remains accurate at the considered task even in the absence of state transition and noise covariance matrices, effectively emulating operations of the Dual-Kalman filter.
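For reference, the Kalman filter that the transformer is argued to approximate can be sketched as follows. This is a minimal illustration rather than the authors' experimental code. It assumes scalar observations of the form y_t = h_t^T x_t + noise with scalar variance R, and a prior (x0, P0) on the state one step before the first observation; these conventions and the function name are assumptions.

```python
import numpy as np

def kalman_filter(F, hs, Q, R, ys, x0, P0):
    """Filtered state estimate after processing y_1, ..., y_N.

    Assumed model: x_t = F x_{t-1} + w_t with Cov(w_t) = Q,
                   y_t = h_t^T x_t + v_t with Var(v_t) = R (scalar).
    """
    F = np.asarray(F, dtype=float)
    Q = np.asarray(Q, dtype=float)
    x = np.asarray(x0, dtype=float)
    P = np.asarray(P0, dtype=float)
    for h, y in zip(hs, ys):
        h = np.asarray(h, dtype=float)
        # Prediction step: propagate mean and covariance through F.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step: scalar innovation variance and Kalman gain.
        S = h @ P @ h + R
        K = P @ h / S
        x = x + K * (y - h @ x)
        P = P - np.outer(K, h @ P)
    return x, P

# Hypothetical static scalar state observed four times with unit noise
x_hat, P_hat = kalman_filter(F=[[1.0]], hs=[[1.0]] * 4, Q=[[0.0]], R=1.0,
                             ys=[2.0] * 4, x0=[0.0], P0=[[1.0]])
```

In this toy case the filter reduces to Bayesian averaging of the four observations with a zero-mean, unit-variance prior, giving a posterior mean of 8/5 and a posterior variance of 1/5.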
Residues in Partial Fraction Decomposition Applied to Pole Sensitivity Analysis and Root Locus Construction
Partial fraction decomposition has several applications in control and systems engineering. In this letter, we propose a new interpretation of residues in the partial fraction decomposition, which is employed for two purposes: to address the pole sensitivity problem, namely to study the speed of variation of the system poles when the control parameter changes and when the system is subject to parameter variations, and to propose a new algorithm for the construction of the root locus. The new algorithm is proven to be more efficient in terms of execution time than the dedicated MATLAB function, while providing the same output results.
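The pole sensitivity problem can be made concrete with a small numerical sketch. It is not the authors' algorithm. It assumes a unity-feedback loop with open-loop transfer function K N(s)/D(s) and simple closed-loop poles, and uses implicit differentiation of the characteristic equation D(p) + K N(p) = 0, which gives dp/dK = -N(p) / (D'(p) + K N'(p)). That expression equals minus the residue of N(s)/(D(s) + K N(s)) at p. The function name and example plant are hypothetical.

```python
import numpy as np

def pole_sensitivities(num, den, K):
    """Closed-loop poles of D(s) + K N(s) = 0 and their derivatives dp/dK.

    num and den are polynomial coefficients of N(s) and D(s), highest
    power first. The formula is valid for simple (non-repeated) poles.
    """
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    char = np.polyadd(den, K * num)   # characteristic polynomial
    poles = np.roots(char)
    dchar = np.polyder(char)          # D'(s) + K N'(s)
    sens = -np.polyval(num, poles) / np.polyval(dchar, poles)
    return poles, sens

# Hypothetical plant G(s) = 1 / (s (s + 2)) with gain K = 0.5
poles, sens = pole_sensitivities([1.0], [1.0, 2.0, 0.0], K=0.5)
```

For this plant the poles are p = -1 ± sqrt(1 - K), so the analytic sensitivities are ∓1/(2 sqrt(1 - K)), which the numerical values can be checked against. A large magnitude of `sens` flags a pole that moves quickly as K changes.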
Agent-Based Emulation for Deploying Robot Swarm Behaviors ICRA 2025
Despite significant research, robotic swarms have yet to be useful in solving real-world problems, largely due to the difficulty of creating and controlling swarming behaviors in multi-agent systems. Traditional top-down approaches in which a desired emergent behavior is produced often require complex, resource-heavy robots, limiting their practicality. This paper introduces a bottom-up alternative based on Embodied Agent-Based Modeling and Simulation, emphasizing the use of simple robots and identifying conditions that naturally lead to self-organized collective behaviors. Using the Reality-to-Simulation-to-Reality for Swarms (RSRS) process, we tightly integrate real-world experiments with simulations to reproduce known swarm behaviors and to discover a novel emergent behavior, without aiming to eliminate or even reduce the sim2real gap. This paper presents the development of an Agent-Based Embodiment and Emulation process that balances the importance of running physical swarming experiments against the prohibitively time-consuming process of setting up and running even a single experiment with 20+ robots, by leveraging low-fidelity lightweight simulations to enable hypothesis formation that guides physical experiments. We demonstrate the usefulness of our methods by emulating two known behaviors from the literature and show a third behavior 'discovered' by accident.
comment: 8 pages, 6 figures, submitted to ICRA 2025
Policies with Sparse Inter-Agent Dependencies in Dynamic Games: A Dynamic Programming Approach
Common feedback strategies in multi-agent dynamic games require all players' state information to compute control strategies. However, in real-world scenarios, sensing and communication limitations between agents make full state feedback expensive or impractical, and such strategies can become fragile when state information from other agents is inaccurate. To this end, we propose a regularized dynamic programming approach for finding sparse feedback policies that selectively depend on the states of a subset of agents in dynamic games. The proposed approach solves convex adaptive group Lasso problems to compute sparse policies approximating Nash equilibrium solutions. We prove the regularized solutions' asymptotic convergence to a neighborhood of Nash equilibrium policies in linear-quadratic (LQ) games. We extend the proposed approach to general non-LQ games via an iterative algorithm. Empirical results in multi-robot interaction scenarios show that the proposed approach effectively computes feedback policies with varying sparsity levels. When agents have noisy observations of other agents' states, simulation results indicate that the proposed regularized policies consistently achieve lower costs than standard Nash equilibrium policies by up to 77% for all interacting agents whose costs are coupled with other agents' states.
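To illustrate how a group Lasso penalty produces policies that ignore some agents entirely, the sketch below applies block soft-thresholding (the proximal operator of the group Lasso penalty) to a feedback gain matrix whose columns are grouped by the agent whose state they multiply. This is a generic illustration, not the paper's adaptive regularized dynamic programming algorithm; the name `group_soft_threshold`, the column grouping, and the weight `lam` are assumptions.

```python
import math

def group_soft_threshold(K, groups, lam):
    """Block soft-thresholding of a gain matrix K (list of rows).

    groups: list of column-index lists, one per agent (hypothetical layout).
    Each block is shrunk by the factor max(0, 1 - lam / ||block||_F), so a block
    whose Frobenius norm is at most lam is set exactly to zero, which removes
    the policy's dependence on that agent's state.
    """
    out = [row[:] for row in K]
    n_rows = len(K)
    for cols in groups:
        norm = math.sqrt(sum(K[r][c] ** 2 for r in range(n_rows) for c in cols))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for r in range(n_rows):
            for c in cols:
                out[r][c] = scale * K[r][c]
    return out

# Two agents with two state components each: the weakly coupled second block vanishes.
K = [[3.0, 4.0, 0.1, 0.1]]
K_sparse = group_soft_threshold(K, [[0, 1], [2, 3]], lam=1.0)
```

Larger values of `lam` zero out more blocks, which corresponds to the varying sparsity levels mentioned in the abstract.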
Advancements in Electric Vehicle Charging Optimization: A Survey of Reinforcement Learning Approaches
In response to global warming and energy shortages, there has been a significant shift towards integrating renewable energy sources, energy storage systems, and electric vehicles. Deploying electric vehicles within smart grids offers a promising solution to reduce carbon emissions. However, managing their charging and discharging processes as distributed power supplies presents significant challenges. Additionally, the intermittent nature of renewable energy, uncertainties in electric vehicle-related parameters, fluctuating energy prices, and varying loads make maintaining stable power system operations more complex. Effective management systems for electric vehicle battery charging are crucial to coordinating these processes and ensuring a secure, efficient, and reliable power system. Reinforcement learning, enhanced by deep learning, has gained substantial interest for its model-free approach and real-time optimization, effectively managing electric vehicle charging by maximizing cumulative rewards. This review synthesizes existing literature on reinforcement learning-based frameworks, objectives, and architectures for electric vehicle charging coordination strategies in power systems, classifying methods into centralized and decentralized categories. Additionally, the article offers suggestions for future research directions to further enhance reinforcement learning-based electric vehicle charging optimization.
comment: 6 pages, 1 Figure
Magnetic Ball Chain Robots for Cardiac Arrhythmia Treatment
This paper introduces a novel magnetic navigation system for cardiac ablation. The system is formed from two key elements: a magnetic ablation catheter consisting of a chain of spherical permanent magnets; and an actuation system comprised of two cart-mounted permanent magnets undergoing pure rotation. The catheter design enables a large magnetic content with the goal of minimizing the footprint of the actuation system for easier integration with the clinical workflow. We present a quasi-static model of the catheter, the design of the actuation units, and their control modalities. Experimental validation shows that we can use small rotating magnets (119mm diameter) to reach cardiac ablation targets while generating clinically relevant forces. Catheter control using a joystick is compared with manual catheter control. While total task completion time is similar, smoother navigation is observed using the proposed robotic system. We also demonstrate that the ball chain can ablate heart tissue and generate lesions comparable to the current clinical ablation catheters.
comment: in IEEE Transactions on Medical Robotics and Bionics, 2024
A Lyapunov-Based Switching Scheme for Selecting the Stable Closed-Loop Fixed Attitude-Error Quaternion During Flight
We present a switching scheme, which uses both the attitude-error quaternion (AEQ) and the angular-velocity error, for controlling the rotational degrees of freedom of an uncrewed aerial vehicle (UAV) during flight. In this approach, the proposed controller continually selects the stable closed-loop (CL) equilibrium AEQ corresponding to the smallest cost between those computed with two energy-based Lyapunov functions. To analyze and enforce the stability of the CL switching dynamics, we use basic nonlinear theory. This research problem is relevant because the selection of the stable CL equilibrium AEQ directly determines the power and energy requirements of the controlled UAV during flight. To test and demonstrate the implementation, suitability, functionality, and performance of the proposed approach, we present experimental results obtained using a 31-gram quadrotor, which was controlled to execute high-speed yaw maneuvers in flight. These flight tests show that the proposed switching controller can respectively reduce the control effort and rotational power by as much as 49.75 % and 28.14 %, on average, compared to those corresponding to an often-used benchmark controller.
comment: 8 pages, 5 figures, 2024 7th Iberian Robotics Conference (ROBOT)
Wireless Resource Optimization in Hybrid Semantic/Bit Communication Networks
Recently, semantic communication (SemCom) has shown great potential in significant resource savings and efficient information exchanges, thus naturally introducing a novel and practical cellular network paradigm where two modes of SemCom and conventional bit communication (BitCom) coexist. Nevertheless, the involved wireless resource management becomes rather complicated and challenging, given the unique background knowledge matching and time-consuming semantic coding requirements in SemCom. To this end, this paper jointly investigates user association (UA), mode selection (MS), and bandwidth allocation (BA) problems in a hybrid semantic/bit communication network (HSB-Net). Concretely, we first identify a unified performance metric of message throughput for both SemCom and BitCom links. Next, we specially develop a knowledge matching-aware two-stage tandem packet queuing model and theoretically derive the average packet loss ratio and queuing latency. Combined with practical constraints, we then formulate a joint optimization problem for UA, MS, and BA to maximize the overall message throughput of HSB-Net. Afterward, we propose an optimal resource management strategy by utilizing a Lagrange primal-dual transformation method and a preference list-based heuristic algorithm with polynomial-time complexity. Numerical results not only demonstrate the accuracy of our analytical queuing model, but also validate the performance superiority of our proposed strategy compared with different benchmarks.
comment: This paper has been accepted for publication by the IEEE Transactions on Communications
Efficient MPC for Emergency Evasive Maneuvers, Part II: Comparative Assessment for Hybrid Control
Optimization-based approaches such as Model Predictive Control (MPC) are promising approaches in proactive control for safety-critical applications with changing environments such as automated driving systems. However, the computational complexity of the MPC optimization problem coupled with the need for real-time control in hazardous scenarios is the main bottleneck in realization of automation levels four and five for driving systems. In this paper, we construct hybrid formulations of the nonlinear MPC problem for tracking control during emergency evasive maneuvers and assess their computational efficiency in terms of accuracy and solution time. To hybridize the MPC problem, we combine three hybrid approximations of the prediction model and four approximations of the nonlinear stability and tire saturation constraints and simulate the closed-loop behavior of the resulting controllers during five emergency maneuvers for different prediction horizons. Further, we compare the robustness of the controllers in the presence of friction uncertainty to assess the accuracy-time trade-off in cases where the friction of the road is either unknown or has an offset error with respect to the prediction model. This robustness is studied for different levels of friction uncertainty and is also investigated with respect to the proximity to the vehicle handling limits. We show that the hybridization of the MPC problem is an efficient approach for real-time implementation of MPC during emergency evasive maneuvers, paving the way for implementation of high levels of automation.
comment: 13 pages, 7 figures, submitted to Journal
Efficient MPC for Emergency Evasive Maneuvers, Part I: Hybridization of the Nonlinear Problem
Despite the extensive application of nonlinear Model Predictive Control (MPC) in automated driving, balancing its computational efficiency with respect to the control performance and constraint satisfaction remains a challenge in emergency scenarios: in such situations, sub-optimal but computationally fast responses are more valuable than optimal responses obtained after long computations. In this paper, we introduce a hybridization approach for efficient approximation of nonlinear vehicle dynamics and non-convex constraints using a hybrid systems modeling framework. Hybridization allows the nonlinear MPC problem during emergency evasive maneuvers to be reformulated as a hybrid MPC problem. In this regard, Max-Min-Plus-Scaling (MMPS) hybrid modeling is used to approximate the nonlinear vehicle dynamics. Meanwhile, different formulations for constraint approximation are presented, and various grid-generation methods are compared to solve these approximation problems. Among these, two novel grid types are introduced to structurally include the influence of the system dynamics on the grid point distributions in the state domain. Overall, the work presents and compares three hybrid models and four hybrid constraints for efficient MPC synthesis and offers guidelines for implementation of the presented hybridization framework in other applications.
comment: 13 pages, 7 figures, submitted to journal
Wireless Human-Machine Collaboration in Industry 5.0
Wireless Human-Machine Collaboration (WHMC) represents a critical advancement for Industry 5.0, enabling seamless interaction between humans and machines across geographically distributed systems. As the WHMC systems become increasingly important for achieving complex collaborative control tasks, ensuring their stability is essential for practical deployment and long-term operation. Stability analysis certifies how the closed-loop system will behave under model randomness, which is essential for systems operating with wireless communications. However, the fundamental stability analysis of the WHMC systems remains an unexplored challenge due to the intricate interplay between the stochastic nature of wireless communications, dynamic human operations, and the inherent complexities of control system dynamics. This paper establishes a fundamental WHMC model incorporating dual wireless loops for machine and human control. Our framework accounts for practical factors such as short-packet transmissions, fading channels, and advanced HARQ schemes. We model human control lag as a Markov process, which is crucial for capturing the stochastic nature of human interactions. Building on this model, we propose a stochastic cycle-cost-based approach to derive a stability condition for the WHMC system, expressed in terms of wireless channel statistics, human dynamics, and control parameters. Our findings are validated through extensive numerical simulations and a proof-of-concept experiment, where we developed and tested a novel wireless collaborative cart-pole control system. The results confirm the effectiveness of our approach and provide a robust framework for future research on WHMC systems in more complex environments.
comment: This work has been submitted to the IEEE for possible publication
A New Framework for Nonlinear Kalman Filters
The Kalman filter (KF) is a state estimation algorithm that optimally combines system knowledge and measurements to minimize the mean squared error of the estimated states. While KF was initially designed for linear systems, numerous extensions of it, such as extended Kalman filter (EKF), unscented Kalman filter (UKF), cubature Kalman filter (CKF), etc., have been proposed for nonlinear systems. Although different types of nonlinear KFs have different pros and cons, they all use the same framework of linear KF, which, according to what we found in this paper, tends to give overconfident and less accurate state estimations when the measurement functions are nonlinear. Therefore, in this study, we designed a new framework for nonlinear KFs and showed theoretically and empirically that the new framework estimates the states and covariance matrix more accurately than the old one. The new framework was tested on four different nonlinear KFs and five different tasks, showcasing its ability to reduce the estimation errors by several orders of magnitude in low-measurement-noise conditions, with only about a 10 to 90% increase in computational time. All types of nonlinear KFs can benefit from the new framework, and the benefit will increase as the sensors become more and more accurate in the future. As an example, EKF, the simplest nonlinear KF that was previously believed to work poorly for strongly nonlinear systems, can now provide fast and fairly accurate state estimations with the help of the new framework. The codes are available at https://github.com/Shida-Jiang/A-new-framework-for-nonlinear-Kalman-filters.
comment: Some typo fixed
Data-Driven Dynamics Modeling of Miniature Robotic Blimps Using Neural ODEs With Parameter Auto-Tuning
Miniature robotic blimps, as one type of lighter-than-air aerial vehicles, have attracted increasing attention in the science and engineering community for their enhanced safety, extended endurance, and quieter operation compared to quadrotors. Accurately modeling the dynamics of these robotic blimps poses a significant challenge due to the complex aerodynamics stemming from their large lifting bodies. Traditional first-principle models have difficulty obtaining accurate aerodynamic parameters and often overlook high-order nonlinearities, and thus reach their limits in modeling the motion dynamics of miniature robotic blimps. To tackle this challenge, this letter proposes the Auto-tuning Blimp-oriented Neural Ordinary Differential Equation method (ABNODE), a data-driven approach that integrates first-principle and neural network modeling. Spiraling motion experiments of robotic blimps are conducted, comparing the ABNODE with first-principle and other data-driven benchmark models, the results of which demonstrate the effectiveness of the proposed method.
comment: 8 pages, 8 figures
Data-informed modeling of the formation, persistence, and evolution of social norms and conventions
Social norms and conventions are commonly accepted and adopted behaviors and practices within a social group that guide interactions -- e.g., how to spell a word or how to greet people -- and are central to a group's culture and identity. Understanding the key mechanisms that govern the formation, persistence, and evolution of social norms and conventions in social communities is a problem of paramount importance for a broad range of real-world applications, spanning from preparedness for future emergencies to promotion of sustainable practices. In the past decades, mathematical modeling has emerged as a powerful tool to reproduce and study the complex dynamics of norm and convention change, gaining insights into their mechanisms, and ultimately deriving tools to predict their evolution. The first goal of this chapter is to introduce some of the main mathematical approaches for modeling social norms and conventions, including population models and agent-based models relying on the theories of dynamical systems, evolutionary dynamics, and game theory. The second goal of the chapter is to illustrate how quantitative observations and empirical data can be incorporated into these mathematical models in a systematic manner, establishing a data-based approach to mathematical modeling of formation, persistence, and evolution of social norms and conventions. Finally, current challenges and future opportunities in this growing field of research are discussed.
comment: This is an author's (preprint) version of a book chapter that is part of the Handbook of Visual, Experimental and Computational Mathematics - Bridges through Data
Mathematical Optimization of Resolution Improvement in Structured Light data by Periodic Scanning Motion: Application for Feedback during Lunar Landing
This research explores the enhancement of lunar landing precision through an advanced structured light system, integrating machine learning, Iterative Learning Control (ILC) and Structured Illumination Microscopy (SIM) techniques. By employing Moire fringe patterns for high-precision scanning maneuvers, the study addresses the limitations of conventional structured light systems. A nonlinear mathematical optimization model is developed to refine the world model, optimizing oscillation frequency and amplitude to improve resolution. The findings suggest that this approach can double the conventional resolution, promising significant advancements in the accuracy of lunar landings, with potential real-time application.
comment: 5 pages, 1 figure
Revisiting the Optimal PMU Placement Problem in Multi-Machine Power Networks
To provide real-time visibility of physics-based states, phasor measurement units (PMUs) are deployed throughout power networks. PMU data enable real-time grid monitoring and control -- and are essential in transitioning to smarter grids. Various considerations are taken into account when determining the geographic, optimal PMU placements (OPP). This paper focuses on the control-theoretic, observability aspect of OPP. A myriad of studies have investigated observability-based formulations to determine the OPP within a transmission network. However, they have mostly adopted a simplified representation of system dynamics, ignored the basic algebraic equations that model power flows, disregarded renewables such as solar and wind, and did not model their uncertainty. Consequently, this paper revisits the observability-based OPP problem by addressing the literature's limitations. A nonlinear differential algebraic representation (NDAE) of the power system is considered. The system is discretized using various discretization approaches while explicitly accounting for uncertainty. A moving horizon estimation approach is explored to reconstruct the joint differential and algebraic initial states of the system, as a gateway to the OPP problem, which is then formulated as a computationally tractable integer program (IP). Comprehensive numerical simulations on standard power networks are conducted to validate the different aspects of this approach and test its robustness to various dynamical conditions.
Competency-Aware Planning for Probabilistically Safe Navigation Under Perception Uncertainty
Perception-based navigation systems are useful for unmanned ground vehicle (UGV) navigation in complex terrains, where traditional depth-based navigation schemes are insufficient. However, these data-driven methods are highly dependent on their training data and can fail in surprising and dramatic ways with little warning. To ensure the safety of the vehicle and the surrounding environment, it is imperative that the navigation system is able to recognize the predictive uncertainty of the perception model and respond safely and effectively in the face of uncertainty. In an effort to enable safe navigation under perception uncertainty, we develop a probabilistic and reconstruction-based competency estimation (PaRCE) method to estimate the model's level of familiarity with an input image as a whole and with specific regions in the image. We find that the overall competency score can reliably predict whether a sample is correctly classified, misclassified, or out-of-distribution (OOD). We also confirm that the regional competency maps can accurately distinguish between familiar and unfamiliar regions across images. We then use this competency information to develop a planning and control scheme that enables effective navigation while maintaining a low probability of error. We find that the competency-aware scheme greatly reduces the number of collisions with unfamiliar obstacles, compared to a baseline controller with no competency awareness. Furthermore, the regional competency information is very valuable in enabling efficient navigation.
Experimenting under Stochastic Congestion
We study randomized experiments in a service system when stochastic congestion can arise from temporarily limited supply or excess demand. Such congestion gives rise to cross-unit interference between the waiting customers, and analytic strategies that do not account for this interference may be biased. In current practice, one of the most widely used ways to address stochastic congestion is to use switchback experiments that alternatively turn a target intervention on and off for the whole system. We find, however, that under a queueing model for stochastic congestion, the standard way of analyzing switchbacks is inefficient, and that estimators that leverage the queueing model can be materially more accurate. Additionally, we show how the queueing model enables estimation of total policy gradients from unit-level randomized experiments, thus giving practitioners an alternative experimental approach they can use without needing to pre-commit to a fixed switchback length before data collection.
Robotics
GRS: Generating Robotic Simulation Tasks from Real-World Images
We introduce GRS (Generating Robotic Simulation tasks), a novel system to address the challenge of real-to-sim in robotics, computer vision, and AR/VR. GRS enables the creation of digital twin simulations from single real-world RGB-D observations, complete with diverse, solvable tasks for virtual agent training. We use state-of-the-art vision-language models (VLMs) to achieve a comprehensive real-to-sim pipeline. GRS operates in three stages: 1) scene comprehension using SAM2 for object segmentation and VLMs for object description, 2) matching identified objects with simulation-ready assets, and 3) generating contextually appropriate robotic tasks. Our approach ensures simulations align with task specifications by generating test suites designed to verify adherence to the task specification. We introduce a router that iteratively refines the simulation and test code to ensure the simulation is solvable by a robot policy while remaining aligned to the task specification. Our experiments demonstrate the system's efficacy in accurately identifying object correspondence, which allows us to generate task environments that closely match input environments, and enhance automated simulation task generation through our novel router mechanism.
Quasi-Static Continuum Model of Octopus-Like Soft Robot Arm Under Water Actuated by Twisted and Coiled Artificial Muscles (TCAMs)
The current work is a qualitative study that aims to explore the implementation of Twisted and Coiled Artificial Muscles (TCAMs) for actuating and replicating the bending motion of an octopus-like soft robot arm underwater. Additionally, it investigates the impact of hydrostatic and dynamic forces from steady-state fluid flow on the arm's motion. The artificial muscles are lightweight and low-cost actuators that generate a high power-to-weight ratio, producing tensile force up to 12,600 times their own weight, which is close to the functionality of biological muscles. The "extended" Cosserat theory of rods is employed to formulate a quasi-static continuum model of arm motion, where the arm's cross-section is not only capable of rigid rotation but also deforms within its plane. This planar deformation of the arm cross-section aligns with the biological behavior of the octopus arm, where the stiffness of the hydrostat is directly induced by the incompressibility of the tissues. In line with the main goal, a constitutive model is derived for the material of the octopus arm to capture its characteristic behavior.
comment: 12 pages, Under review at the journal "Robotics Reports"
Generative AI Agents in Autonomous Machines: A Safety Perspective
The integration of Generative Artificial Intelligence (AI) into autonomous machines represents a major paradigm shift in how these systems operate and unlocks new solutions to problems once deemed intractable. Although generative AI agents provide unparalleled capabilities, they also have unique safety concerns. These challenges require robust safeguards, especially for autonomous machines that operate in high-stakes environments. This work investigates the evolving safety requirements when generative models are integrated as agents into physical autonomous machines, comparing these to safety considerations in less critical AI applications. We explore the challenges and opportunities to ensure the safe deployment of generative AI-driven autonomous machines. Furthermore, we provide a forward-looking perspective on the future of AI-driven autonomous systems and emphasize the importance of evaluating and communicating safety risks. As an important step towards addressing these concerns, we recommend the development and implementation of comprehensive safety scorecards for the use of generative AI technologies in autonomous machines.
Evaluating Transferable Emotion Expressions for Zoomorphic Social Robots using VR Prototyping
Zoomorphic robots have the potential to offer companionship and well-being as accessible, low-maintenance alternatives to pet ownership. Many such robots, however, feature limited emotional expression, restricting their potential for rich affective relationships with everyday domestic users. Additionally, exploring this design space using hardware prototyping is obstructed by physical and logistical constraints. We leveraged virtual reality rapid prototyping with passive haptic interaction to conduct a broad mixed-methods evaluation of emotion expression modalities and participatory prototyping of multimodal expressions. We found differences in recognisability, effectiveness and user empathy between modalities while highlighting the importance of facial expressions and the benefits of combining animal-like and unambiguous modalities. We use our findings to inform promising directions for the affective zoomorphic robot design and potential implementations via hardware modification or augmented reality, then discuss how VR prototyping makes this field more accessible to designers and researchers.
comment: 10 pages, 9 figures, accepted to 23rd IEEE International Symposium on Mixed and Augmented Reality (ISMAR 2024)
AssemblyComplete: 3D Combinatorial Construction with Deep Reinforcement Learning
A critical goal in robotics and autonomy is to teach robots to adapt to real-world collaborative tasks, particularly in automatic assembly. The ability of a robot to understand the original intent of an incomplete assembly and complete missing features without human instruction is valuable but challenging. This paper introduces 3D combinatorial assembly completion, which is demonstrated using combinatorial unit primitives (i.e., Lego bricks). Combinatorial assembly is challenging due to the possible assembly combinations and complex physical constraints (e.g., no brick collisions, structure stability, inventory constraints, etc.). To address these challenges, we propose a two-part deep reinforcement learning (DRL) framework that tackles teaching the robot to understand the objective of an incomplete assembly and learning a construction policy to complete the assembly. The robot queries a stable object library to facilitate assembly inference and guide learning. In addition to the robot policy, an action mask is developed to rule out invalid actions that violate physical constraints for object-oriented construction. We demonstrate the proposed framework's feasibility and robustness in a variety of assembly scenarios in which the robot satisfies real-life assembly with respect to both solution and runtime quality. Furthermore, results demonstrate that the proposed framework effectively infers and assembles incomplete structures for unseen and unique object types.
comment: Submitted to 2025 American Control Conference (ACC)
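The action-masking idea mentioned in the abstract can be sketched as a masked softmax over policy logits: actions flagged as invalid (for example, placements that would cause a brick collision) receive zero probability before sampling. This is a generic sketch of the technique, not the paper's implementation; the function name `masked_softmax` and the boolean `valid` list are assumptions.

```python
import math

def masked_softmax(logits, valid):
    """Turn policy logits into action probabilities, excluding invalid actions.

    logits: list of floats, one per candidate action.
    valid: list of bools, True where the action satisfies the physical constraints.
    Invalid actions get probability exactly 0; the rest are renormalized.
    """
    if not any(valid):
        raise ValueError("at least one action must be valid")
    # Replace logits of invalid actions with -inf so that exp() maps them to 0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, valid)]
    m = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(l - m) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Three candidate placements, the second one ruled out by the mask.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

Masking before normalization keeps the policy from ever sampling a constraint-violating action, instead of relying on a reward penalty to discourage it.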
EVA: An Embodied World Model for Future Video Anticipation
World models integrate raw data from various modalities, such as images and language, to simulate comprehensive interactions in the world, thereby playing crucial roles in fields like mixed reality and robotics. Yet, applying the world model for accurate video prediction is quite challenging due to the complex and dynamic intentions of the various scenes in practice. In this paper, inspired by the human rethinking process, we decompose the complex video prediction into four meta-tasks that enable the world model to handle this issue in a more fine-grained manner. Alongside these tasks, we introduce a new benchmark named Embodied Video Anticipation Benchmark (EVA-Bench) to provide a well-rounded evaluation. EVA-Bench focuses on evaluating the video prediction ability of human and robot actions, presenting significant challenges for both the language model and the generation model. Targeting embodied video prediction, we propose the Embodied Video Anticipator (EVA), a unified framework aiming at video understanding and generation. EVA integrates a video generation model with a visual language model, effectively combining reasoning capabilities with high-quality generation. Moreover, to enhance the generalization of our framework, we design a tailored multi-stage pretraining paradigm that adaptively ensembles LoRA to produce high-fidelity results. Extensive experiments on EVA-Bench highlight the potential of EVA to significantly improve performance in embodied scenes, paving the way for large-scale pre-trained models in real-world prediction tasks.
Lie Theory Based Optimization for Unified State Planning of Mobile Manipulators
Mobile manipulators are finding use in numerous practical applications. The current issues with mobile manipulation are the large state space owing to the mobile base and the challenge of modeling high degree of freedom systems. It is critical to devise fast and accurate algorithms that generate smooth motion plans for such mobile manipulators. Existing techniques attempt to solve this problem but focus on separating the motion of the base and manipulator. We propose an approach using Lie theory to find the inverse kinematic constraints by converting the kinematic model, created using screw coordinates, between its Lie group and vector representation. An optimization function is devised to solve for the desired joint states of the entire mobile manipulator. This allows the motion of the mobile base and manipulator to be planned and applied in unison resulting in a smooth and accurate motion plan. The performance of the proposed state planner is validated on simulated mobile manipulators in an analytical experiment. Our solver is available with further derivations and results at https://github.com/peleito/slithers.
comment: 8 pages, 9 figures, conference submission
An Agile Large-Workspace Teleoperation Interface Based on Human Arm Motion and Force Estimation
Teleoperation can transfer human perception and cognition to a slave robot to cope with some complex tasks, in which the agility and flexibility of the interface play an important role in mapping human intention to the robot. In this paper, we developed an agile large-workspace teleoperation interface by estimating human arm behavior. Using wearable sensors, namely an inertial measurement unit and a surface electromyography armband, we can capture the human arm motion and force information, thereby intuitively controlling the manipulation of the robot. The control principle of our wearable interface includes two parts: (1) the arm incremental kinematics and (2) the grasping recognition. Moreover, we developed a teleoperation framework with a time synchronization mechanism for real-time application. We conducted experimental comparisons with a versatile haptic device (Omega 7) to verify the effectiveness of our interface and framework. Seven subjects were invited to complete three different tasks: free motion, handover, and pick-and-place action (each task ten times), for a total of 420 tests. Objectively, we used the task completion time and success rate to compare the performance of the two interfaces quantitatively. In addition, to quantify the operator experience, we used the NASA Task Load Index to assess their subjective feelings. The results showed that the proposed interface achieved competitive performance with a better operating experience.
comment: 6 pages, 8 figures, accepted by 2024 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024)
Evaluation of Human-Robot Interfaces based on 2D/3D Visual and Haptic Feedback for Aerial Manipulation
Most telemanipulation systems for aerial robots provide the operator with only 2D screen visual information. The lack of richer information about the robot's status and environment can limit human awareness and, in turn, task performance. While the pilot's experience can often compensate for this reduced flow of information, providing richer feedback is expected to reduce the cognitive workload and offer a more intuitive experience overall. This work aims to understand the significance of providing additional pieces of information during aerial telemanipulation, namely (i) 3D immersive visual feedback about the robot's surroundings through mixed reality (MR) and (ii) 3D haptic feedback about the robot interaction with the environment. To do so, we developed a human-robot interface able to provide this information. First, we demonstrate its potential in a real-world manipulation task requiring sub-centimeter-level accuracy. Then, we evaluate the individual effect of MR vision and haptic feedback on both dexterity and workload through a human subjects study involving a virtual block transportation task. Results show that both 3D MR vision and haptic feedback improve the operator's dexterity in the considered teleoperated aerial interaction tasks. Nevertheless, pilot experience remains the most significant factor.
comment: 12 pages, 11 figures, journal paper
A Semi-decentralized and Variational-Equilibrium-Based Trajectory Planner for Connected and Autonomous Vehicles
This paper designs a novel trajectory planning approach to resolve the computational efficiency and safety problems in uncoordinated methods by exploiting vehicle-to-everything (V2X) technology. The trajectory planning for connected and autonomous vehicles (CAVs) is formulated as a game with coupled safety constraints. We then define interaction-fair trajectories and prove that they correspond to the variational equilibrium (VE) of this game. We propose a semi-decentralized planner for the vehicles to seek VE-based fair trajectories, which can significantly improve computational efficiency through parallel computing among CAVs and enhance the safety of planned trajectories by ensuring equilibrium concordance among CAVs. Finally, experimental results show the advantages of the approach, including fast computation speed, high scalability, equilibrium concordance, and safety.
DynaVINS++: Robust Visual-Inertial State Estimator in Dynamic Environments by Adaptive Truncated Least Squares and Stable State Recovery
Despite extensive research in robust visual-inertial navigation systems (VINS) in dynamic environments, many approaches remain vulnerable to objects that suddenly start moving, which are referred to as abruptly dynamic objects. In addition, most approaches have considered the effect of dynamic objects only at the feature association level. In this study, we observed that the state estimation diverges when errors from false correspondences owing to moving objects incorrectly propagate into the IMU bias terms. To overcome these problems, we propose a robust VINS framework called DynaVINS++, which employs a) an adaptive truncated least squares method that adaptively adjusts the truncation range using both feature association and IMU preintegration to effectively minimize the effect of the dynamic objects while reducing the computational cost, and b) stable state recovery with a bias consistency check to correct misestimated IMU bias and to prevent the divergence caused by abruptly dynamic objects. As verified in both public and real-world datasets, our approach shows promising performance in dynamic environments, including scenes with abruptly dynamic objects.
comment: 8 pages, 7 figures. S. Song, H. Lim, A. J. Lee and H. Myung, "DynaVINS++: Robust Visual-Inertial State Estimator in Dynamic Environments by Adaptive Truncated Least Squares and Stable State Recovery," in IEEE Robotics and Automation Letters, vol. 9, no. 10, pp. 9127-9134, Oct. 2024
Integrated Design and Control of a Robotic Arm on a Quadcopter for Enhanced Package Delivery
This paper presents a comprehensive design process for the integration of a robotic arm into a quadcopter, emphasizing the physical modeling, system integration, and controller development. Utilizing SolidWorks for mechanical design and MATLAB Simscape for simulation and control, this study addresses the challenges encountered in integrating the robotic arm with the drone, encompassing both mechanical and control aspects. Two types of controllers are developed and analyzed: a Proportional-Integral-Derivative (PID) controller and a Model Reference Adaptive Controller (MRAC). The design and tuning of these controllers are key components of this research, with the focus on their application in package delivery tasks. Extensive simulations demonstrate the performance of each controller, with PID controllers exhibiting superior trajectory tracking and lower Root Mean Square (RMS) errors under various payload conditions. The results underscore the efficacy of PID control for stable flight and precise maneuvering, while highlighting the adaptability of MRAC to changing dynamics.
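For readers unfamiliar with the PID controller named in the abstract above, the following is a minimal textbook sketch in discrete time. It is not the paper's controller or model: the plant is an assumed one-dimensional unit-mass point driven by a force, and the gains, time step, and the names `PID` and `simulate` are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass

@dataclass
class PID:
    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint, measurement, dt):
        """Return the control output for one sample of length dt."""
        error = setpoint - measurement
        self.integral += error * dt
        # Finite-difference derivative of the error; prev_error starts at 0,
        # so the first sample produces a large "derivative kick".
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate(pid, setpoint=1.0, dt=0.01, steps=2000):
    """Drive an assumed unit-mass point (double integrator) to the setpoint."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        force = pid.step(setpoint, pos, dt)
        vel += force * dt  # unit mass: acceleration equals force
        pos += vel * dt
    return pos

# Hypothetical gains chosen so the toy closed loop is stable.
final_pos = simulate(PID(kp=20.0, ki=1.0, kd=8.0))
```

With these assumed gains the simulated position settles close to the setpoint of 1.0 within the 20 simulated seconds; a real quadcopter-arm system would need a far richer model and tuning.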
Large Language Models for Autonomous Driving (LLM4AD): Concept, Benchmark, Simulation, and Real-Vehicle Experiment
With the broader usage and highly successful development of Large Language Models (LLMs), there has been a growth of interest and demand for applying LLMs to autonomous driving technology. Driven by their natural language understanding and reasoning ability, LLMs have the potential to enhance various aspects of autonomous driving systems, from perception and scene understanding to language interaction and decision-making. In this paper, we first introduce novel concepts and approaches to designing LLMs for autonomous driving (LLM4AD). Then, we propose a comprehensive benchmark for evaluating the instruction-following abilities of LLMs within the autonomous driving domain. Furthermore, we conduct a series of experiments on both simulation and real-world vehicle platforms, thoroughly evaluating the performance and potential of our LLM4AD systems. Our research highlights the significant potential of LLMs to enhance various aspects of autonomous vehicle technology, from perception and scene understanding to language interaction and decision-making.
An Image-Guided Robotic System for Transcranial Magnetic Stimulation: System Development and Experimental Evaluation
Transcranial magnetic stimulation (TMS) is a noninvasive medical procedure that can modulate brain activity, and it is widely used in neuroscience and neurology research. Compared to manual operators, robots may improve the outcome of TMS due to their superior accuracy and repeatability. However, there has not been a widely accepted standard protocol for performing robotic TMS using fine-segmented brain images, resulting in arbitrary planned angles with respect to the true boundaries of the modulated cortex. Given that a recent study in TMS simulation suggests a noticeable difference in outcomes when using different anatomical details, cortical shape should play a more significant role in deciding the optimal TMS coil pose. In this work, we introduce an image-guided robotic system for TMS that focuses on (1) establishing standardized planning methods and heuristics to define a reference (true zero) for the coil poses and (2) solving the issue that manual coil placement requires expert hand-eye coordination, which often leads to low repeatability of the experiments. To validate the design of our robotic system, a phantom study and a preliminary human subject study were performed. Our results show that the robotic method can halve the positional error and improve the rotational accuracy by up to two orders of magnitude. The accuracy is shown to be repeatable because the standard deviation of multiple trials is lowered by an order of magnitude. The improved actuation accuracy successfully translates to the TMS application, with a higher and more stable induced voltage in magnetic field sensors.
comment: This work has been submitted to the IEEE for possible publication
ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
Reward shaping is a critical component in reinforcement learning (RL), particularly for complex tasks where sparse rewards can hinder learning. While shaping rewards have been introduced to provide additional guidance, selecting effective shaping functions remains challenging and computationally expensive. This paper introduces Online Reward Selection and Policy Optimization (ORSO), a novel approach that frames shaping reward selection as an online model selection problem. ORSO employs principled exploration strategies to automatically identify promising shaping reward functions without human intervention, balancing exploration and exploitation with provable regret guarantees. We demonstrate ORSO's effectiveness across various continuous control tasks using the Isaac Gym simulator. Compared to traditional methods that fully evaluate each shaping reward function, ORSO significantly improves sample efficiency, reduces computational time, and consistently identifies high-quality reward functions that produce policies comparable to those generated by domain experts through hand-engineered rewards.
comment: preprint, 35 pages, 23 figures
Safety-Critical Formation Control of Non-Holonomic Multi-Robot Systems in Communication-Limited Environments
This paper presents a novel estimator-based safety-critical controller for formation control of non-holonomic mobile robots in communication-limited environments. The proposed decentralized framework integrates a robust state estimator with a formation tracking control law, addressing the challenges of inter-agent collision avoidance and disturbance attenuation in leader-follower formations using control barrier functions. The estimator's design accounts for both constant and time-varying velocity profiles, enhancing the system's adaptability to dynamic scenarios. A closed-form solution for the tracking controller facilitates efficient implementation while maintaining formation integrity. The incorporation of string stability metrics further reinforces the framework's resilience against propagating disturbances from predecessors. Rigorous stability analysis using Lyapunov functions ensures the stability of estimation errors and the convergence of the formation to desired configurations. The effectiveness and robustness of the proposed approach are validated through numerical simulations of various maneuvers and realistic Gazebo experiments involving formations in a warehouse environment. The results demonstrate the controller's ability to maintain safety, achieve precise formation control, and mitigate disturbances in scenarios without inter-robot communication.
comment: Under review
Octopus: Embodied Vision-Language Programmer from Environmental Feedback
Large vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning. When integrated into an embodied agent, existing embodied VLM works either output detailed action sequences at the manipulation level or only provide plans at an abstract level, leaving a gap between high-level planning and real-world manipulation. To bridge this gap, we introduce Octopus, an embodied vision-language programmer that uses executable code generation as a medium to connect planning and manipulation. Octopus is designed to 1) proficiently comprehend an agent's visual and textual task objectives, 2) formulate intricate action sequences, and 3) generate executable code. To facilitate Octopus model development, we introduce OctoVerse: a suite of environments tailored for benchmarking vision-based code generators on a wide spectrum of tasks, ranging from mundane daily chores in simulators to sophisticated interactions in complex video games such as Grand Theft Auto (GTA) and Minecraft. To train Octopus, we leverage GPT-4 to control an explorative agent that generates training data, i.e., action blueprints and corresponding executable code. We also collect feedback that enables an enhanced training scheme called Reinforcement Learning with Environmental Feedback (RLEF). Through a series of experiments, we demonstrate Octopus's functionality and present compelling results, showing that the proposed RLEF refines the agent's decision-making. By open-sourcing our simulation environments, dataset, and model architecture, we aspire to ignite further innovation and foster collaborative applications within the broader embodied AI community.
comment: Project Page: https://choiszt.github.io/Octopus/, Codebase: https://github.com/dongyh20/Octopus
DTG: Diffusion-based Trajectory Generation for Mapless Global Navigation
We present a novel end-to-end diffusion-based trajectory generation method, DTG, for mapless global navigation in challenging outdoor scenarios with occlusions and unstructured off-road features like grass, buildings, bushes, etc. Given a distant goal, our approach computes a trajectory that satisfies the following goals: (1) minimize the travel distance to the goal; (2) maximize the traversability by choosing paths that do not lie in undesirable areas. Specifically, we present a novel Conditional RNN (CRNN) for diffusion models to efficiently generate trajectories. Furthermore, we propose an adaptive training method that ensures that the diffusion model generates more traversable trajectories. We evaluate our methods in various outdoor scenes and compare the performance with other global navigation algorithms on a Husky robot. In practice, we observe at least a 15% improvement in traveling distance and around a 7% improvement in traversability.
comment: 10 pages
Rapid and Robust Trajectory Optimization for Humanoids
Performing trajectory design for humanoid robots with high degrees of freedom is computationally challenging. The trajectory design process also often involves carefully selecting various hyperparameters and requires a good initial guess which can further complicate the development process. This work introduces a generalized gait optimization framework that directly generates smooth and physically feasible trajectories. The proposed method demonstrates faster and more robust convergence than existing techniques and explicitly incorporates closed-loop kinematic constraints that appear in many modern humanoids. The method is implemented as an open-source C++ codebase which can be found at https://roahmlab.github.io/RAPTOR/.
ViSaRL: Visual Reinforcement Learning Guided by Human Saliency
Training robots to perform complex control tasks from high-dimensional pixel input using reinforcement learning (RL) is sample-inefficient, because image observations are comprised primarily of task-irrelevant information. By contrast, humans are able to visually attend to task-relevant objects and areas. Based on this insight, we introduce Visual Saliency-Guided Reinforcement Learning (ViSaRL). Using ViSaRL to learn visual representations significantly improves the success rate, sample efficiency, and generalization of an RL agent on diverse tasks, including the DeepMind Control benchmark and robot manipulation both in simulation and on a real robot. We present approaches for incorporating saliency into both CNN and Transformer-based encoders. We show that visual representations learned using ViSaRL are robust to various sources of visual perturbations, including perceptual noise and scene variations. ViSaRL nearly doubles the success rate on the real-robot tasks compared to the baseline, which does not use saliency.
ORLA*: Mobile Manipulator-Based Object Rearrangement with Lazy A Star ICRA 2025
Effectively performing object rearrangement is an essential skill for mobile manipulators, e.g., setting up a dinner table or organizing a desk. A key challenge in such problems is deciding an appropriate manipulation order for objects to effectively untangle dependencies between objects while considering the necessary motions for realizing the manipulations (e.g., pick and place). To our knowledge, computing time-optimal multi-object rearrangement solutions for mobile manipulators remains a largely untapped research direction. In this research, we propose ORLA*, which leverages delayed (lazy) evaluation in searching for a high-quality object pick and place sequence that considers both end-effector and mobile robot base travel. ORLA* also supports multi-layered rearrangement tasks considering pile stability using machine learning. Employing an optimal solver for finding temporary locations for displacing objects, ORLA* can achieve global optimality. Through extensive simulation and ablation study, we confirm the effectiveness of ORLA* delivering quality solutions for challenging rearrangement instances. Supplementary materials are available at: https://gaokai15.github.io/ORLA-Star/
comment: Submitted to ICRA 2025
Robust RL with LLM-Driven Data Synthesis and Policy Adaptation for Autonomous Driving
The integration of Large Language Models (LLMs) into autonomous driving systems demonstrates strong common sense and reasoning abilities, effectively addressing the pitfalls of purely data-driven methods. Current LLM-based agents require lengthy inference times and face challenges in interacting with real-time autonomous driving environments. A key open question is whether we can effectively leverage the knowledge from LLMs to train an efficient and robust Reinforcement Learning (RL) agent. This paper introduces RAPID, a novel Robust Adaptive Policy Infusion and Distillation framework, which trains specialized mix-of-policy RL agents using data synthesized by an LLM-based driving agent and online adaptation. RAPID features three key designs: 1) utilization of offline data collected from an LLM agent to distil expert knowledge into RL policies for faster real-time inference; 2) introduction of robust distillation in RL to inherit both performance and robustness from the LLM-based teacher; and 3) employment of a mix-of-policy approach for joint decision decoding with a policy adapter. Through fine-tuning via online environment interaction, RAPID reduces the forgetting of LLM knowledge while maintaining adaptability to different tasks. Extensive experiments demonstrate RAPID's capability to effectively integrate LLM knowledge into scaled-down RL policies in an efficient, adaptable, and robust way. Code and checkpoints will be made publicly available upon acceptance.
3D-BBS: Global Localization for 3D Point Cloud Scan Matching Using Branch-and-Bound Algorithm ICRA2024
This paper presents an accurate and fast 3D global localization method, 3D-BBS, that extends the existing branch-and-bound (BnB)-based 2D scan matching (BBS) algorithm. To reduce memory consumption, we utilize a sparse hash table for storing hierarchical 3D voxel maps. To reduce the processing cost of BBS in 3D space, we propose an efficient roto-translational space branching. Furthermore, we devise a batched BnB algorithm to fully leverage GPU parallel processing. Through experiments in simulated and real environments, we demonstrated that 3D-BBS enabled accurate global localization with only a 3D LiDAR scan roughly aligned in the gravity direction and a 3D pre-built map. This method required only 878 ms on average to perform global localization and outperformed state-of-the-art global registration methods in terms of accuracy and processing speed.
comment: IEEE International Conference on Robotics and Automation (ICRA2024)
Mitigating Side Effects in Multi-Agent Systems Using Blame Assignment
When independently trained or designed robots are deployed in a shared environment, their combined actions can lead to unintended negative side effects (NSEs). To ensure safe and efficient operation, robots must optimize task performance while minimizing the penalties associated with NSEs, balancing individual objectives with collective impact. We model the problem of mitigating NSEs in a cooperative multi-agent system as a bi-objective lexicographic decentralized Markov decision process. We assume independence of transitions and rewards with respect to the robots' tasks, but the joint NSE penalty creates a form of dependence in this setting. To improve scalability, the joint NSE penalty is decomposed into individual penalties for each robot using credit assignment, which facilitates decentralized policy computation. We empirically demonstrate, using mobile robots and in simulation, the effectiveness and scalability of our approach in mitigating NSEs.
comment: 8 pages, 5 figures
Granger Causal Interaction Skill Chains
Reinforcement Learning (RL) has demonstrated promising results in learning policies for complex tasks, but it often suffers from low sample efficiency and limited transferability. Hierarchical RL (HRL) methods aim to address the difficulty of learning long-horizon tasks by decomposing policies into skills, abstracting states, and reusing skills in new tasks. However, many HRL methods require some initial task success to discover useful skills, which paradoxically may be very unlikely without access to useful skills. On the other hand, reward-free HRL methods often need to learn far too many skills to achieve proper coverage in high-dimensional domains. In contrast, we introduce the Chain of Interaction Skills (COInS) algorithm, which focuses on controllability in factored domains to identify a small number of task-agnostic skills that still permit a high degree of control. COInS uses learned detectors to identify interactions between state factors and then trains a chain of skills to control each of these factors successively. We evaluate COInS on a robotic pushing task with obstacles -- a challenging domain where other RL and HRL methods fall short. We also demonstrate the transferability of skills learned by COInS, using variants of Breakout, a common RL benchmark, and show 2-3x improvement in both sample efficiency and final performance compared to standard RL baselines.
comment: Accepted TMLR 2024
Multiagent Systems
Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
As machine intelligence evolves, the need to test and compare the problem-solving abilities of different AI models grows. However, current benchmarks are often overly simplistic, allowing models to perform uniformly well, making it difficult to distinguish their capabilities. Additionally, benchmarks typically rely on static question-answer pairs, which models might memorize or guess. To address these limitations, we introduce the Dynamic Intelligence Assessment (DIA), a novel methodology for testing AI models using dynamic question templates and improved metrics across multiple disciplines such as mathematics, cryptography, cybersecurity, and computer science. The accompanying DIA-Bench dataset, which includes 150 diverse and challenging task templates with mutable parameters, is presented in various formats such as text, PDFs, compiled binaries, and visual puzzles. Our framework introduces four new metrics to assess a model's reliability and confidence across multiple attempts. These metrics revealed that even simple questions are frequently answered incorrectly when posed in varying forms, highlighting significant gaps in models' reliability. Notably, models like GPT-4o tended to overestimate their mathematical abilities, while ChatGPT-4o demonstrated better decision-making and performance through effective tool usage. We evaluated eight state-of-the-art large language models (LLMs) using DIA-Bench, showing that current models struggle with complex tasks and often display unexpectedly low confidence, even with simpler questions. The DIA framework sets a new standard for assessing not only problem-solving but also a model's adaptive intelligence and ability to assess its own limitations. The dataset is publicly available on our project's website.
A Semi-decentralized and Variational-Equilibrium-Based Trajectory Planner for Connected and Autonomous Vehicles
This paper designs a novel trajectory planning approach to resolve the computational efficiency and safety problems in uncoordinated methods by exploiting vehicle-to-everything (V2X) technology. The trajectory planning for connected and autonomous vehicles (CAVs) is formulated as a game with coupled safety constraints. We then define interaction-fair trajectories and prove that they correspond to the variational equilibrium (VE) of this game. We propose a semi-decentralized planner for the vehicles to seek VE-based fair trajectories, which can significantly improve computational efficiency through parallel computing among CAVs and enhance the safety of planned trajectories by ensuring equilibrium concordance among CAVs. Finally, experimental results show the advantages of the approach, including fast computation speed, high scalability, equilibrium concordance, and safety.
Mitigating Side Effects in Multi-Agent Systems Using Blame Assignment
When independently trained or designed robots are deployed in a shared environment, their combined actions can lead to unintended negative side effects (NSEs). To ensure safe and efficient operation, robots must optimize task performance while minimizing the penalties associated with NSEs, balancing individual objectives with collective impact. We model the problem of mitigating NSEs in a cooperative multi-agent system as a bi-objective lexicographic decentralized Markov decision process. We assume independence of transitions and rewards with respect to the robots' tasks, but the joint NSE penalty creates a form of dependence in this setting. To improve scalability, the joint NSE penalty is decomposed into individual penalties for each robot using credit assignment, which facilitates decentralized policy computation. We empirically demonstrate, using mobile robots and in simulation, the effectiveness and scalability of our approach in mitigating NSEs.
comment: 8 pages, 5 figures
Systems and Control (CS)
A Global Coordinate-Free Approach to Invariant Contraction on Homogeneous Manifolds
In this work, we provide a global condition for contraction with respect to an invariant Riemannian metric on reductive homogeneous spaces. Using left-invariant frames, vector fields on the manifold are horizontally lifted to the ambient Lie group, where the Levi-Civita connection is globally characterized as a real matrix multiplication. By linearizing in these left-invariant frames, we characterize contraction using matrix measures on real square matrices, avoiding the use of local charts. Applying this global condition, we provide a necessary condition for a prescribed subset of the manifold to possibly admit a contracting system with respect to an invariant metric. Applied to the sphere, this condition implies that no closed hemisphere can be contained in a contraction region. Finally, we apply our results to compute reachable sets for an attitude control problem.
A Distributed Primal-Dual Method for Constrained Multi-agent Reinforcement Learning with General Parameterization
This paper proposes a novel distributed approach for solving a cooperative Constrained Multi-agent Reinforcement Learning (CMARL) problem, where agents seek to minimize a global objective function subject to shared constraints. Unlike existing methods that rely on centralized training or coordination, our approach enables fully decentralized online learning, with each agent maintaining local estimates of both primal and dual variables. Specifically, we develop a distributed primal-dual algorithm based on actor-critic methods, leveraging local information to estimate Lagrangian multipliers. We establish consensus among the Lagrangian multipliers across agents and prove the convergence of our algorithm to an equilibrium point, analyzing the sub-optimality of this equilibrium compared to the exact solution of the unparameterized problem. Furthermore, we introduce a constrained cooperative Cournot game with stochastic dynamics as a test environment to evaluate the algorithm's performance in complex, real-world scenarios.
Integrated Design and Control of a Robotic Arm on a Quadcopter for Enhanced Package Delivery
This paper presents a comprehensive design process for the integration of a robotic arm into a quadcopter, emphasizing the physical modeling, system integration, and controller development. Utilizing SolidWorks for mechanical design and MATLAB Simscape for simulation and control, this study addresses the challenges encountered in integrating the robotic arm with the drone, encompassing both mechanical and control aspects. Two types of controllers are developed and analyzed: a Proportional-Integral-Derivative (PID) controller and a Model Reference Adaptive Controller (MRAC). The design and tuning of these controllers are key components of this research, with the focus on their application in package delivery tasks. Extensive simulations demonstrate the performance of each controller, with PID controllers exhibiting superior trajectory tracking and lower Root Mean Square (RMS) errors under various payload conditions. The results underscore the efficacy of PID control for stable flight and precise maneuvering, while highlighting the adaptability of MRAC to changing dynamics.
TRIZ Method for Urban Building Energy Optimization: GWO-SARIMA-LSTM Forecasting model
As global climate change advances and sustainable development goals gain urgency, urban building energy consumption optimization and carbon emission reduction have become a focus of research. Traditional energy consumption prediction methods often lack accuracy and adaptability due to their inability to fully consider complex energy consumption patterns, especially in dealing with seasonal fluctuations and dynamic changes. This study proposes a hybrid deep learning model that combines TRIZ innovation theory with GWO, SARIMA and LSTM to improve the accuracy of building energy consumption prediction. TRIZ plays a key role in model design, providing innovative solutions to achieve an effective balance between energy efficiency, cost and comfort by systematically analyzing the contradictions in energy consumption optimization. GWO is used to optimize the parameters of the model to ensure that the model maintains high accuracy under different conditions. The SARIMA model focuses on capturing seasonal trends in the data, while the LSTM model handles short-term and long-term dependencies in the data, further improving the accuracy of the prediction. The main contribution of this research is the development of a robust model that leverages the strengths of TRIZ and advanced deep learning techniques, improving the accuracy of energy consumption predictions. Our experiments demonstrate a significant 15% reduction in prediction error compared to existing models. This innovative approach not only enhances urban energy management but also provides a new framework for optimizing energy use and reducing carbon emissions, contributing to sustainable development.
comment: 29 pages
Multi-class within-day dynamic traffic equilibrium with strategic travel time information
Most research on within-day dynamic traffic equilibrium with information provision implicitly considers travel time information, often assuming information to be perfect or imperfect based on travelers' perception error. However, the lack of an explicit formulation of information limits analysis of its impact on dynamic traffic equilibrium and of the potential benefits of leveraging information provision to improve system-level performance. To address this gap, this paper proposes a within-day dynamic traffic equilibrium model that explicitly formulates strategic information provision as an endogenous element. In the proposed framework, two classes of travelers receive different types of travel time information: one class receives instantaneous travel time reflecting the prevailing traffic conditions, while the other class receives strategic forecasts of travel times, generated by accounting for travelers' reactions to instantaneous information based on strategic thinking from behavioral game theory. The resulting multi-class within-day dynamic equilibrium differs from existing models by explicitly modeling information provision and considering information consistency. The inherent dynamics of real-time updated traffic information, traffic conditions, and travelers' choice behaviors are analytically modeled, with the resulting dynamic equilibrium formulated as a fixed-point problem. The theoretical propositions and numerical findings offer rich insights into the impact of information on the traffic network, strategic forecast information penetration, the relationship between the proposed equilibrium and traditional dynamic traffic equilibria, and information accuracy. This research contributes to a deeper understanding of the interplay between information and traffic dynamics, paving the way for more effective traffic management strategies.
comment: 41 pages, 21 figures
Advancing Gasoline Consumption Forecasting: A Novel Hybrid Model Integrating Transformers, LSTM, and CNN
Iran, endowed with abundant hydrocarbon resources, plays a crucial role in the global energy landscape. Gasoline, as a critical fuel, significantly supports the nation's transportation sector. Accurate forecasting of gasoline consumption is essential for strategic resource management and environmental planning. This research introduces a novel approach to predicting monthly gasoline consumption using a hybrid Transformer-LSTM-CNN model, which integrates the strengths of Transformer networks, Long Short-Term Memory (LSTM) networks, and Convolutional Neural Networks (CNN). This advanced architecture offers a superior alternative to conventional methods such as artificial neural networks and regression models by capturing both short- and long-term dependencies in time series data. By leveraging the self-attention mechanism of Transformers, the temporal memory of LSTMs, and the local pattern detection of CNNs, our hybrid model delivers improved prediction accuracy. Implemented using Python, the model provides precise future gasoline consumption forecasts and evaluates the environmental impact through the analysis of greenhouse gas emissions. This study examines gasoline consumption trends from 2007 to 2021, which rose from 64.5 million liters per day in 2007 to 99.80 million liters per day in 2021. Our proposed model forecasts consumption levels up to 2031, offering a valuable tool for policymakers and energy analysts. The results highlight the superiority of this hybrid model in improving the accuracy of gasoline consumption forecasts, reinforcing the need for advanced machine learning techniques to optimize resource management and mitigate environmental risks in the energy sector.
How many autonomous vehicles are required to stabilize traffic flow?
The collective behavior of human-driven vehicles (HVs) produces the well-known stop-and-go waves, potentially leading to higher fuel consumption and emissions. This paper investigates the stabilization of traffic flow via a minimum number of autonomous vehicles (AVs) subject to constraints on the control parameters, aiming to reduce the number of vehicles on the road while achieving lower fuel consumption and emissions. The unconstrained scenario has been well studied in the recent literature. The main motivation to investigate the constrained scenario is that, in realistic engineering applications, lower and upper bounds exist on the control parameters. For the constrained scenario, we find the minimum number of required AVs (via computing the optimal lower bound on the AV penetration rate) to stabilize traffic flow for a given number of HVs. As an immediate consequence, we conclude that for a given number of AVs, the number of HVs in the stabilized traffic flow may not be arbitrarily large in the constrained scenario, unlike the unconstrained scenario studied in the literature. We propose a systematic procedure to compute the optimal lower bound on the AV penetration rate using nonlinear optimization techniques. Finally, we validate the theoretical results via numerical simulations. Numerical simulations suggest that enlarging the constraint intervals makes a smaller optimal lower bound on the AV penetration rate attainable. However, it leads to a slower transient response due to a dominant pole closer to the origin.
Safety-Critical Formation Control of Non-Holonomic Multi-Robot Systems in Communication-Limited Environments
This paper presents a novel estimator-based safety-critical controller for formation control of non-holonomic mobile robots in communication-limited environments. The proposed decentralized framework integrates a robust state estimator with a formation tracking control law, addressing the challenges of inter-agent collision avoidance and disturbance attenuation in leader-follower formations using control barrier functions. The estimator's design accounts for both constant and time-varying velocity profiles, enhancing the system's adaptability to dynamic scenarios. A closed-form solution for the tracking controller facilitates efficient implementation while maintaining formation integrity. The incorporation of string stability metrics further reinforces the framework's resilience against propagating disturbances from predecessors. Rigorous stability analysis using Lyapunov functions ensures the stability of estimation errors and the convergence of the formation to desired configurations. The effectiveness and robustness of the proposed approach are validated through numerical simulations of various maneuvers and realistic Gazebo experiments involving formations in a warehouse environment. The results demonstrate the controller's ability to maintain safety, achieve precise formation control, and mitigate disturbances in scenarios without inter-robot communication.
comment: Under review
Distributed Error-Identification and Correction using Block-Sparse Optimization
The conventional solutions for fault-detection, identification, and reconstruction (FDIR) require centralized decision-making mechanisms which are typically combinatorial in their nature, necessitating the design of an efficient distributed FDIR mechanism that is suitable for multi-agent applications. To this end, we develop a general framework for efficiently reconstructing a sparse vector being observed over a sensor network via nonlinear measurements. The proposed framework is used to design a distributed multi-agent FDIR algorithm based on a combination of the sequential convex programming (SCP) and the alternating direction method of multipliers (ADMM) optimization approaches. The proposed distributed FDIR algorithm can process a variety of inter-agent measurements (including distances, bearings, relative velocities, and subtended angles between agents) to identify the faulty agents and recover their true states. The effectiveness of the proposed distributed multi-agent FDIR approach is demonstrated by considering a numerical example in which the inter-agent distances are used to identify the faulty agents in a multi-agent configuration, as well as reconstruct their error vectors.
Gain-Only Neural Operators for PDE Backstepping
In this work we advance the recently introduced deep learning-powered approach to PDE backstepping control by proposing a method that approximates only the control gain function -- a function of one variable -- instead of the entire kernel function of the backstepping transformation, which depends on two variables. This idea is introduced using several benchmark unstable PDEs, including hyperbolic and parabolic types, and extended to $2 \times 2$ hyperbolic systems. By employing a backstepping transformation that utilizes the exact kernel (suitable for gain scheduling) rather than an approximated one (suitable for adaptive control), we alter the quantification of the approximation error. This leads to a significant simplification in the target system, shifting the perturbation due to approximation from the domain to the boundary condition. Despite the notable differences in the Lyapunov analysis, we are able to retain stability guarantees with this simplified approximation approach. Approximating only the control gain function simplifies the operator being approximated and the training of its neural approximation, potentially reducing the neural network size. The trade-off for these simplifications is a more intricate Lyapunov analysis, involving higher Sobolev spaces for some PDEs, and certain restrictions on initial conditions arising from these spaces. It is crucial to carefully consider the specific requirements and constraints of each problem to determine the most suitable approach; indeed, recent works have demonstrated successful applications of both full-kernel and gain-only approaches in adaptive control and gain scheduling contexts.
comment: Preprint submitted to CAM
A Control-Recoverable Added-Noise-based Privacy Scheme for LQ Control in Networked Control Systems
As networked control systems continue to evolve, ensuring the privacy of sensitive data becomes an increasingly pressing concern, especially in situations where the controller is physically separated from the plant. In this paper, we propose a secure control scheme for computing linear quadratic control in a networked control system utilizing two networked controllers, a privacy encoder and a control restorer. Specifically, the encoder generates two state signals blurred with random noise and sends them to the controllers, while the restorer reconstructs the correct control signal. The proposed design effectively preserves the privacy of the control system's state without sacrificing the control performance. We theoretically quantify the privacy-preserving performance in terms of the state estimation error of the controllers and the disclosure probability. Moreover, we extend the proposed privacy-preserving scheme and evaluation method to cases where collusion between two controllers occurs. Finally, we verify the validity of our proposed scheme through simulations.
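The abstract does not spell out how the encoder and restorer are built, so the following is only a toy sketch of one way a "two blurred signals, one restored control" idea can work for a linear feedback law. The antisymmetric noise (`x + n` and `x - n`), the averaging restorer, and the names `encode`, `controller`, `restore`, and the gain `K` are all assumptions made for illustration, not the paper's design.

```python
import random


def encode(x, rng, sigma=1.0):
    """Return two noisy copies of the state x, with opposite noise (assumed scheme)."""
    noise = [rng.gauss(0.0, sigma) for _ in x]
    x1 = [xi + ni for xi, ni in zip(x, noise)]
    x2 = [xi - ni for xi, ni in zip(x, noise)]
    return x1, x2


def controller(K, x_blurred):
    """Linear state feedback u = -K x, computed on a blurred state."""
    return [-sum(k * xi for k, xi in zip(row, x_blurred)) for row in K]


def restore(u1, u2):
    """Average the two controller outputs; the opposite noise terms cancel."""
    return [(a + b) / 2 for a, b in zip(u1, u2)]


rng = random.Random(0)
x = [1.0, -2.0]          # true plant state (hypothetical values)
K = [[0.5, 1.5]]         # hypothetical LQ gain
x1, x2 = encode(x, rng)
u = restore(controller(K, x1), controller(K, x2))
u_exact = controller(K, x)
```

Because the feedback law is linear, the restored `u` matches `u_exact` up to rounding even though neither controller sees the true state. In this toy version two colluding controllers could recover `x` by averaging their inputs, which is the kind of case the abstract says the authors treat separately.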
Sparse Mamba: Reinforcing Controllability In Structural State Space Models
In this work, we introduce the concepts of controllability and observability to the Mamba SSM architecture in our Sparse-Mamba (S-Mamba) for natural language processing (NLP) applications. Recent structured state space models (SSMs), such as Mamba and Mamba2, have outperformed transformers and large language models at small to medium scale while addressing their computational inefficiency. The Mamba SSM architecture drops the need for the attention layers and multilayer perceptron blocks used in transformers. However, current Mamba models lack reinforcement of controllability in state-space equations for computing the $A$, $B$, $C$, and $D$ matrices at each time step, leading to increased complexity and computational costs. In this paper, we demonstrate a reduction of parameters in comparison to the first published Mamba and Mamba2. We showcase an improvement in perplexity by 5\% and a decrease in training time by 3\% after reinforcing controllability and observability on the original Mamba architecture in our proposed S-Mamba. The controllable $n \times n$ state matrix $A$ is sparse and it has only $n$ free parameters. Our novel approach will ensure a controllable system which will be the gate key for Mamba3.
Timed Discrete-Event Systems are Synchronous Product Structures
Timed discrete-event systems (TDES), a modelling formalism proposed by Brandin and Wonham, can be used for modelling scheduling and production planning problems. This paper aims to show that TDES are essentially synchronous product structures. The proof is constructive in the sense that a generalized synchronous product rule is provided to generate a TDES from the activity automaton and the timer automata (that is, the syntactic description of the TDES) after some model transformation. We then also explain how the generalized synchronous product operation can be reduced to the standard synchronous product operation and how to reduce the number of (refined) events introduced in the model transformation. Thus, any software that can compute synchronous products can be used to compute a TDES from its activity automaton and its timer automata, after the model transformation.
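As background for the standard synchronous product mentioned in the abstract, here is a minimal sketch of that operation on two deterministic automata. It shows only the textbook operation (shared events move both components together, private events move one), not the paper's generalized rule or its model transformation. The dictionary layout and the two toy automata are assumptions made for illustration.

```python
from collections import deque


def sync_product(a1, a2):
    """Reachable part of the standard synchronous product of two automata.

    Each automaton is a dict with 'alphabet' (set of events), 'delta'
    (dict mapping (state, event) -> next state) and 'init' (initial state).
    """
    shared = a1["alphabet"] & a2["alphabet"]
    alphabet = a1["alphabet"] | a2["alphabet"]
    init = (a1["init"], a2["init"])
    delta, seen, queue = {}, {init}, deque([init])
    while queue:
        q1, q2 = queue.popleft()
        for e in sorted(alphabet):
            n1 = a1["delta"].get((q1, e))
            n2 = a2["delta"].get((q2, e))
            if e in shared:
                # A shared event fires only if both components allow it.
                if n1 is None or n2 is None:
                    continue
                nxt = (n1, n2)
            elif e in a1["alphabet"]:
                if n1 is None:
                    continue
                nxt = (n1, q2)
            else:
                if n2 is None:
                    continue
                nxt = (q1, n2)
            delta[((q1, q2), e)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {"alphabet": alphabet, "delta": delta, "init": init, "states": seen}


# Two toy components that share the event "tick" (hypothetical example).
A1 = {"alphabet": {"a", "tick"}, "delta": {(0, "a"): 1, (1, "tick"): 0}, "init": 0}
A2 = {"alphabet": {"b", "tick"}, "delta": {(0, "b"): 1, (1, "tick"): 0}, "init": 0}
P = sync_product(A1, A2)
```

In the toy example, `tick` is enabled only in the product state `(1, 1)`, because that is the only state where both components can take it.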
Systems and Control (EESS)
A Global Coordinate-Free Approach to Invariant Contraction on Homogeneous Manifolds
In this work, we provide a global condition for contraction with respect to an invariant Riemannian metric on reductive homogeneous spaces. Using left-invariant frames, vector fields on the manifold are horizontally lifted to the ambient Lie group, where the Levi-Civita connection is globally characterized as a real matrix multiplication. By linearizing in these left-invariant frames, we characterize contraction using matrix measures on real square matrices, avoiding the use of local charts. Applying this global condition, we provide a necessary condition for a prescribed subset of the manifold to possibly admit a contracting system with respect to an invariant metric. Applied to the sphere, this condition implies that no closed hemisphere can be contained in a contraction region. Finally, we apply our results to compute reachable sets for an attitude control problem.
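To make the notion of a matrix measure concrete, the sketch below computes the measure induced by the infinity norm, for which a closed form exists: the largest row sum of the diagonal entry plus the absolute values of the off-diagonal entries. A constant linear system $\dot{x} = Ax$ is contracting in that norm when the measure is negative. The choice of the infinity norm and the example matrices are assumptions made for simplicity; the paper works with an invariant Riemannian metric, so its measure need not be this one.

```python
def mu_inf(A):
    """Matrix measure (logarithmic norm) induced by the infinity norm.

    mu_inf(A) = max_i ( a_ii + sum_{j != i} |a_ij| )
    """
    n = len(A)
    return max(
        A[i][i] + sum(abs(A[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )


# Hypothetical example matrices.
A_contracting = [[-3.0, 1.0], [2.0, -4.0]]   # both row sums equal -2
A_inconclusive = [[-1.0, 2.0], [0.0, -1.0]]  # first row sum equals +1

m1 = mu_inf(A_contracting)
m2 = mu_inf(A_inconclusive)
```

A non-negative value, as for the second matrix, only means the test is inconclusive in this particular norm; the system may still be contracting with respect to a different norm or metric.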
A Distributed Primal-Dual Method for Constrained Multi-agent Reinforcement Learning with General Parameterization
This paper proposes a novel distributed approach for solving a cooperative Constrained Multi-agent Reinforcement Learning (CMARL) problem, where agents seek to minimize a global objective function subject to shared constraints. Unlike existing methods that rely on centralized training or coordination, our approach enables fully decentralized online learning, with each agent maintaining local estimates of both primal and dual variables. Specifically, we develop a distributed primal-dual algorithm based on actor-critic methods, leveraging local information to estimate Lagrangian multipliers. We establish consensus among the Lagrangian multipliers across agents and prove the convergence of our algorithm to an equilibrium point, analyzing the sub-optimality of this equilibrium compared to the exact solution of the unparameterized problem. Furthermore, we introduce a constrained cooperative Cournot game with stochastic dynamics as a test environment to evaluate the algorithm's performance in complex, real-world scenarios.
Integrated Design and Control of a Robotic Arm on a Quadcopter for Enhanced Package Delivery
This paper presents a comprehensive design process for the integration of a robotic arm into a quadcopter, emphasizing the physical modeling, system integration, and controller development. Utilizing SolidWorks for mechanical design and MATLAB Simscape for simulation and control, this study addresses the challenges encountered in integrating the robotic arm with the drone, encompassing both mechanical and control aspects. Two types of controllers are developed and analyzed: a Proportional-Integral-Derivative (PID) controller and a Model Reference Adaptive Controller (MRAC). The design and tuning of these controllers are key components of this research, with the focus on their application in package delivery tasks. Extensive simulations demonstrate the performance of each controller, with PID controllers exhibiting superior trajectory tracking and lower Root Mean Square (RMS) errors under various payload conditions. The results underscore the efficacy of PID control for stable flight and precise maneuvering, while highlighting adaptability of MRAC to changing dynamics.
TRIZ Method for Urban Building Energy Optimization: GWO-SARIMA-LSTM Forecasting model
With the advancement of global climate change and sustainable development goals, urban building energy consumption optimization and carbon emission reduction have become the focus of research. Traditional energy consumption prediction methods often lack accuracy and adaptability due to their inability to fully consider complex energy consumption patterns, especially in dealing with seasonal fluctuations and dynamic changes. This study proposes a hybrid deep learning model that combines TRIZ innovation theory with GWO, SARIMA and LSTM to improve the accuracy of building energy consumption prediction. TRIZ plays a key role in model design, providing innovative solutions to achieve an effective balance between energy efficiency, cost and comfort by systematically analyzing the contradictions in energy consumption optimization. GWO is used to optimize the parameters of the model to ensure that the model maintains high accuracy under different conditions. The SARIMA model focuses on capturing seasonal trends in the data, while the LSTM model handles short-term and long-term dependencies in the data, further improving the accuracy of the prediction. The main contribution of this research is the development of a robust model that leverages the strengths of TRIZ and advanced deep learning techniques, improving the accuracy of energy consumption predictions. Our experiments demonstrate a significant 15% reduction in prediction error compared to existing models. This innovative approach not only enhances urban energy management but also provides a new framework for optimizing energy use and reducing carbon emissions, contributing to sustainable development.
comment: 29 pages
Multi-class within-day dynamic traffic equilibrium with strategic travel time information
Most research on within-day dynamic traffic equilibrium with information provision implicitly considers travel time information, often assuming information to be perfect or imperfect based on travelers' perception error. However, lacking explicit formulation of information limits insightful analysis of information impact on dynamic traffic equilibrium and the potential benefits of leveraging information provision to improve system-level performance. To address this gap, this paper proposes a within-day dynamic traffic equilibrium model that explicitly formulates strategic information provision as an endogenous element. In the proposed framework, two classes of travelers receive different types of travel time information: one class receives instantaneous travel time reflecting the prevailing traffic conditions, while the other class receives strategic forecasts of travel times, generated by accounting for travelers' reactions to instantaneous information based on strategic thinking from behavioral game theory. The resulting multi-class within-day dynamic equilibrium differs from existing models by explicitly modeling information provision and consideration of information consistency. The inherent dynamics of real-time updated traffic information, traffic conditions, and travelers' choice behaviors are analytically modeled, with the resulting dynamic equilibrium formulated as a fixed-point problem. The theoretical propositions and numerical findings offer rich insights into the impact of information on the traffic network, strategic forecast information penetration, the relationship between the proposed equilibrium and traditional dynamic traffic equilibria, and information accuracy. This research contributes to a deeper understanding of the interplay between information and traffic dynamics, paving the way for more effective traffic management strategies.
comment: 41 pages, 21 figures
Advancing Gasoline Consumption Forecasting: A Novel Hybrid Model Integrating Transformers, LSTM, and CNN
Iran, endowed with abundant hydrocarbon resources, plays a crucial role in the global energy landscape. Gasoline, as a critical fuel, significantly supports the nation's transportation sector. Accurate forecasting of gasoline consumption is essential for strategic resource management and environmental planning. This research introduces a novel approach to predicting monthly gasoline consumption using a hybrid Transformer-LSTM-CNN model, which integrates the strengths of Transformer networks, Long Short-Term Memory (LSTM) networks, and Convolutional Neural Networks (CNN). This advanced architecture offers a superior alternative to conventional methods such as artificial neural networks and regression models by capturing both short- and long-term dependencies in time series data. By leveraging the self-attention mechanism of Transformers, the temporal memory of LSTMs, and the local pattern detection of CNNs, our hybrid model delivers improved prediction accuracy. Implemented using Python, the model provides precise future gasoline consumption forecasts and evaluates the environmental impact through the analysis of greenhouse gas emissions. This study examines gasoline consumption trends from 2007 to 2021, which rose from 64.5 million liters per day in 2007 to 99.80 million liters per day in 2021. Our proposed model forecasts consumption levels up to 2031, offering a valuable tool for policymakers and energy analysts. The results highlight the superiority of this hybrid model in improving the accuracy of gasoline consumption forecasts, reinforcing the need for advanced machine learning techniques to optimize resource management and mitigate environmental risks in the energy sector.
How many autonomous vehicles are required to stabilize traffic flow?
The collective behavior of human-driven vehicles (HVs) produces the well-known stop-and-go waves potentially leading to higher fuel consumption and emissions. This paper investigates the stabilization of traffic flow via a minimum number of autonomous vehicles (AVs) subject to constraints on the control parameters aiming to reduce the number of vehicles on the road while achieving lower fuel consumption and emissions. The unconstrained scenario has been well-studied in recent studies. The main motivation to investigate the constrained scenario is that, in realistic engineering applications, lower and upper bounds exist on the control parameters. For the constrained scenario, we optimally find the minimum number of required AVs (via computing the optimal lower bound on the AV penetration rate) to stabilize traffic flow for a given number of HVs. As an immediate consequence, we conclude that for a given number of AVs, the number of HVs in the stabilized traffic flow may not be arbitrarily large in the constrained scenario unlike the unconstrained scenario studied in the literature. We systematically propose a procedure to compute the optimal lower bound on the AV penetration rate using nonlinear optimization techniques. Finally, we validate the theoretical results via numerical simulations. Numerical simulations suggest that enlarging the constraint intervals makes a smaller optimal lower bound on the AV penetration rate attainable. However, it leads to a slower transient response due to a dominant pole closer to the origin.
Safety-Critical Formation Control of Non-Holonomic Multi-Robot Systems in Communication-Limited Environments
This paper presents a novel estimator-based safety-critical controller for formation control of non-holonomic mobile robots in communication-limited environments. The proposed decentralized framework integrates a robust state estimator with a formation tracking control law, addressing the challenges of inter-agent collision avoidance and disturbance attenuation in leader-follower formations using control barrier functions. The estimator's design accounts for both constant and time-varying velocity profiles, enhancing the system's adaptability to dynamic scenarios. A closed-form solution for the tracking controller facilitates efficient implementation while maintaining formation integrity. The incorporation of string stability metrics further reinforces the framework's resilience against propagating disturbances from predecessors. Rigorous stability analysis using Lyapunov functions ensures the stability of estimation errors and the convergence of the formation to desired configurations. The effectiveness and robustness of the proposed approach are validated through numerical simulations of various maneuvers and realistic Gazebo experiments involving formations in a warehouse environment. The results demonstrate the controller's ability to maintain safety, achieve precise formation control, and mitigate disturbances in scenarios without inter-robot communication.
comment: Under review
Distributed Error-Identification and Correction using Block-Sparse Optimization
The conventional solutions for fault-detection, identification, and reconstruction (FDIR) require centralized decision-making mechanisms which are typically combinatorial in their nature, necessitating the design of an efficient distributed FDIR mechanism that is suitable for multi-agent applications. To this end, we develop a general framework for efficiently reconstructing a sparse vector being observed over a sensor network via nonlinear measurements. The proposed framework is used to design a distributed multi-agent FDIR algorithm based on a combination of the sequential convex programming (SCP) and the alternating direction method of multipliers (ADMM) optimization approaches. The proposed distributed FDIR algorithm can process a variety of inter-agent measurements (including distances, bearings, relative velocities, and subtended angles between agents) to identify the faulty agents and recover their true states. The effectiveness of the proposed distributed multi-agent FDIR approach is demonstrated by considering a numerical example in which the inter-agent distances are used to identify the faulty agents in a multi-agent configuration, as well as reconstruct their error vectors.
Gain-Only Neural Operators for PDE Backstepping
In this work we advance the recently-introduced deep learning-powered approach to PDE backstepping control by proposing a method that approximates only the control gain function -- a function of one variable -- instead of the entire kernel function of the backstepping transformation, which depends on two variables. This idea is introduced using several benchmark unstable PDEs, including hyperbolic and parabolic types, and extended to 2X2 hyperbolic systems. By employing a backstepping transformation that utilizes the exact kernel (suitable for gain scheduling) rather than an approximated one (suitable for adaptive control), we alter the quantification of the approximation error. This leads to a significant simplification in the target system, shifting the perturbation due to approximation from the domain to the boundary condition. Despite the notable differences in the Lyapunov analysis, we are able to retain stability guarantees with this simplified approximation approach. Approximating only the control gain function simplifies the operator being approximated and the training of its neural approximation, potentially reducing the neural network size. The trade-off for these simplifications is a more intricate Lyapunov analysis, involving higher Sobolev spaces for some PDEs, and certain restrictions on initial conditions arising from these spaces. It is crucial to carefully consider the specific requirements and constraints of each problem to determine the most suitable approach; indeed, recent works have demonstrated successful applications of both full-kernel and gain-only approaches in adaptive control and gain scheduling contexts.
comment: Preprint submitted to CAM
A Control-Recoverable Added-Noise-based Privacy Scheme for LQ Control in Networked Control Systems
As networked control systems continue to evolve, ensuring the privacy of sensitive data becomes an increasingly pressing concern, especially in situations where the controller is physically separated from the plant. In this paper, we propose a secure control scheme for computing linear quadratic control in a networked control system utilizing two networked controllers, a privacy encoder and a control restorer. Specifically, the encoder generates two state signals blurred with random noise and sends them to the controllers, while the restorer reconstructs the correct control signal. The proposed design effectively preserves the privacy of the control system's state without sacrificing the control performance. We theoretically quantify the privacy-preserving performance in terms of the state estimation error of the controllers and the disclosure probability. Moreover, we extend the proposed privacy-preserving scheme and evaluation method to cases where collusion between two controllers occurs. Finally, we verify the validity of our proposed scheme through simulations.
Sparse Mamba: Reinforcing Controllability In Structural State Space Models
In this work, we introduce the concept of controllability and observability to the Mamba SSM's architecture in our Sparse-Mamba (S-Mamba) for natural language processing (NLP) applications. The structured state space model (SSM) development in recent studies, such as Mamba and Mamba2, outperformed and solved the computational inefficiency of transformers and large language models at small to medium scale. The Mamba SSMs architecture drops the need for attention layers or multilayer perception blocks in transformers. However, current Mamba models lack reinforcement of controllability in state-space equations for computing the $A$, $B$, $C$, and $D$ matrices at each time step, leading to increased complexity and computational costs. In this paper, we demonstrate a reduction of parameters in comparison to the first published Mamba and Mamba2. We showcase an improvement in perplexity by 5\% and a decrease in training time by 3\% after reinforcing controllability and observability on the original Mamba architecture in our proposed S-Mamba. The controllable $n \times n$ state matrix $A$ is sparse and it has only $n$ free parameters. Our novel approach will ensure a controllable system which will be the gate key for Mamba3.
Timed Discrete-Event Systems are Synchronous Product Structures
Timed discrete-event systems (TDES), a modelling formalism proposed by Brandin and Wonham, can be used for modelling scheduling and production planning problems. This paper aims to show that TDES are essentially synchronous product structures. The proof is constructive in the sense that a generalized synchronous product rule is provided to generate a TDES from the activity automaton and the timer automata (that is, the syntactic description of the TDES) after some model transformation. We then also explain how the generalized synchronous product operation can be reduced to the standard synchronous product operation and how to reduce the number of (refined) events introduced in the model transformation. Thus, any software that can compute synchronous products can be used to compute a TDES from its activity automaton and its timer automata, after the model transformation.
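For readers unfamiliar with the operation, the standard synchronous product of two deterministic automata can be sketched as follows. This is the textbook rule (shared events must occur jointly, private events interleave), not the generalized rule or the model transformation from the paper; the dictionary layout and the two toy automata are assumptions made for illustration.

```python
from collections import deque

def synchronous_product(g1, g2):
    """Reachable part of the standard synchronous product.

    Each automaton is a dict with 'init', 'events' (a set) and 'trans',
    a dict mapping (state, event) to the next state.
    """
    events = g1["events"] | g2["events"]
    init = (g1["init"], g2["init"])
    trans, seen, queue = {}, {init}, deque([init])
    while queue:
        q1, q2 = queue.popleft()
        for e in events:
            # A component that does not know the event stays where it is.
            n1 = g1["trans"].get((q1, e)) if e in g1["events"] else q1
            n2 = g2["trans"].get((q2, e)) if e in g2["events"] else q2
            if n1 is None or n2 is None:
                continue  # event is blocked by a component that owns it
            trans[((q1, q2), e)] = (n1, n2)
            if (n1, n2) not in seen:
                seen.add((n1, n2))
                queue.append((n1, n2))
    return {"init": init, "events": events, "states": seen, "trans": trans}

# Two toy automata sharing the event "b".
g1 = {"init": 0, "events": {"a", "b"}, "trans": {(0, "a"): 1, (1, "b"): 0}}
g2 = {"init": 0, "events": {"b", "c"}, "trans": {(0, "b"): 1, (1, "c"): 0}}
product = synchronous_product(g1, g2)
```

In the product, the shared event "b" is only enabled in state (1, 0), where both components can execute it.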
Robotics
Semantically Safe Robot Manipulation: From Semantic Scene Understanding to Motion Safeguards
Ensuring safe interactions in human-centric environments requires robots to understand and adhere to constraints recognized by humans as "common sense" (e.g., "moving a cup of water above a laptop is unsafe as the water may spill" or "rotating a cup of water is unsafe as it can lead to pouring its content"). Recent advances in computer vision and machine learning have enabled robots to acquire a semantic understanding of and reason about their operating environments. While extensive literature on safe robot decision-making exists, semantic understanding is rarely integrated into these formulations. In this work, we propose a semantic safety filter framework to certify robot inputs with respect to semantically defined constraints (e.g., unsafe spatial relationships, behaviours, and poses) and geometrically defined constraints (e.g., environment-collision and self-collision constraints). In our proposed approach, given perception inputs, we build a semantic map of the 3D environment and leverage the contextual reasoning capabilities of large language models to infer semantically unsafe conditions. These semantically unsafe conditions are then mapped to safe actions through a control barrier certification formulation. We evaluated our semantic safety filter approach in teleoperated tabletop manipulation tasks and pick-and-place tasks, demonstrating its effectiveness in incorporating semantic constraints to ensure safe robot operation beyond collision avoidance.
comment: 8 pages, 7 figures
Enhancing Robot Navigation Policies with Task-Specific Uncertainty Management
Robots performing navigation tasks in complex environments face significant challenges due to uncertainty in state estimation. Effectively managing this uncertainty is crucial, but the optimal approach varies depending on the specific details of the task: different tasks require varying levels of precision in different regions of the environment. For instance, a robot navigating a crowded space might need precise localization near obstacles but can operate effectively with less precise state estimates in open areas. This varying need for certainty in different parts of the environment, depending on the task, calls for policies that can adapt their uncertainty management strategies based on task-specific requirements. In this paper, we present a framework for integrating task-specific uncertainty requirements directly into navigation policies. We introduce Task-Specific Uncertainty Map (TSUM), which represents acceptable levels of state estimation uncertainty across different regions of the operating environment for a given task. Using TSUM, we propose Generalized Uncertainty Integration for Decision-Making and Execution (GUIDE), a policy conditioning framework that incorporates these uncertainty requirements into the robot's decision-making process. We find that conditioning policies on TSUMs provides an effective way to express task-specific uncertainty requirements and enables the robot to reason about the context-dependent value of certainty. We show how integrating GUIDE into reinforcement learning frameworks allows the agent to learn navigation policies without the need for explicit reward engineering to balance task completion and uncertainty management. We evaluate GUIDE on a variety of real-world navigation tasks and find that it demonstrates significant improvements in task completion rates compared to baselines. Evaluation videos can be found at https://guided-agents.github.io.
MeshDMP: Motion Planning on Discrete Manifolds using Dynamic Movement Primitives
An open problem in industrial automation is to reliably perform tasks requiring in-contact movements with complex workpieces, as current solutions lack the ability to seamlessly adapt to the workpiece geometry. In this paper, we propose a Learning from Demonstration approach that allows a robot manipulator to learn and generalise motions across complex surfaces by leveraging differential mathematical operators on discrete manifolds to embed information on the geometry of the workpiece extracted from triangular meshes, and extend the Dynamic Movement Primitives (DMPs) framework to generate motions on the mesh surfaces. We also propose an effective strategy to adapt the motion to different surfaces, by introducing an isometric transformation of the learned forcing term. The resulting approach, namely MeshDMP, is evaluated both in simulation and real experiments, showing promising results in typical industrial automation tasks like car surface polishing.
comment: Submitted at the 2025 IEEE International Conference on Robotics and Automation
A Cycle Ride to HDR: Semantics Aware Self-Supervised Framework for Unpaired LDR-to-HDR Image Translation
Low Dynamic Range (LDR) to High Dynamic Range (HDR) image translation is an important computer vision problem. There is a significant amount of research utilizing both conventional non-learning methods and modern data-driven approaches, focusing on using both single-exposed and multi-exposed LDR for HDR image reconstruction. However, most current state-of-the-art methods require high-quality paired {LDR,HDR} datasets for model training. In addition, there is limited literature on using unpaired datasets for this task where the model learns a mapping between domains, i.e., LDR to HDR. To address limitations of current methods, such as the paired data constraint, as well as unwanted blurring and visual artifacts in the reconstructed HDR, we propose a method that uses a modified cycle-consistent adversarial architecture and utilizes unpaired {LDR,HDR} datasets for training. The method introduces novel generators to address visual artifact removal and an encoder and loss to address semantic consistency, another under-explored topic. The method achieves state-of-the-art results across several benchmark datasets and reconstructs high-quality HDR images.
comment: Submitted to IEEE
Cutting-Edge Detection of Fatigue in Drivers: A Comparative Study of Object Detection Models
This research delves into the development of a fatigue detection system based on modern object detection algorithms, particularly YOLO (You Only Look Once) models, including YOLOv5, YOLOv6, YOLOv7, and YOLOv8. By comparing the performance of these models, we evaluate their effectiveness in real-time detection of fatigue-related behavior in drivers. The study addresses challenges like environmental variability and detection accuracy and suggests a roadmap for enhancing real-time detection. Experimental results demonstrate that YOLOv8 offers superior performance, balancing accuracy with speed. Data augmentation techniques and model optimization have been key in enhancing system adaptability to various driving conditions.
AutoFPDesigner: Automated Flight Procedure Design Based on Multi-Agent Large Language Model
Current flight procedure design methods rely heavily on a human-led design process, which not only offers little automation but also suffers from complex algorithm modelling and poor generalization. To address these challenges, this paper proposes an agent-driven flight procedure design method based on a large language model, named AutoFPDesigner, which utilizes multi-agent collaboration to complete procedure design. The method enables end-to-end automated design of performance-based navigation (PBN) procedures. In this process, the user inputs the design requirements in natural language, and AutoFPDesigner models the flight procedure design by loading the design specifications and utilizing tool libraries to complete the design. AutoFPDesigner allows users to oversee and seamlessly participate in the design process. Experimental results show that AutoFPDesigner ensures nearly 100% safety in the designed flight procedures and achieves a 75% task completion rate, with good adaptability across different design tasks. AutoFPDesigner introduces a new paradigm for flight procedure design and represents a key step towards the automation of this process. Keywords: Flight Procedure Design; Large Language Model; Performance-Based Navigation (PBN); Multi Agent;
comment: 21 pages, 18 figures, 5 tables
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation
Generalization in robotic manipulation remains a critical challenge, particularly when scaling to new environments with limited demonstrations. This paper introduces CAGE, a novel robotic manipulation policy designed to overcome these generalization barriers by integrating a causal attention mechanism. CAGE utilizes the powerful feature extraction capabilities of the vision foundation model DINOv2, combined with LoRA fine-tuning for robust environment understanding. The policy further employs a causal Perceiver for effective token compression and a diffusion-based action prediction head with attention mechanisms to enhance task-specific fine-grained conditioning. With as few as 50 demonstrations from a single training environment, CAGE achieves robust generalization across diverse visual changes in objects, backgrounds, and viewpoints. Extensive experiments validate that CAGE significantly outperforms existing state-of-the-art RGB/RGB-D approaches in various manipulation tasks, especially under large distribution shifts. In similar environments, CAGE offers an average increase of 42% in task completion rate. While all baselines fail to execute the task in unseen environments, CAGE manages to obtain a 43% completion rate and a 51% success rate on average, making a huge step towards practical deployment of robots in real-world settings. Project website: cage-policy.github.io.
MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning
Visual deep reinforcement learning (RL) enables robots to acquire skills from visual input for unstructured tasks. However, current algorithms suffer from low sample efficiency, limiting their practical applicability. In this work, we present MENTOR, a method that improves both the architecture and optimization of RL agents. Specifically, MENTOR replaces the standard multi-layer perceptron (MLP) with a mixture-of-experts (MoE) backbone, enhancing the agent's ability to handle complex tasks by leveraging modular expert learning to avoid gradient conflicts. Furthermore, MENTOR introduces a task-oriented perturbation mechanism, which heuristically samples perturbation candidates containing task-relevant information, leading to more targeted and effective optimization. MENTOR outperforms state-of-the-art methods across three simulation domains -- DeepMind Control Suite, Meta-World, and Adroit. Additionally, MENTOR achieves an average of 83% success rate on three challenging real-world robotic manipulation tasks including peg insertion, cable routing, and tabletop golf, which significantly surpasses the success rate of 32% from the current strongest model-free visual RL algorithm. These results underscore the importance of sample efficiency in advancing visual RL for real-world robotics. Experimental videos are available at https://suninghuang19.github.io/mentor_page.
AugInsert: Learning Robust Visual-Force Policies via Data Augmentation for Object Assembly Tasks
This paper primarily focuses on learning robust visual-force policies in the context of high-precision object assembly tasks. Specifically, we focus on the contact phase of the assembly task where both objects (peg and hole) have made contact and the objective lies in maneuvering the objects to complete the assembly. Moreover, we aim to learn contact-rich manipulation policies with multisensory inputs on limited expert data by expanding human demonstrations via online data augmentation. We develop a simulation environment with a dual-arm robot manipulator to evaluate the effect of augmented expert demonstration data. Our focus is on evaluating the robustness of our model with respect to certain task variations: grasp pose, peg/hole shape, object body shape, scene appearance, camera pose, and force-torque/proprioception noise. We show that our proposed data augmentation method helps in learning a multisensory manipulation policy that is robust to unseen instances of these variations, particularly physical variations such as grasp pose. Additionally, our ablative studies show the significant contribution of force-torque data to the robustness of our model. For additional experiments and qualitative results, we refer to the project webpage at https://bit.ly/47skWXH.
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Offline-to-online reinforcement learning (O2O RL) aims to obtain a continually improving policy as it interacts with the environment, while ensuring the initial behaviour is satisficing. This satisficing behaviour is necessary for robotic manipulation where random exploration can be costly due to catastrophic failures and time. O2O RL is especially compelling when we can only obtain a scarce amount of (potentially suboptimal) demonstrations, a scenario where behavioural cloning (BC) is known to suffer from distribution shift. Previous works have outlined the challenges in applying O2O RL algorithms in image-based environments. In this work, we propose a novel O2O RL algorithm that can learn in a real-life image-based robotic vacuum grasping task with a small number of demonstrations where BC fails the majority of the time. The proposed algorithm replaces the target network in off-policy actor-critic algorithms with a regularization technique inspired by the neural tangent kernel. We demonstrate that the proposed algorithm can reach above 90% success rate in under two hours of interaction time, with only 50 human demonstrations, while BC and two commonly-used RL algorithms fail to achieve similar performance.
Optimally Solving Colored Generalized Sliding-Tile Puzzles: Complexity and Bounds
The Generalized Sliding-Tile Puzzle (GSTP), allowing many square tiles on a board to move in parallel while enforcing natural geometric collision constraints on the movement of neighboring tiles, provides a high-fidelity mathematical model for many high-utility existing and future multi-robot applications, e.g., at mobile robot-based warehouses or autonomous garages. Motivated by practical relevance, this work examines a further generalization of GSTP called the Colored Generalized Sliding-Tile Puzzle (CGSP), where tiles can now assume varying degrees of distinguishability, a common occurrence in the aforementioned applications. Our study establishes the computational complexity of CGSP and its key sub-problems under a broad spectrum of possible conditions and characterizes solution makespan lower and upper bounds that differ by at most a logarithmic factor. These results are further extended to higher-dimensional versions of the puzzle game.
comment: WAFR 2024 Conference Version
Development of a Simple and Novel Digital Twin Framework for Industrial Robots in Intelligent robotics manufacturing
This paper proposes an easily replicable and novel approach for developing a Digital Twin (DT) system for industrial robots in intelligent manufacturing applications. Our framework enables effective communication via Robot Web Service (RWS), while a real-time simulation is implemented in Unity 3D and a web-based platform without any other third-party tools. The framework supports real-time visualization and control of the entire work process, as well as real-time path planning based on algorithms executed in MATLAB. Results verify the high communication efficiency with a refresh rate of only 17 ms. Furthermore, our developed web-based platform and Graphical User Interface (GUI) enable easy accessibility and user-friendliness in real-time control.
A Novel Approach to Grasping Control of Soft Robotic Grippers based on Digital Twin
This paper proposes a Digital Twin (DT) framework for real-time motion and pose control of soft robotic grippers. The developed DT is based on an industrial robot workstation, integrated with our newly proposed approach for soft gripper control, primarily based on computer vision, for setting the driving pressure for the desired gripper status in real time. Knowing the gripper motion, the gripper parameters (e.g., curvatures and bending angles) are simulated by kinematics modelling in Unity 3D, which is based on four-piecewise constant curvature kinematics. The mapping between the driving pressure and gripper parameters is achieved by implementing OpenCV-based image processing algorithms and data fitting. Results show that our DT-based approach can achieve satisfactory performance in real-time control of soft gripper manipulation, which can satisfy a wide range of industrial applications.
Cooperation and Fairness in Multi-Agent Reinforcement Learning
Multi-agent systems are trained to maximize shared cost objectives, which typically reflect system-level efficiency. However, in the resource-constrained environments of mobility and transportation systems, efficiency may be achieved at the expense of fairness -- certain agents may incur significantly greater costs or lower rewards compared to others. Tasks could be distributed inequitably, leading to some agents receiving an unfair advantage while others incur disproportionately high costs. It is important to consider the tradeoffs between efficiency and fairness. We consider the problem of fair multi-agent navigation for a group of decentralized agents using multi-agent reinforcement learning (MARL). We consider the reciprocal of the coefficient of variation of the distances traveled by different agents as a measure of fairness and investigate whether agents can learn to be fair without significantly sacrificing efficiency (i.e., increasing the total distance traveled). We find that by training agents using min-max fair distance goal assignments along with a reward term that incentivizes fairness as they move towards their goals, the agents (1) learn a fair assignment of goals and (2) achieve almost perfect goal coverage in navigation scenarios using only local observations. For goal coverage scenarios, we find that, on average, our model yields a 14% improvement in efficiency and a 5% improvement in fairness over a baseline trained using random assignments. Furthermore, an average of 21% improvement in fairness can be achieved compared to a model trained on optimally efficient assignments; this increase in fairness comes at the expense of only a 7% decrease in efficiency. Finally, we extend our method to environments in which agents must complete coverage tasks in prescribed formations and show that it is possible to do so without tailoring the models to specific formation shapes.
comment: Manuscript accepted in ACM Journal on Autonomous Transportation Systems
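The fairness measure named in the abstract above, the reciprocal of the coefficient of variation of the distances traveled, is simple to compute. The sketch below assumes the population standard deviation and returns infinity when all agents travel the same distance; both conventions are assumptions, as the abstract does not state them.

```python
import math

def fairness(distances):
    """Reciprocal of the coefficient of variation: mean / std."""
    n = len(distances)
    mean = sum(distances) / n
    # Population variance (divide by n) is an assumption of this sketch.
    var = sum((d - mean) ** 2 for d in distances) / n
    std = math.sqrt(var)
    return math.inf if std == 0 else mean / std

even = fairness([8.0, 10.0, 12.0])     # agents travel similar distances
uneven = fairness([2.0, 10.0, 18.0])   # same total, unequal split
```

Both examples have the same total distance (equal efficiency), but the more even split scores higher on fairness.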
MindArm: Mechanized Intelligent Non-Invasive Neuro-Driven Prosthetic Arm System
Currently, individuals with arm mobility impairments (referred to as "patients") face limited technological solutions due to two key challenges: (1) non-invasive prosthetic devices are often prohibitively expensive and costly to maintain, and (2) invasive solutions require high-risk, costly brain surgery, which can pose a health risk. Therefore, current technological solutions are not accessible for all patients with different financial backgrounds. Toward this, we propose a low-cost technological solution called MindArm, an affordable, non-invasive neuro-driven prosthetic arm system. MindArm employs a deep neural network (DNN) to translate brain signals, captured by low-cost surface electroencephalogram (EEG) electrodes, into prosthetic arm movements. Utilizing an Open Brain Computer Interface and UDP networking for signal processing, the system seamlessly controls arm motion. In the compute module, we run a trained DNN model to interpret filtered micro-voltage brain signals, and then translate them into a prosthetic arm action via serial communication seamlessly. Experimental results from a fully functional prototype show high accuracy across three actions, with 91% for idle/stationary, 85% for handshake, and 84% for cup pickup. The system costs approximately $500-550, including $400 for the EEG headset and $100-150 for motors, 3D printing, and assembly, offering an affordable alternative for mind-controlled prosthetic devices.
comment: 8 pages, 22 figures, Paper accepted at ICARCV 2024, funded by CAIR
Developing Path Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
In autonomous driving, end-to-end methods utilizing Imitation Learning (IL) and Reinforcement Learning (RL) are becoming more and more common. However, they do not involve explicit reasoning, as in the classic robotics workflow, or planning with horizons, resulting in implicit and myopic strategies. In this paper, we introduce a path planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) for static obstacle nudging. It outputs lateral offset values to adjust the given reference waypoints and produces a modified path for different controllers. Experimental results show that the algorithm can perform path following that mimics the expert performance of path-tracking controllers, and avoid collisions with fixed obstacles. The method makes a good attempt at planning with learning-based methods in path planning problems of autonomous driving.
comment: 6 pages, 8 figures
TieBot: Learning to Knot a Tie from Visual Demonstration through a Real-to-Sim-to-Real Approach
The tie-knotting task is highly challenging due to the tie's high deformation and long-horizon manipulation actions. This work presents TieBot, a Real-to-Sim-to-Real learning from visual demonstration system for robots to learn to knot a tie. We introduce the Hierarchical Feature Matching approach to estimate a sequence of the tie's meshes from the demonstration video. With these estimated meshes used as subgoals, we first learn a teacher policy using privileged information. Then, we learn a student policy with point cloud observation by imitating the teacher policy. Lastly, our pipeline applies the learned policy to real-world execution. We demonstrate the effectiveness of TieBot in simulation and the real world. In the real-world experiment, a dual-arm robot successfully knots a tie, achieving a 50% success rate over 10 trials. Videos can be found at https://tiebots.github.io/.
comment: Accepted by CoRL 2024 as Oral presentation, camera-ready version
Collision-Free Robot Navigation in Crowded Environments using Learning based Convex Model Predictive Control
Navigating robots safely and efficiently in crowded and complex environments remains a significant challenge. Due to the dynamic and intricate nature of these settings, planning efficient and collision-free paths for robots to track is particularly difficult. In this paper, we uniquely bridge the robot's perception, decision-making and control processes by utilizing the convex obstacle-free region computed from 2D LiDAR data. The overall pipeline is threefold: (1) We propose a robot navigation framework that utilizes deep reinforcement learning (DRL), conceptualizing the observation as the convex obstacle-free region, a departure from general reliance on raw sensor inputs. (2) We design the action space, derived from the intersection of the robot's kinematic limits and the convex region, to enable efficient sampling of inherently collision-free reference points. These actions assist in guiding the robot to move towards the goal and interact with other obstacles during navigation. (3) We employ model predictive control (MPC) to track the trajectory formed by the reference points while satisfying constraints imposed by the convex obstacle-free region and the robot's kinodynamic limits. The effectiveness of the proposed improvements has been validated through two sets of ablation studies and a comparative experiment against the Timed Elastic Band (TEB), demonstrating improved navigation performance in crowded and complex environments.
Visual Localization in 3D Maps: Comparing Point Cloud, Mesh, and NeRF Representations
Recent advances in mapping techniques have enabled the creation of highly accurate dense 3D maps during robotic missions, such as point clouds, meshes, or NeRF-based representations. These developments present new opportunities for reusing these maps for localization. However, there remains a lack of a unified approach that can operate seamlessly across different map representations. This paper presents and evaluates a global visual localization system capable of localizing a single camera image across various 3D map representations built using both visual and lidar sensing. Our system generates a database by synthesizing novel views of the scene, creating RGB and depth image pairs. Leveraging the precise 3D geometric map, our method automatically defines rendering poses, reducing the number of database images while preserving retrieval performance. To bridge the domain gap between real query camera images and synthetic database images, our approach utilizes learning-based descriptors and feature detectors. We evaluate the system's performance through extensive real-world experiments conducted in both indoor and outdoor settings, assessing the effectiveness of each map representation and demonstrating its advantages over traditional structure-from-motion (SfM) localization approaches. The results show that all three map representations can achieve consistent localization success rates of 55% and higher across various environments. NeRF synthesized images show superior performance, localizing query images at an average success rate of 72%. Furthermore, we demonstrate an advantage over SfM-based approaches: our synthesized database enables localization in the reverse travel direction, which is unseen during the mapping process. Our system, operating in real time on a mobile laptop equipped with a GPU, achieves a processing rate of 1 Hz.
Multi-Agent Reinforcement Learning for Connected and Automated Vehicles Control: Recent Advancements and Future Prospects
Connected and automated vehicles (CAVs) are considered a potential solution for future transportation challenges, aiming to develop systems that are efficient, safe, and environmentally friendly. However, CAV control presents significant challenges due to the complexity of interconnectivity and coordination required among vehicles. Multi-agent reinforcement learning (MARL), which has shown notable advancements in addressing complex problems in autonomous driving, robotics, and human-vehicle interaction, emerges as a promising tool to enhance CAV capabilities. Despite its potential, there is a notable absence of current reviews on mainstream MARL algorithms for CAVs. To fill this gap, this paper offers a comprehensive review of MARL's application in CAV control. The paper begins with an introduction to MARL, explaining its unique advantages in handling complex and multi-agent scenarios. It then presents a detailed survey of MARL applications across various control dimensions for CAVs, including critical scenarios such as platooning control, lane-changing, and unsignalized intersections. Additionally, the paper reviews prominent simulation platforms essential for developing and testing MARL algorithms. Lastly, it examines the current challenges in deploying MARL for CAV control, including macro-micro optimization, communication, mixed traffic, and sim-to-real challenges. Potential solutions discussed include hierarchical MARL, decentralized MARL, adaptive interactions, and offline MARL.
Automated Creation of Digital Cousins for Robust Policy Learning
Training robot policies in the real world can be unsafe, costly, and difficult to scale. Simulation serves as an inexpensive and potentially limitless source of training data, but suffers from the semantics and physics disparity between simulated and real-world environments. These discrepancies can be minimized by training in digital twins, which serve as virtual replicas of a real scene but are expensive to generate and cannot produce cross-domain generalization. To address these limitations, we propose the concept of digital cousins, a virtual asset or scene that, unlike a digital twin, does not explicitly model a real-world counterpart but still exhibits similar geometric and semantic affordances. As a result, digital cousins simultaneously reduce the cost of generating an analogous virtual environment while also facilitating better robustness during sim-to-real domain transfer by providing a distribution of similar training scenes. Leveraging digital cousins, we introduce a novel method for their automated creation, and propose a fully automated real-to-sim-to-real pipeline for generating fully interactive scenes and training robot policies that can be deployed zero-shot in the original scene. We find that digital cousin scenes that preserve geometric and semantic affordances can be produced automatically, and can be used to train policies that outperform policies trained on digital twins, achieving 90% vs. 25% success rates under zero-shot sim-to-real transfer. Additional details are available at https://digital-cousins.github.io/.
comment: CoRL 2024
Multiagent Systems
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning
Despite the popularity of multi-agent reinforcement learning (RL) in simulated and two-player applications, its success in messy real-world applications has been limited. A key challenge lies in its generalizability across problem variations, a common necessity for many real-world problems. Contextual reinforcement learning (CRL) formalizes learning policies that generalize across problem variations. However, the lack of standardized benchmarks for multi-agent CRL has hindered progress in the field. Such benchmarks are desired to be based on real-world applications to naturally capture the many open challenges of real-world problems that affect generalization. To bridge this gap, we propose IntersectionZoo, a comprehensive benchmark suite for multi-agent CRL through the real-world application of cooperative eco-driving in urban road networks. The task of cooperative eco-driving is to control a fleet of vehicles to reduce fleet-level vehicular emissions. By grounding IntersectionZoo in a real-world application, we naturally capture real-world problem characteristics, such as partial observability and multiple competing objectives. IntersectionZoo is built on data-informed simulations of 16,334 signalized intersections derived from 10 major US cities, modeled in an open-source industry-grade microscopic traffic simulator. By modeling factors affecting vehicular exhaust emissions (e.g., temperature, road conditions, travel demand), IntersectionZoo provides one million data-driven traffic scenarios. Using these traffic scenarios, we benchmark popular multi-agent RL and human-like driving algorithms and demonstrate that the popular multi-agent RL algorithms struggle to generalize in CRL settings.
comment: In review
DTPPO: Dual-Transformer Encoder-based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments
Existing multi-agent deep reinforcement learning (MADRL) methods for multi-UAV navigation face challenges in generalization, particularly when applied to unseen complex environments. To address these limitations, we propose a Dual-Transformer Encoder-based Proximal Policy Optimization (DTPPO) method. DTPPO enhances multi-UAV collaboration through a Spatial Transformer, which models inter-agent dynamics, and a Temporal Transformer, which captures temporal dependencies to improve generalization across diverse environments. This architecture allows UAVs to navigate new, unseen environments without retraining. Extensive simulations demonstrate that DTPPO outperforms current MADRL methods in terms of transferability, obstacle avoidance, and navigation efficiency across environments with varying obstacle densities. The results confirm DTPPO's effectiveness as a robust solution for multi-UAV navigation in both known and unseen scenarios.
Simulation-Based Optimistic Policy Iteration For Multi-Agent MDPs with Kullback-Leibler Control Cost
This paper proposes an agent-based optimistic policy iteration (OPI) scheme for learning stationary optimal stochastic policies in multi-agent Markov Decision Processes (MDPs), in which agents incur a Kullback-Leibler (KL) divergence cost for their control efforts and an additional cost for the joint state. The proposed scheme consists of a greedy policy improvement step followed by an m-step temporal difference (TD) policy evaluation step. We use the separable structure of the instantaneous cost to show that the policy improvement step follows a Boltzmann distribution that depends on the current value function estimate and the uncontrolled transition probabilities. This allows agents to compute the improved joint policy independently. We show that both the synchronous (entire state space evaluation) and asynchronous (a uniformly sampled set of substates) versions of the OPI scheme with finite policy evaluation rollout converge to the optimal value function and an optimal joint policy asymptotically. Simulation results on a multi-agent MDP with KL control cost variant of the Stag-Hare game validate our scheme's performance in terms of minimizing the cost return.
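The Boltzmann-form policy improvement step mentioned in this abstract can be illustrated with a minimal single-agent sketch. It assumes the standard KL-control form, in which the improved policy is proportional to the uncontrolled transition probability times exp(-V) of the next state; the paper's exact multi-agent expression may differ, and the function name `improved_policy` and the dictionary layout are hypothetical.

```python
import math


def improved_policy(uncontrolled_row, value):
    """Greedy improvement for one state under a KL control cost (sketch).

    uncontrolled_row: dict mapping next state -> uncontrolled transition probability
    value: dict mapping state -> current value function estimate

    Assumed form (standard KL-control result, not taken from the paper):
        pi(x' | x) is proportional to p0(x' | x) * exp(-V(x'))
    """
    weights = {
        nxt: prob * math.exp(-value[nxt])
        for nxt, prob in uncontrolled_row.items()
    }
    normalizer = sum(weights.values())
    return {nxt: w / normalizer for nxt, w in weights.items()}


# Two equally likely next states; state "b" has the higher estimated cost-to-go,
# so the improved policy shifts probability mass toward "a".
policy = improved_policy({"a": 0.5, "b": 0.5}, {"a": 0.0, "b": math.log(3)})
```

Because each row only needs the current value estimate and the uncontrolled transition probabilities, the update can be evaluated state by state without a search over actions.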
Collaborative State Fusion in Partially Known Multi-agent Environments
In this paper, we study the collaborative state fusion problem in a multi-agent environment, where mobile agents collaborate to track movable targets. Due to the limited sensing range and potential errors of on-board sensors, it is necessary to aggregate individual observations to provide target state fusion for better target state estimation. Existing schemes do not perform well due to (1) the impractical assumption of a fully known prior target state-space model and (2) observation outliers from individual sensors. To address these issues, we propose a two-stage collaborative fusion framework, namely Learnable Weighted Robust Fusion (LoF). LoF combines a local state estimator (e.g., Kalman Filter) with a learnable weight generator to address the mismatch between the prior state-space model and underlying patterns of moving targets. Moreover, given observation outliers, we develop a time-series soft medoid (TSM) scheme to perform robust fusion. We evaluate LoF in a collaborative detection simulation environment with promising results. In an example setting with 4 agents and 2 targets, LoF leads to a 9.1% higher fusion gain compared to the state-of-the-art.
Optimally Solving Colored Generalized Sliding-Tile Puzzles: Complexity and Bounds
The Generalized Sliding-Tile Puzzle (GSTP), which allows many square tiles on a board to move in parallel while enforcing natural geometric collision constraints on the movement of neighboring tiles, provides a high-fidelity mathematical model for many high-utility existing and future multi-robot applications, e.g., at mobile robot-based warehouses or autonomous garages. Motivated by practical relevance, this work examines a further generalization of GSTP called the Colored Generalized Sliding-Tile Puzzle (CGSP), where tiles can now assume varying degrees of distinguishability, a common occurrence in the aforementioned applications. Our study establishes the computational complexity of CGSP and its key sub-problems under a broad spectrum of possible conditions and characterizes solution makespan lower and upper bounds that differ by at most a logarithmic factor. These results are further extended to higher-dimensional versions of the puzzle game.
comment: WAFR 2024 Conference Version
Cooperation and Fairness in Multi-Agent Reinforcement Learning
Multi-agent systems are trained to maximize shared cost objectives, which typically reflect system-level efficiency. However, in the resource-constrained environments of mobility and transportation systems, efficiency may be achieved at the expense of fairness -- certain agents may incur significantly greater costs or lower rewards compared to others. Tasks could be distributed inequitably, leading to some agents receiving an unfair advantage while others incur disproportionately high costs. It is important to consider the tradeoffs between efficiency and fairness. We consider the problem of fair multi-agent navigation for a group of decentralized agents using multi-agent reinforcement learning (MARL). We consider the reciprocal of the coefficient of variation of the distances traveled by different agents as a measure of fairness and investigate whether agents can learn to be fair without significantly sacrificing efficiency (i.e., increasing the total distance traveled). We find that by training agents using min-max fair distance goal assignments along with a reward term that incentivizes fairness as they move towards their goals, the agents (1) learn a fair assignment of goals and (2) achieve almost perfect goal coverage in navigation scenarios using only local observations. For goal coverage scenarios, we find that, on average, our model yields a 14% improvement in efficiency and a 5% improvement in fairness over a baseline trained using random assignments. Furthermore, an average of 21% improvement in fairness can be achieved compared to a model trained on optimally efficient assignments; this increase in fairness comes at the expense of only a 7% decrease in efficiency. Finally, we extend our method to environments in which agents must complete coverage tasks in prescribed formations and show that it is possible to do so without tailoring the models to specific formation shapes.
comment: Manuscript accepted in ACM Journal on Autonomous Transportation Systems
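The fairness measure described in the abstract above, the reciprocal of the coefficient of variation of the distances traveled, can be sketched as follows. The function name `fairness` is hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator is used.

```python
import math
import statistics


def fairness(distances):
    """Reciprocal of the coefficient of variation: mean / standard deviation.

    Larger values mean the agents travel more similar distances.
    Assumption: population standard deviation (statistics.pstdev).
    """
    mean = statistics.fmean(distances)
    std = statistics.pstdev(distances)
    # Identical distances give zero spread, treated here as perfectly fair.
    return math.inf if std == 0 else mean / std


uneven = fairness([2.0, 4.0])      # mean 3, std 1
even = fairness([5.0, 5.0, 5.0])   # no spread at all
```

Under this definition, lowering the total distance (efficiency) and lowering the spread (fairness) are separate objectives, which is the tradeoff the abstract investigates.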
Multi-Agent Reinforcement Learning for Connected and Automated Vehicles Control: Recent Advancements and Future Prospects
Connected and automated vehicles (CAVs) are considered a potential solution for future transportation challenges, aiming to develop systems that are efficient, safe, and environmentally friendly. However, CAV control presents significant challenges due to the complexity of interconnectivity and coordination required among vehicles. Multi-agent reinforcement learning (MARL), which has shown notable advancements in addressing complex problems in autonomous driving, robotics, and human-vehicle interaction, emerges as a promising tool to enhance CAV capabilities. Despite its potential, there is a notable absence of current reviews on mainstream MARL algorithms for CAVs. To fill this gap, this paper offers a comprehensive review of MARL's application in CAV control. The paper begins with an introduction to MARL, explaining its unique advantages in handling complex and multi-agent scenarios. It then presents a detailed survey of MARL applications across various control dimensions for CAVs, including critical scenarios such as platooning control, lane-changing, and unsignalized intersections. Additionally, the paper reviews prominent simulation platforms essential for developing and testing MARL algorithms. Lastly, it examines the current challenges in deploying MARL for CAV control, including macro-micro optimization, communication, mixed traffic, and sim-to-real challenges. Potential solutions discussed include hierarchical MARL, decentralized MARL, adaptive interactions, and offline MARL.
Systems and Control (CS)
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning
Despite the popularity of multi-agent reinforcement learning (RL) in simulated and two-player applications, its success in messy real-world applications has been limited. A key challenge lies in its generalizability across problem variations, a common necessity for many real-world problems. Contextual reinforcement learning (CRL) formalizes learning policies that generalize across problem variations. However, the lack of standardized benchmarks for multi-agent CRL has hindered progress in the field. Such benchmarks are desired to be based on real-world applications to naturally capture the many open challenges of real-world problems that affect generalization. To bridge this gap, we propose IntersectionZoo, a comprehensive benchmark suite for multi-agent CRL through the real-world application of cooperative eco-driving in urban road networks. The task of cooperative eco-driving is to control a fleet of vehicles to reduce fleet-level vehicular emissions. By grounding IntersectionZoo in a real-world application, we naturally capture real-world problem characteristics, such as partial observability and multiple competing objectives. IntersectionZoo is built on data-informed simulations of 16,334 signalized intersections derived from 10 major US cities, modeled in an open-source industry-grade microscopic traffic simulator. By modeling factors affecting vehicular exhaust emissions (e.g., temperature, road conditions, travel demand), IntersectionZoo provides one million data-driven traffic scenarios. Using these traffic scenarios, we benchmark popular multi-agent RL and human-like driving algorithms and demonstrate that the popular multi-agent RL algorithms struggle to generalize in CRL settings.
comment: In review
Relay Incentive Mechanisms Using Wireless Power Transfer in Non-Cooperative Networks
This paper studies the use of a multi-attribute auction in a communication system to bring about efficient relaying in a non-cooperative setting. We consider a system where a source seeks to offload data to an access point (AP) while balancing both the timeliness and energy-efficiency of the transmission. A deep fade in the communication channel (due to, e.g., a line-of-sight blockage) makes direct communication costly, and the source may alternatively rely on non-cooperative UEs to act as relays. We propose a multi-attribute auction to select a UE and to determine the duration and power of the transmission, with payments to the UE taking the form of energy sent via wireless power transfer (WPT). The quality of the channel from a UE to the AP constitutes private information, and bids consist of a transmission time and transmission power. We show that under a second-preferred-offer auction, truthful bidding by all candidate UEs forms a Nash Equilibrium. However, this auction is not incentive compatible, and we present a modified auction in which truthful bidding is in fact a dominant strategy. Extensive numerical experimentation illustrates the efficacy of our approach, which we compare to a cooperative baseline. We demonstrate that with as few as two candidates, our improved mechanism leads to as much as a 76% reduction in energy consumption, and that with as few as three candidates, the transmission time decreases by as much as 55%. Further, we see that as the number of candidates increases, the performance of our mechanism approaches that of the cooperative baseline. Overall, our findings highlight the potential of multi-attribute auctions to enhance the efficiency of data transfer in non-cooperative settings.
Enhancing Robot Navigation Policies with Task-Specific Uncertainty Management
Robots performing navigation tasks in complex environments face significant challenges due to uncertainty in state estimation. Effectively managing this uncertainty is crucial, but the optimal approach varies depending on the specific details of the task: different tasks require varying levels of precision in different regions of the environment. For instance, a robot navigating a crowded space might need precise localization near obstacles but can operate effectively with less precise state estimates in open areas. This varying need for certainty in different parts of the environment, depending on the task, calls for policies that can adapt their uncertainty management strategies based on task-specific requirements. In this paper, we present a framework for integrating task-specific uncertainty requirements directly into navigation policies. We introduce Task-Specific Uncertainty Map (TSUM), which represents acceptable levels of state estimation uncertainty across different regions of the operating environment for a given task. Using TSUM, we propose Generalized Uncertainty Integration for Decision-Making and Execution (GUIDE), a policy conditioning framework that incorporates these uncertainty requirements into the robot's decision-making process. We find that conditioning policies on TSUMs provides an effective way to express task-specific uncertainty requirements and enables the robot to reason about the context-dependent value of certainty. We show how integrating GUIDE into reinforcement learning frameworks allows the agent to learn navigation policies without the need for explicit reward engineering to balance task completion and uncertainty management. We evaluate GUIDE on a variety of real-world navigation tasks and find that it demonstrates significant improvements in task completion rates compared to baselines. Evaluation videos can be found at https://guided-agents.github.io.
Simulation-Based Optimistic Policy Iteration For Multi-Agent MDPs with Kullback-Leibler Control Cost
This paper proposes an agent-based optimistic policy iteration (OPI) scheme for learning stationary optimal stochastic policies in multi-agent Markov Decision Processes (MDPs), in which agents incur a Kullback-Leibler (KL) divergence cost for their control efforts and an additional cost for the joint state. The proposed scheme consists of a greedy policy improvement step followed by an m-step temporal difference (TD) policy evaluation step. We use the separable structure of the instantaneous cost to show that the policy improvement step follows a Boltzmann distribution that depends on the current value function estimate and the uncontrolled transition probabilities. This allows agents to compute the improved joint policy independently. We show that both the synchronous (entire state space evaluation) and asynchronous (a uniformly sampled set of substates) versions of the OPI scheme with finite policy evaluation rollout converge to the optimal value function and an optimal joint policy asymptotically. Simulation results on a multi-agent MDP with KL control cost variant of the Stag-Hare game validate our scheme's performance in terms of minimizing the cost return.
A Comparative Analysis of Nigeria's Power Sector with and without Grid-Scale Storage: Future Implications for Emission and Renewable Energy Integration
This research proposes a framework for modeling and comparing two electricity scenarios for Nigeria by 2050, focusing on the inclusion and exclusion of electricity storage technologies. A Central Composite Design (CCD) was used to generate a design matrix for data collection, with EnergyPLAN software used to create energy system simulations on the CCD data for four outputs: total annual cost, CO2 emissions, critical excess electricity production (CEEP), and electricity import. Three machine learning algorithms, support vector regression (SVR), extreme gradient boosting (XGBoost), and multi-layer perceptron (MLP), were tuned using Bayesian optimization to develop models mapping the inputs to outputs. A genetic algorithm was employed for multi-objective optimization to determine the optimal input capacities that minimize the outputs. Results indicated that incorporating electricity storage technologies (EST) leads to a 37% increase in renewable electricity sources (RES) share, resulting in a 19.14% reduction in CO2 emissions. EST such as battery energy storage systems (BESS), pumped hydro storage (PHS), and vehicle-to-grid (V2G) storage allow for the storage of the critical excess electricity that comes with increasing RES share. Integrating EST in Nigeria's 2050 energy landscape is crucial for incorporating more renewable electricity sources into the energy system, thereby reducing CO2 emissions and managing excess electricity production. This study outlines a plan for optimal electricity production to meet Nigeria's 2050 demand, highlighting the need for a balanced approach that combines fossil fuels, renewable energy, nuclear power, and advanced storage solutions to achieve a sustainable and efficient electricity system.
comment: 41 Pages
Numerical optimal control for distributed delay differential equations: A simultaneous approach based on linearization of the delayed variables
Time delays are ubiquitous in industrial processes, and they must be accounted for when designing control algorithms because they have a significant effect on the process dynamics. Therefore, in this work, we propose a simultaneous approach for numerical optimal control of delay differential equations with distributed time delays. Specifically, we linearize the delayed variables around the current time, and we discretize the resulting implicit differential equations using Euler's implicit method. Furthermore, we transcribe the infinite-dimensional optimal control problem into a finite-dimensional nonlinear program, which we solve using Matlab's fmincon. Finally, we demonstrate the efficacy of the approach using a numerical example involving a molten salt nuclear fission reactor.
comment: 6 pages, 3 figures, 1 table
Design and Implementation of Hedge Algebra Controller using Recursive Semantic Values for Cart-pole System
This paper presents a novel approach to designing a Hedge Algebra Controller named Hedge Algebra Controller with Recursive Semantic Values (RS-HAC). This approach incorporates several newly introduced concepts, including Semantically Quantifying Simplified Mapping (SQSM) featuring a recursive algorithm, Infinite General Semantization (IGS), and Infinite General De-semantization (IGDS). These innovations aim to enhance the optimizability, scalability, and flexibility of hedge algebra theory, allowing the design of a hedge algebra-based controller to be carried out more efficiently and straightforwardly. The approach is applied to stabilizing an inverted pendulum on a cart to illustrate its superiority. Comparisons are made between RS-HAC and a fuzzy controller of Takagi-Sugeno type (FC), as well as a linear quadratic regulator (LQR). The results indicate that the RS-HAC surpasses the FC by up to 400% in control efficiency and is marginally better than the LQR regarding transient time in balancing an inverted pendulum on a cart.
EDRF: Enhanced Driving Risk Field Based on Multimodal Trajectory Prediction and Its Applications
Driving risk assessment is crucial for both autonomous vehicles and human-driven vehicles. The driving risk can be quantified as the product of the probability that an event (such as collision) will occur and the consequence of that event. However, the probability of events occurring is often difficult to predict due to the uncertainty of drivers' or vehicles' behavior. Traditional methods generally employ kinematic-based approaches to predict the future trajectories of entities, which often yield unrealistic prediction results. In this paper, the Enhanced Driving Risk Field (EDRF) model is proposed, integrating deep learning-based multimodal trajectory prediction results with Gaussian distribution models to quantitatively capture the uncertainty of traffic entities' behavior. The applications of the EDRF are also proposed. It is applied across various tasks (traffic risk monitoring, ego-vehicle risk analysis, and motion and trajectory planning) through the defined concept Interaction Risk (IR). Adequate example scenarios are provided for each application to illustrate the effectiveness of the model.
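The definition of risk used in this abstract, the probability of an event times its consequence, can be illustrated with a small sketch that spreads each predicted trajectory mode over space with a Gaussian. This is only an illustration of the idea, not the EDRF formulation: the isotropic 2D Gaussian, the mode tuple layout, and the names `gaussian_2d` and `risk_at` are all assumptions.

```python
import math


def gaussian_2d(x, y, mean_x, mean_y, sigma):
    """Isotropic 2D Gaussian density evaluated at (x, y)."""
    squared_distance = (x - mean_x) ** 2 + (y - mean_y) ** 2
    return math.exp(-squared_distance / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


def risk_at(point, modes, consequence):
    """Risk at a query point: occupancy density times event consequence.

    modes: list of (mode_probability, (mean_x, mean_y), sigma), one entry per
           predicted trajectory mode at a fixed future time (hypothetical layout).
    consequence: scalar severity of the event, e.g. a collision.
    """
    x, y = point
    density = sum(
        prob * gaussian_2d(x, y, mean_x, mean_y, sigma)
        for prob, (mean_x, mean_y), sigma in modes
    )
    return density * consequence


# Two predicted modes for another vehicle: go straight (70%) or turn (30%).
modes = [(0.7, (10.0, 0.0), 1.0), (0.3, (8.0, 3.0), 1.5)]
near = risk_at((10.0, 0.5), modes, consequence=1.0)
far = risk_at((0.0, 10.0), modes, consequence=1.0)
```

Weighting each mode by its predicted probability is what lets a multimodal predictor contribute to the field: an unlikely maneuver still raises the risk along its path, but less than the most likely one.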
Optimizing Individualized Incentives from Grid Measurements and Limited Knowledge of Agent Behavior
As electrical generation becomes more distributed and volatile, and loads become more uncertain, controllability of distributed energy resources (DERs), regardless of their ownership status, will be necessary for grid reliability. Grid operators lack direct control over end-users' grid interactions, such as energy usage, but incentives can influence behavior -- for example, an end-user that receives a grid-driven incentive may adjust their consumption or expose relevant control variables in response. A key challenge in studying such incentives is the lack of data about human behavior, which usually motivates strong assumptions, such as distributional assumptions on compliance or rational utility-maximization. In this paper, we propose a general incentive mechanism in the form of a constrained optimization problem -- our approach is distinguished from prior work by modeling human behavior (e.g., reactions to an incentive) as an arbitrary unknown function. We propose feedback-based optimization algorithms to solve this problem that each leverage different amounts of information and/or measurements. We show that each converges to an asymptotically stable incentive with (near)-optimality guarantees given mild assumptions on the problem. Finally, we evaluate our proposed techniques in voltage regulation simulations on standard test beds. We test a variety of settings, including those that break assumptions required for theoretical convergence (e.g., convexity, smoothness) to capture realistic settings. In this evaluation, our proposed algorithms are able to find near-optimal incentives even when the reaction to an incentive is modeled by a theoretically difficult (yet realistic) function.
comment: 28 pages, 10 figures
Development of a Simple and Novel Digital Twin Framework for Industrial Robots in Intelligent robotics manufacturing
This paper proposes an easily replicable and novel approach for developing a Digital Twin (DT) system for industrial robots in intelligent manufacturing applications. Our framework enables effective communication via Robot Web Service (RWS), while a real-time simulation is implemented in Unity 3D and a web-based platform without any other third-party tools. The framework supports real-time visualization and control of the entire work process, as well as real-time path planning based on algorithms executed in MATLAB. Results verify the high communication efficiency, with a refresh interval of only 17 ms. Furthermore, our web-based platform and Graphical User Interface (GUI) make real-time control easily accessible and user-friendly.
A Novel Approach to Grasping Control of Soft Robotic Grippers based on Digital Twin
This paper proposes a Digital Twin (DT) framework for real-time motion and pose control of soft robotic grippers. The developed DT is based on an industrial robot workstation, integrated with our newly proposed approach for soft gripper control, primarily based on computer vision, for setting the driving pressure for the desired gripper status in real time. Knowing the gripper motion, the gripper parameters (e.g., curvatures and bending angles) are simulated by kinematics modelling in Unity 3D, which is based on four-piecewise constant curvature kinematics. The mapping between the driving pressure and gripper parameters is achieved by implementing OpenCV-based image processing algorithms and data fitting. Results show that our DT-based approach can achieve satisfactory performance in real-time control of soft gripper manipulation, which can satisfy a wide range of industrial applications.
Integrating solid direct air capture systems with green hydrogen production: Economic synergy of sector coupling
In the global pursuit of sustainable energy solutions, mitigating carbon dioxide (CO2) emissions stands as a pivotal challenge. With escalating atmospheric CO2 levels, the imperative of direct air capture (DAC) systems becomes evident. Simultaneously, green hydrogen (GH) emerges as a pivotal medium for renewable energy. Nevertheless, the substantial expenses associated with these technologies impede widespread adoption, primarily due to significant installation costs and underutilized operational advantages when deployed independently. Integration through sector coupling enhances system efficiency and sustainability, while shared power sources and energy storage devices offer additional economic benefits. In this study, we assess the economic viability of polymer electrolyte membrane electrolyzers versus alkaline electrolyzers within the context of sector coupling. Our findings indicate that combining GH production with solid DAC systems yields significant economic advantages, with approximately a 10% improvement for PEM electrolyzers and a 20% enhancement for alkaline electrolyzers. These results highlight a substantial opportunity to improve the efficiency and economic viability of renewable energy and green hydrogen initiatives, thereby facilitating the broader adoption of cleaner technologies.
comment: We have corrected the errors from the previous version of the manuscript and uploaded the updated version
Developing Path Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
In autonomous driving, end-to-end methods utilizing Imitation Learning (IL) and Reinforcement Learning (RL) are becoming more and more common. However, unlike classic robotics workflows, they do not involve explicit reasoning or planning over horizons, which makes the resulting strategies implicit and myopic. In this paper, we introduce a path planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) for static obstacle nudging. It outputs lateral offset values to adjust the given reference waypoints and produces a modified path for different controllers. Experimental results show that the algorithm can follow paths in a way that mimics the expert performance of path-tracking controllers and can avoid collisions with fixed obstacles. The method is a promising attempt at learning-based planning for path planning problems in autonomous driving.
comment: 6 pages, 8 figures
TieBot: Learning to Knot a Tie from Visual Demonstration through a Real-to-Sim-to-Real Approach
The tie-knotting task is highly challenging due to the tie's high deformation and long-horizon manipulation actions. This work presents TieBot, a Real-to-Sim-to-Real learning-from-visual-demonstration system that enables robots to learn to knot a tie. We introduce the Hierarchical Feature Matching approach to estimate a sequence of tie meshes from the demonstration video. With these estimated meshes used as subgoals, we first learn a teacher policy using privileged information. Then, we learn a student policy with point cloud observations by imitating the teacher policy. Lastly, our pipeline applies the learned policy to real-world execution. We demonstrate the effectiveness of TieBot in simulation and the real world. In the real-world experiment, a dual-arm robot successfully knots a tie, achieving a 50% success rate over 10 trials. Videos can be found at https://tiebots.github.io/.
comment: Accepted by CoRL 2024 as Oral presentation, camera-ready version
Numerical optimal control for delay differential equations: A simultaneous approach based on linearization of the delayed state
Time delays are ubiquitous in industry, and they must be accounted for when designing control strategies. However, numerical optimal control (NOC) of delay differential equations (DDEs) is challenging because it requires specialized discretization methods and the time delays may depend on the manipulated inputs or state variables. Therefore, in this work, we propose to linearize the delayed states around the current time. This results in a set of implicit differential equations, and we compare the steady states and the corresponding stability criteria of the DDEs and the approximate system. Furthermore, we propose a simultaneous approach for NOC of DDEs based on the linearization, and we discretize the approximate system using Euler's implicit method. Finally, we present a numerical example involving a molten salt nuclear fission reactor.
comment: 6 pages, 4 figures, submitted to a conference
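The linearization idea in the abstract above can be sketched on a scalar test problem. For x'(t) = -a x(t - tau), linearizing the delayed state around the current time gives x(t - tau) ≈ x(t) - tau x'(t), which turns the DDE into the implicit differential equation x'(t) = -a (x(t) - tau x'(t)). The test equation, the function name `simulate_linearized_dde`, and all parameter values are assumptions chosen for illustration and are not taken from the paper.

```python
def simulate_linearized_dde(a, tau, x0, step, num_steps):
    """Implicit Euler on the linearized form of x'(t) = -a * x(t - tau).

    Linearization: x(t - tau) ~ x(t) - tau * x'(t), so
        x'(t) = -a * (x(t) - tau * x'(t))
    which, for a * tau < 1, rearranges to x'(t) = -rate * x(t)
    with rate = a / (1 - a * tau).
    """
    if a * tau >= 1.0:
        raise ValueError("this sketch requires a * tau < 1")
    rate = a / (1.0 - a * tau)
    states = [x0]
    for _ in range(num_steps):
        # Implicit Euler: x_next = x_prev + step * (-rate * x_next),
        # solved in closed form because the test problem is linear.
        states.append(states[-1] / (1.0 + step * rate))
    return states


trajectory = simulate_linearized_dde(a=1.0, tau=0.2, x0=1.0, step=0.1, num_steps=50)
```

For a nonlinear right-hand side the implicit Euler update would need a nonlinear solve at each step, which is the form that fits naturally into a simultaneous transcription where the states at all time steps are decision variables.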
SustainDC: Benchmarking for Sustainable Data Center Control NeurIPS 2024
Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for the effects of each other. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements. Our results highlight significant opportunities for improvement of data center operations using MARL algorithms. Given the increasing use of DC due to AI, SustainDC provides a crucial platform for the development and benchmarking of advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
comment: Under review at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
Latency-Aware Resource Allocation for Mobile Edge Generation and Computing via Deep Reinforcement Learning
Recently, the integration of mobile edge computing (MEC) and generative artificial intelligence (GAI) technology has given rise to a new area called mobile edge generation and computing (MEGC), which offers mobile users heterogeneous services such as task computing and content generation. In this letter, we investigate the joint communication, computation, and AI-generated content (AIGC) resource allocation problem in an MEGC system. A latency minimization problem is first formulated to enhance the quality of service for mobile users. Due to the strong coupling of the optimization variables, we propose a new deep reinforcement learning-based algorithm to solve it efficiently. Numerical results demonstrate that the proposed algorithm can achieve lower latency than two baseline algorithms.
comment: 5 pages, 6 figures. This paper has been accepted for publication by IEEE Networking Letters
Modeling Nonlinear Control Systems via Koopman Control Family: Universal Forms and Subspace Invariance Proximity
This paper introduces the Koopman Control Family (KCF), a mathematical framework for modeling general (not necessarily control-affine) discrete-time nonlinear control systems with the aim of providing a solid theoretical foundation for the use of Koopman-based methods in systems with inputs. We demonstrate that the concept of KCF captures the behavior of nonlinear control systems on a (potentially infinite-dimensional) function space. By employing a generalized notion of subspace invariance under the KCF, we establish a universal form for finite-dimensional models, which encompasses the commonly used linear, bilinear, and linear switched models as specific instances. In cases where the subspace is not invariant under the KCF, we propose a method for approximating models in general form and characterize the model's accuracy using the concept of invariance proximity. We end by discussing how the proposed framework naturally lends itself to data-driven modeling of control systems.
comment: 18 pages
Systems and Control (EESS)
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning
Despite the popularity of multi-agent reinforcement learning (RL) in simulated and two-player applications, its success in messy real-world applications has been limited. A key challenge lies in its generalizability across problem variations, a common necessity for many real-world problems. Contextual reinforcement learning (CRL) formalizes learning policies that generalize across problem variations. However, the lack of standardized benchmarks for multi-agent CRL has hindered progress in the field. Such benchmarks are desired to be based on real-world applications to naturally capture the many open challenges of real-world problems that affect generalization. To bridge this gap, we propose IntersectionZoo, a comprehensive benchmark suite for multi-agent CRL through the real-world application of cooperative eco-driving in urban road networks. The task of cooperative eco-driving is to control a fleet of vehicles to reduce fleet-level vehicular emissions. By grounding IntersectionZoo in a real-world application, we naturally capture real-world problem characteristics, such as partial observability and multiple competing objectives. IntersectionZoo is built on data-informed simulations of 16,334 signalized intersections derived from 10 major US cities, modeled in an open-source industry-grade microscopic traffic simulator. By modeling factors affecting vehicular exhaust emissions (e.g., temperature, road conditions, travel demand), IntersectionZoo provides one million data-driven traffic scenarios. Using these traffic scenarios, we benchmark popular multi-agent RL and human-like driving algorithms and demonstrate that the popular multi-agent RL algorithms struggle to generalize in CRL settings.
comment: In review
Relay Incentive Mechanisms Using Wireless Power Transfer in Non-Cooperative Networks
This paper studies the use of a multi-attribute auction in a communication system to bring about efficient relaying in a non-cooperative setting. We consider a system where a source seeks to offload data to an access point (AP) while balancing both the timeliness and energy-efficiency of the transmission. A deep fade in the communication channel (due to, e.g., a line-of-sight blockage) makes direct communication costly, and the source may alternatively rely on non-cooperative UEs to act as relays. We propose a multi-attribute auction to select a UE and to determine the duration and power of the transmission, with payments to the UE taking the form of energy sent via wireless power transfer (WPT). The quality of the channel from a UE to the AP constitutes private information, and bids consist of a transmission time and transmission power. We show that under a second-preferred-offer auction, truthful bidding by all candidate UEs forms a Nash Equilibrium. However, this auction is not incentive compatible, and we present a modified auction in which truthful bidding is in fact a dominant strategy. Extensive numerical experimentation illustrates the efficacy of our approach, which we compare to a cooperative baseline. We demonstrate that with as few as two candidates, our improved mechanism leads to as much as a 76% reduction in energy consumption, and that with as few as three candidates, the transmission time decreases by as much as 55%. Further, we see that as the number of candidates increases, the performance of our mechanism approaches that of the cooperative baseline. Overall, our findings highlight the potential of multi-attribute auctions to enhance the efficiency of data transfer in non-cooperative settings.
Enhancing Robot Navigation Policies with Task-Specific Uncertainty Management
Robots performing navigation tasks in complex environments face significant challenges due to uncertainty in state estimation. Effectively managing this uncertainty is crucial, but the optimal approach varies depending on the specific details of the task: different tasks require varying levels of precision in different regions of the environment. For instance, a robot navigating a crowded space might need precise localization near obstacles but can operate effectively with less precise state estimates in open areas. This varying need for certainty in different parts of the environment, depending on the task, calls for policies that can adapt their uncertainty management strategies based on task-specific requirements. In this paper, we present a framework for integrating task-specific uncertainty requirements directly into navigation policies. We introduce Task-Specific Uncertainty Map (TSUM), which represents acceptable levels of state estimation uncertainty across different regions of the operating environment for a given task. Using TSUM, we propose Generalized Uncertainty Integration for Decision-Making and Execution (GUIDE), a policy conditioning framework that incorporates these uncertainty requirements into the robot's decision-making process. We find that conditioning policies on TSUMs provides an effective way to express task-specific uncertainty requirements and enables the robot to reason about the context-dependent value of certainty. We show how integrating GUIDE into reinforcement learning frameworks allows the agent to learn navigation policies without the need for explicit reward engineering to balance task completion and uncertainty management. We evaluate GUIDE on a variety of real-world navigation tasks and find that it demonstrates significant improvements in task completion rates compared to baselines. Evaluation videos can be found at https://guided-agents.github.io.
Simulation-Based Optimistic Policy Iteration For Multi-Agent MDPs with Kullback-Leibler Control Cost
This paper proposes an agent-based optimistic policy iteration (OPI) scheme for learning stationary optimal stochastic policies in multi-agent Markov Decision Processes (MDPs), in which agents incur a Kullback-Leibler (KL) divergence cost for their control efforts and an additional cost for the joint state. The proposed scheme consists of a greedy policy improvement step followed by an m-step temporal difference (TD) policy evaluation step. We use the separable structure of the instantaneous cost to show that the policy improvement step follows a Boltzmann distribution that depends on the current value function estimate and the uncontrolled transition probabilities. This allows agents to compute the improved joint policy independently. We show that both the synchronous (entire state space evaluation) and asynchronous (a uniformly sampled set of substates) versions of the OPI scheme with finite policy evaluation rollout converge to the optimal value function and an optimal joint policy asymptotically. Simulation results on a multi-agent MDP with KL control cost variant of the Stag-Hare game validate our scheme's performance in terms of minimizing the cost return.
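The Boltzmann form of the improvement step can be illustrated with a minimal sketch. It assumes a single agent, no discounting, and a cost equal to a state cost plus the KL divergence from the uncontrolled dynamics, in which case the greedy policy is proportional to the uncontrolled transition probability times `exp(-value)`. The function name and the example numbers are hypothetical, and the paper's multi-agent scheme is not reproduced here.

```python
import math

def boltzmann_improvement(p_row, value):
    """Greedy policy for one state under a KL control cost (illustrative sketch).

    p_row[j] : uncontrolled probability of moving to state j
    value[j] : current value (cost-to-go) estimate of state j
    Returns pi with pi[j] proportional to p_row[j] * exp(-value[j]).
    """
    # Shift by the smallest value so exp() cannot overflow; the shift cancels in the ratio.
    v_min = min(value)
    weights = [p * math.exp(-(v - v_min)) for p, v in zip(p_row, value)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical three-state example: state 1 has the lowest cost-to-go.
p_row = [0.5, 0.3, 0.2]
value = [2.0, 0.0, 1.0]
pi = boltzmann_improvement(p_row, value)
```

In this example the improved policy shifts probability mass toward the low-value state while staying close to the uncontrolled dynamics. With a constant value estimate it reduces to `p_row`.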
A Comparative Analysis of Nigeria's Power Sector with and without Grid-Scale Storage: Future Implications for Emission and Renewable Energy Integration
This research proposes a framework for modeling and comparing two electricity scenarios for Nigeria by 2050, focusing on the inclusion and exclusion of electricity storage technologies. A Central Composite Design (CCD) was used to generate a design matrix for data collection, with EnergyPLAN software used to create energy system simulations on the CCD data for four outputs: total annual cost, CO2 emissions, critical excess electricity production (CEEP), and electricity import. Three machine learning algorithms, support vector regression (SVR), extreme gradient boosting (XGBoost), and multi-layer perceptron (MLP), were tuned using Bayesian optimization to develop models mapping the inputs to outputs. A genetic algorithm was employed for multi-objective optimization to determine the optimal input capacities that minimize the outputs. Results indicated that incorporating electricity storage technologies (EST) leads to a 37% increase in renewable electricity sources (RES) share, resulting in a 19.14% reduction in CO2 emissions. EST such as battery energy storage systems (BESS), pumped hydro storage (PHS), and vehicle-to-grid (V2G) storage allow for the storage of the critical excess electricity that comes with increasing RES share. Integrating EST in Nigeria's 2050 energy landscape is crucial for incorporating more renewable electricity sources into the energy system, thereby reducing CO2 emissions and managing excess electricity production. This study outlines a plan for optimal electricity production to meet Nigeria's 2050 demand, highlighting the need for a balanced approach that combines fossil fuels, renewable energy, nuclear power, and advanced storage solutions to achieve a sustainable and efficient electricity system.
comment: 41 pages
Numerical optimal control for distributed delay differential equations: A simultaneous approach based on linearization of the delayed variables
Time delays are ubiquitous in industrial processes, and they must be accounted for when designing control algorithms because they have a significant effect on the process dynamics. Therefore, in this work, we propose a simultaneous approach for numerical optimal control of delay differential equations with distributed time delays. Specifically, we linearize the delayed variables around the current time, and we discretize the resulting implicit differential equations using Euler's implicit method. Furthermore, we transcribe the infinite-dimensional optimal control problem into a finite-dimensional nonlinear program, which we solve using Matlab's fmincon. Finally, we demonstrate the efficacy of the approach using a numerical example involving a molten salt nuclear fission reactor.
comment: 6 pages, 3 figures, 1 table
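The linearization step can be sketched as follows, using assumed notation that is not taken from the paper: a state $x$, a distributed-delay variable $z$ with a normalized kernel $\alpha$ of finite mean $\bar{\tau}$, dynamics $\dot{x} = f(x, z, u)$, and a step size $\Delta t$ with the input held constant over each step.

```latex
\begin{align}
  z(t) &= \int_0^\infty \alpha(s)\, x(t - s)\, \mathrm{d}s,
  \qquad \int_0^\infty \alpha(s)\, \mathrm{d}s = 1, \\
  x(t - s) &\approx x(t) - s\, \dot{x}(t)
  \quad\Longrightarrow\quad
  z(t) \approx x(t) - \bar{\tau}\, \dot{x}(t),
  \qquad \bar{\tau} = \int_0^\infty s\, \alpha(s)\, \mathrm{d}s, \\
  \frac{x_{k+1} - x_k}{\Delta t}
  &= f\!\left(x_{k+1},\; x_{k+1} - \bar{\tau}\, \frac{x_{k+1} - x_k}{\Delta t},\; u_k\right).
\end{align}
```

Under these assumptions the delay enters only through the mean $\bar{\tau}$ of the kernel, and the last line is Euler's implicit method applied to the resulting implicit differential equation. Stacking it over all steps $k$ gives equality constraints of the kind a simultaneous approach passes to a nonlinear programming solver.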
Design and Implementation of Hedge Algebra Controller using Recursive Semantic Values for Cart-pole System
This paper presents a novel approach to designing a Hedge Algebra Controller named Hedge Algebra Controller with Recursive Semantic Values (RS-HAC). This approach incorporates several newly introduced concepts, including Semantically Quantifying Simplified Mapping (SQSM) featuring a recursive algorithm, Infinite General Semantization (IGS), and Infinite General De-semantization (IGDS). These innovations aim to enhance the optimizability, scalability, and flexibility of hedge algebra theory, allowing a hedge algebra-based controller to be designed more efficiently and straightforwardly. The approach is applied to stabilizing an inverted pendulum on a cart to illustrate its advantages. Comparisons are made between RS-HAC and a fuzzy controller of Takagi-Sugeno type (FC), as well as a linear quadratic regulator (LQR). The results indicate that the RS-HAC surpasses the FC by up to 400% in control efficiency and is marginally better than the LQR regarding transient time in balancing an inverted pendulum on a cart.
EDRF: Enhanced Driving Risk Field Based on Multimodal Trajectory Prediction and Its Applications
Driving risk assessment is crucial for both autonomous vehicles and human-driven vehicles. The driving risk can be quantified as the product of the probability that an event (such as collision) will occur and the consequence of that event. However, the probability of events occurring is often difficult to predict due to the uncertainty of drivers' or vehicles' behavior. Traditional methods generally employ kinematic-based approaches to predict the future trajectories of entities, which often yield unrealistic prediction results. In this paper, the Enhanced Driving Risk Field (EDRF) model is proposed, integrating deep learning-based multimodal trajectory prediction results with Gaussian distribution models to quantitatively capture the uncertainty of traffic entities' behavior. The applications of the EDRF are also proposed. It is applied across various tasks (traffic risk monitoring, ego-vehicle risk analysis, and motion and trajectory planning) through the defined concept Interaction Risk (IR). Adequate example scenarios are provided for each application to illustrate the effectiveness of the model.
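The definition of risk as probability times consequence can be made concrete with a small sketch. It is not the EDRF formulation: it assumes isotropic 2D Gaussians around hypothetical predicted positions, one per trajectory mode, weighted by the mode probabilities, and a scalar consequence value. All names and numbers below are illustrative.

```python
import math

def gaussian_density(point, mean, sigma):
    """Isotropic 2D Gaussian density evaluated at `point`."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

def risk_at(point, modes, consequence):
    """Risk = (occupancy likelihood at `point`) x (consequence of the event).

    modes: list of (mode_probability, predicted_position, sigma) for one future time step.
    """
    likelihood = sum(w * gaussian_density(point, mean, sigma) for w, mean, sigma in modes)
    return likelihood * consequence

# Two hypothetical predicted modes for another vehicle: keep lane (0.7) or change lane (0.3).
modes = [(0.7, (10.0, 0.0), 1.0), (0.3, (8.0, 3.5), 1.5)]
near = risk_at((10.0, 0.5), modes, consequence=5.0)
far = risk_at((30.0, 0.0), modes, consequence=5.0)
```

Evaluating `risk_at` over a grid of points would give a field that is high near the likely future positions and decays with distance, which is the qualitative behavior a risk field is meant to capture.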
Optimizing Individualized Incentives from Grid Measurements and Limited Knowledge of Agent Behavior
As electrical generation becomes more distributed and volatile, and loads become more uncertain, controllability of distributed energy resources (DERs), regardless of their ownership status, will be necessary for grid reliability. Grid operators lack direct control over end-users' grid interactions, such as energy usage, but incentives can influence behavior -- for example, an end-user that receives a grid-driven incentive may adjust their consumption or expose relevant control variables in response. A key challenge in studying such incentives is the lack of data about human behavior, which usually motivates strong assumptions, such as distributional assumptions on compliance or rational utility-maximization. In this paper, we propose a general incentive mechanism in the form of a constrained optimization problem -- our approach is distinguished from prior work by modeling human behavior (e.g., reactions to an incentive) as an arbitrary unknown function. We propose feedback-based optimization algorithms to solve this problem that each leverage different amounts of information and/or measurements. We show that each converges to an asymptotically stable incentive with (near)-optimality guarantees given mild assumptions on the problem. Finally, we evaluate our proposed techniques in voltage regulation simulations on standard test beds. We test a variety of settings, including those that break assumptions required for theoretical convergence (e.g., convexity, smoothness) to capture realistic settings. In this evaluation, our proposed algorithms are able to find near-optimal incentives even when the reaction to an incentive is modeled by a theoretically difficult (yet realistic) function.
comment: 28 pages, 10 figures
Development of a Simple and Novel Digital Twin Framework for Industrial Robots in Intelligent robotics manufacturing
This paper proposes an easily replicable and novel approach for developing a Digital Twin (DT) system for industrial robots in intelligent manufacturing applications. Our framework enables effective communication via Robot Web Service (RWS), while a real-time simulation is implemented in Unity 3D and a web-based platform without any other third-party tools. The framework supports real-time visualization and control of the entire work process, as well as real-time path planning based on algorithms executed in MATLAB. Results verify the high communication efficiency, with a refresh rate of only 17 ms. Furthermore, our web-based platform and Graphical User Interface (GUI) provide easy accessibility and user-friendliness in real-time control.
A Novel Approach to Grasping Control of Soft Robotic Grippers based on Digital Twin
This paper proposes a Digital Twin (DT) framework for real-time motion and pose control of soft robotic grippers. The developed DT is based on an industrial robot workstation, integrated with our newly proposed approach for soft gripper control, primarily based on computer vision, for setting the driving pressure for the desired gripper status in real time. Knowing the gripper motion, the gripper parameters (e.g., curvatures and bending angles) are simulated by kinematics modelling in Unity 3D, which is based on four-piecewise constant curvature kinematics. The mapping between the driving pressure and the gripper parameters is obtained using OpenCV-based image processing algorithms and data fitting. Results show that our DT-based approach can achieve satisfactory performance in real-time control of soft gripper manipulation, which can satisfy a wide range of industrial applications.
Integrating solid direct air capture systems with green hydrogen production: Economic synergy of sector coupling
In the global pursuit of sustainable energy solutions, mitigating carbon dioxide (CO2) emissions stands as a pivotal challenge. With escalating atmospheric CO2 levels, the imperative of direct air capture (DAC) systems becomes evident. Simultaneously, green hydrogen (GH) emerges as a pivotal medium for renewable energy. Nevertheless, the substantial expenses associated with these technologies impede widespread adoption, primarily due to significant installation costs and underutilized operational advantages when deployed independently. Integration through sector coupling enhances system efficiency and sustainability, while shared power sources and energy storage devices offer additional economic benefits. In this study, we assess the economic viability of polymer electrolyte membrane electrolyzers versus alkaline electrolyzers within the context of sector coupling. Our findings indicate that combining GH production with solid DAC systems yields significant economic advantages, with approximately a 10% improvement for PEM electrolyzers and a 20% enhancement for alkaline electrolyzers. These results highlight a substantial opportunity to improve the efficiency and economic viability of renewable energy and green hydrogen initiatives, thereby facilitating the broader adoption of cleaner technologies.
comment: We have corrected the errors from the previous version of the manuscript and uploaded the updated version
Developing Path Planning with Behavioral Cloning and Proximal Policy Optimization for Path-Tracking and Static Obstacle Nudging
In autonomous driving, end-to-end methods utilizing Imitation Learning (IL) and Reinforcement Learning (RL) are becoming increasingly common. However, unlike classic robotics workflows, they involve neither explicit reasoning nor planning over a horizon, resulting in implicit and myopic strategies. In this paper, we introduce a path planning method that uses Behavioral Cloning (BC) for path-tracking and Proximal Policy Optimization (PPO) for static obstacle nudging. It outputs lateral offset values that adjust the given reference waypoints and produces a modified path for different controllers. Experimental results show that the algorithm can follow paths in a way that mimics the expert performance of path-tracking controllers and can avoid collisions with fixed obstacles. The method is a step toward applying learning-based planning to path planning problems in autonomous driving.
comment: 6 pages, 8 figures
TieBot: Learning to Knot a Tie from Visual Demonstration through a Real-to-Sim-to-Real Approach
The tie-knotting task is highly challenging due to the tie's high deformation and long-horizon manipulation actions. This work presents TieBot, a Real-to-Sim-to-Real learning from visual demonstration system for robots to learn to knot a tie. We introduce the Hierarchical Feature Matching approach to estimate a sequence of tie meshes from the demonstration video. With these estimated meshes used as subgoals, we first learn a teacher policy using privileged information. Then, we learn a student policy with point cloud observations by imitating the teacher policy. Lastly, our pipeline applies the learned policy to real-world execution. We demonstrate the effectiveness of TieBot in simulation and the real world. In the real-world experiment, a dual-arm robot successfully knots a tie, achieving a 50% success rate over 10 trials. Videos can be found at https://tiebots.github.io/.
comment: Accepted by CoRL 2024 as Oral presentation, camera-ready version
Numerical optimal control for delay differential equations: A simultaneous approach based on linearization of the delayed state
Time delays are ubiquitous in industry, and they must be accounted for when designing control strategies. However, numerical optimal control (NOC) of delay differential equations (DDEs) is challenging because it requires specialized discretization methods and the time delays may depend on the manipulated inputs or state variables. Therefore, in this work, we propose to linearize the delayed states around the current time. This results in a set of implicit differential equations, and we compare the steady states and the corresponding stability criteria of the DDEs and the approximate system. Furthermore, we propose a simultaneous approach for NOC of DDEs based on the linearization, and we discretize the approximate system using Euler's implicit method. Finally, we present a numerical example involving a molten salt nuclear fission reactor.
comment: 6 pages, 4 figures, submitted to a conference
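The linearize-then-discretize idea can be tried on a scalar test problem. The sketch below assumes the hypothetical DDE x'(t) = -a x(t - tau) with constant delay, replaces x(t - tau) by x(t) - tau x'(t), and applies Euler's implicit method to the resulting implicit differential equation (1 - a tau) x'(t) = -a x(t). The test equation, parameter values, and function name are illustrative and are not taken from the paper.

```python
import math

def simulate_linearized_dde(a, tau, x0, step, n_steps):
    """Implicit Euler for (1 - a*tau) * x' = -a * x, the linearized form of x'(t) = -a * x(t - tau).

    Each step solves (1 - a*tau) * (x_next - x) / step = -a * x_next for x_next.
    Assumes a * tau < 1 so that the approximate system is well posed and stable.
    """
    if a * tau >= 1.0:
        raise ValueError("this sketch assumes a * tau < 1")
    c = 1.0 - a * tau
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] * c / (c + a * step))
    return xs

a, tau = 1.0, 0.1
xs = simulate_linearized_dde(a, tau, x0=1.0, step=0.01, n_steps=100)
# Closed-form solution of the approximate (linearized) system at t = 1, for comparison.
reference = math.exp(-a / (1.0 - a * tau) * 1.0)
```

For this linear problem the implicit step has a closed form. For nonlinear dynamics each step becomes a nonlinear equation, which a simultaneous approach imposes as a constraint of the optimization problem rather than solving step by step.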
SustainDC: Benchmarking for Sustainable Data Center Control NeurIPS 2024
Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for the effects of each other. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements. Our results highlight significant opportunities for improvement of data center operations using MARL algorithms. Given the increasing use of DC due to AI, SustainDC provides a crucial platform for the development and benchmarking of advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
comment: Under review at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
Latency-Aware Resource Allocation for Mobile Edge Generation and Computing via Deep Reinforcement Learning
Recently, the integration of mobile edge computing (MEC) and generative artificial intelligence (GAI) technology has given rise to a new area called mobile edge generation and computing (MEGC), which offers mobile users heterogeneous services such as task computing and content generation. In this letter, we investigate the joint communication, computation, and the AIGC resource allocation problem in an MEGC system. A latency minimization problem is first formulated to enhance the quality of service for mobile users. Due to the strong coupling of the optimization variables, we propose a new deep reinforcement learning-based algorithm to solve it efficiently. Numerical results demonstrate that the proposed algorithm can achieve lower latency than two baseline algorithms.
comment: 5 pages, 6 figures. This paper has been accepted for publication by IEEE Networking Letters
Modeling Nonlinear Control Systems via Koopman Control Family: Universal Forms and Subspace Invariance Proximity
This paper introduces the Koopman Control Family (KCF), a mathematical framework for modeling general (not necessarily control-affine) discrete-time nonlinear control systems with the aim of providing a solid theoretical foundation for the use of Koopman-based methods in systems with inputs. We demonstrate that the concept of KCF captures the behavior of nonlinear control systems on a (potentially infinite-dimensional) function space. By employing a generalized notion of subspace invariance under the KCF, we establish a universal form for finite-dimensional models, which encompasses the commonly used linear, bilinear, and linear switched models as specific instances. In cases where the subspace is not invariant under the KCF, we propose a method for approximating models in general form and characterize the model's accuracy using the concept of invariance proximity. We end by discussing how the proposed framework naturally lends itself to data-driven modeling of control systems.
comment: 18 pages
Robotics
Benchmarking Deep Reinforcement Learning for Navigation in Denied Sensor Environments
Deep reinforcement learning (DRL) is used to enable autonomous navigation in unknown environments. Most research assumes perfect sensor data, but real-world environments may contain natural and artificial sensor noise and denial. Here, we present a benchmark of both well-used and emerging DRL algorithms in a navigation task with configurable sensor denial effects. In particular, we are interested in comparing how different DRL methods (e.g., model-free PPO vs. model-based DreamerV3) are affected by sensor denial. We show that DreamerV3 outperforms other methods in the visual end-to-end navigation task with a dynamic goal, which the other methods are not able to learn. Furthermore, DreamerV3 generally outperforms other methods in sensor-denied environments. To improve robustness, we use adversarial training and demonstrate improved performance in denied environments, although this generally comes with a performance cost on the vanilla environments. We anticipate this benchmark of different DRL methods and the usage of adversarial training to be a starting point for the development of more elaborate navigation strategies that are capable of dealing with uncertain and denied sensor readings.
comment: 31 pages, 19 figures. For associated code, see https://github.com/mazqtpopx/cranfield-navigation-gym
Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion
Autonomous surgical robots have demonstrated significant potential to standardize surgical outcomes, driving innovations that enhance safety and consistency regardless of individual surgeon experience. Deep anterior lamellar keratoplasty (DALK), a partial thickness corneal transplant surgery aimed at replacing the anterior part of the cornea above Descemet membrane (DM), would greatly benefit from an autonomous surgical approach, as it relies heavily on surgeon skill and has high perforation rates. In this study, we propose a novel autonomous surgical robotic system (AUTO-DALK) based on a customized neural network capable of precise needle control and consistent big bubble demarcation on cadaver and live rabbit models. We demonstrate the feasibility of an AI-based image-guided vertical drilling approach for big bubble generation, in contrast to the conventional horizontal needle approach. Our system integrates an optical coherence tomography (OCT) fiber optic distal sensor into the eye-mountable micro robotic system, which automatically segments OCT M-mode depth signals to identify corneal layers using a custom deep learning algorithm. It enables the robot to autonomously guide the needle to targeted tissue layers via a depth-controlled feedback loop. We compared autonomous needle insertion performance and resulting pneumo-dissection using AUTO-DALK against 1) freehand insertion, 2) OCT sensor guided manual insertion, and 3) teleoperated robotic insertion, reporting significant improvements in insertion depth, pneumo-dissection depth, task completion time, and big bubble formation. Ex vivo and in vivo results indicate that the AI-driven AUTO-DALK system is a promising solution to standardize pneumo-dissection outcomes for partial thickness keratoplasty.
Graph Optimality-Aware Stochastic LiDAR Bundle Adjustment with Progressive Spatial Smoothing
Large-scale LiDAR Bundle Adjustment (LBA) for refining sensor orientation and point cloud accuracy simultaneously is a fundamental task in photogrammetry and robotics, particularly as low-cost 3D sensors are increasingly used for 3D mapping in complex scenes. Unlike pose-graph-based methods that rely solely on pairwise relationships between LiDAR frames, LBA leverages raw LiDAR correspondences to achieve more precise results, especially when initial pose estimates are unreliable for low-cost sensors. However, existing LBA methods face challenges such as simplistic planar correspondences, extensive observations, and dense normal matrices in the least-squares problem, which limit robustness, efficiency, and scalability. To address these issues, we propose a Graph Optimality-aware Stochastic Optimization scheme with Progressive Spatial Smoothing, namely PSS-GOSO, to achieve robust, efficient, and scalable LBA. The Progressive Spatial Smoothing (PSS) module extracts robust LiDAR feature association exploiting the prior structure information obtained by the polynomial smooth kernel. The Graph Optimality-aware Stochastic Optimization (GOSO) module first sparsifies the graph according to optimality for an efficient optimization. GOSO then utilizes stochastic clustering and graph marginalization to solve the large-scale state estimation problem for a scalable LBA. We validate PSS-GOSO across diverse scenes captured by various platforms, demonstrating its superior performance compared to existing methods.
Domain Adaptive Safety Filters via Deep Operator Learning
Learning-based approaches for constructing Control Barrier Functions (CBFs) are increasingly being explored for safety-critical control systems. However, these methods typically require complete retraining when applied to unseen environments, limiting their adaptability. To address this, we propose a self-supervised deep operator learning framework that learns the mapping from environmental parameters to the corresponding CBF, rather than learning the CBF directly. Our approach leverages the residual of a parametric Partial Differential Equation (PDE), where the solution defines a parametric CBF approximating the maximal control invariant set. This framework accommodates complex safety constraints, higher relative degrees, and actuation limits. We demonstrate the effectiveness of the method through numerical experiments on navigation tasks involving dynamic obstacles.
comment: 63rd IEEE Conference on Decision and Control (CDC)
From Simple to Complex: Knowledge Transfer in Safe and Efficient Reinforcement Learning for Autonomous Driving
A safe and efficient decision-making system is crucial for autonomous vehicles. However, the complexity of driving environments limits the effectiveness of many rule-based and machine learning-based decision-making approaches. The introduction of Reinforcement Learning in autonomous driving presents a promising solution to these challenges, although concerns about safety and efficiency during training remain major obstacles to its widespread application. To address these concerns, we propose a novel framework named Simple to Complex Collaborative Decision. First, we rapidly train the teacher model using the Proximal Policy Optimization algorithm in a lightweight autonomous driving simulation environment. In the more complex simulation environment, the teacher model intervenes when the student agent exhibits sub-optimal behavior by assessing the value of actions to avert dangerous situations. Next, we develop an innovative algorithm called Adaptive Clipping Proximal Policy Optimization. It trains using a combination of samples generated by both the teacher and student policies and applies dynamic clipping strategies based on sample importance, enabling the algorithm to utilize samples from diverse sources more efficiently. Additionally, we employ the KL divergence between the teacher's and student's policies as a constraint for policy optimization to facilitate the student agent's rapid learning of the teacher's policy. Finally, by adopting an appropriate weaning strategy to gradually reduce teacher intervention, we ensure that the student agent can fully explore the environment independently during the later stages of training. Simulation experiments in highway lane-change scenarios demonstrate that, compared to baseline algorithms, our proposed framework not only improves learning efficiency and reduces training costs but also significantly enhances safety during training.
Sim2real Cattle Joint Estimation in 3D point clouds
Understanding the well-being of cattle is crucial in various agricultural contexts. Cattle's body shape and joint articulation carry significant information about their welfare, yet acquiring comprehensive datasets for 3D body pose estimation presents a formidable challenge. This study delves into the construction of such a dataset specifically tailored for cattle. Leveraging the expertise of digital artists, we use a single animated 3D model to represent diverse cattle postures. To address the disparity between virtual and real-world data, we augment the 3D model's shape to encompass a range of potential body appearances, thereby narrowing the "sim2real" gap. We use these annotated models to train a deep-learning framework capable of estimating internal joints solely based on external surface curvature. Our contribution is specifically the use of geodesic distance over the surface manifold, coupled with multilateration to extract joints in a semantic keypoint detection encoder-decoder architecture. We demonstrate the robustness of joint extraction by comparing the link lengths extracted on real cattle mobbing and walking within a race. Furthermore, inspired by the established allometric relationship between bone length and the overall height of mammals, we utilise the estimated joints to predict hip height within a real cattle dataset, extending the utility of our approach to offer insights into improving cattle monitoring practices.
Formation Control for Moving Target Enclosing and Tracking via Relative Localization
This paper proposes an integrated framework for coordinating multiple unmanned aerial vehicles (UAVs) in a distributed fashion to persistently enclose and track a moving target without external localization systems. It is assumed that the UAV can obtain self-displacement and the target's relative position using vision-based methods within its local frame. Additionally, UAVs can measure relative distances and communicate with each other, e.g. by ultrawideband (UWB) sensors. Due to the absence of a global coordinate system, measurements from neighbors cannot be directly utilized for collaborative estimation of the target state. To address this, a recursive least squares estimator (RLSE) for estimating the relative positions between UAVs is integrated into a distributed Kalman filter (DKF), enabling a persistent estimation of the target state. When the UAV loses direct measurements of the target due to environmental occlusion, measurements from neighbors will be aligned into the UAV's local frame to provide indirect measurements. Furthermore, simultaneously ensuring the convergence of the estimators and maintaining effective target tracking is a significant challenge. To tackle this problem, a consensus-based formation controller with bounded inputs is developed by integrating a coupled oscillator-based circular formation design. Theoretical analysis shows that the proposed framework ensures asymptotic tracking of a target with constant velocity. For a target with varying velocity, the tracking error converges to a bounded region related to the target's maximum acceleration. Simulations and experiments validate the effectiveness of the proposed algorithm.
comment: 13 Pages
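As a reminder of how a recursive least squares estimator works, here is a minimal generic sketch in pure Python. It estimates a parameter vector `theta` from scalar measurements `y = phi^T theta`; the regressors, the forgetting factor `lam`, and the initial covariance are illustrative assumptions and do not reproduce the paper's measurement model for relative UAV positions.

```python
def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive least squares step for the model y = phi^T theta.

    theta: current estimate (list), P: covariance-like matrix (list of lists),
    phi: regressor (sequence), y: scalar measurement, lam: forgetting factor.
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta_new = [theta[i] + gain[i] * err for i in range(n)]
    # P is symmetric, so phi^T P equals (P phi)^T.
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new


# Noise-free demo with a hypothetical 2D parameter vector.
true_theta = [2.0, -1.0]
theta = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]  # large initial uncertainty
for phi in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0), (2.0, 1.0)]:
    y = sum(p * t for p, t in zip(phi, true_theta))
    theta, P = rls_update(theta, P, phi, y)
```

Each step costs a fixed number of operations in the parameter dimension, which is why recursive estimators of this kind suit onboard use.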
On the Benefits of Robot Platooning for Navigating Crowded Environments
This paper studies how groups of robots can effectively navigate through a crowd of agents. It quantifies the performance of platooning and less constrained, greedy strategies, and the extent to which these strategies disrupt the crowd agents. Three scenarios are considered: (i) passive crowds, (ii) counter-flow crowds, and (iii) perpendicular-flow crowds. Through simulations consisting of up to 200 robots, we show that for navigating passive and counter-flow crowds, the platooning strategy is less disruptive and more effective in dense crowds than the greedy strategy, whereas for navigating perpendicular-flow crowds, the greedy strategy outperforms the platooning strategy in both aspects. Moreover, we propose an adaptive strategy that can switch between platooning and greedy behavioral states, and demonstrate that it combines the strengths of both strategies in all the scenarios considered.
comment: 14 pages, 7 figures, to be published in DARS 2024
MARLIN: Multi-Agent Reinforcement Learning Guided by Language-Based Inter-Robot Negotiation
Multi-agent reinforcement learning is a key method for training multi-robot systems over a series of episodes in which robots are rewarded or punished according to their performance; only once the system is trained to a suitable standard is it deployed in the real world. If the system is not trained enough, the task will likely not be completed and could pose a risk to the surrounding environment. Therefore, reaching high performance in a shorter training period can lead to significant reductions in time and resource consumption. We introduce Multi-Agent Reinforcement Learning guided by Language-based Inter-Robot Negotiation (MARLIN), which makes the training process both faster and more transparent. We equip robots with large language models that negotiate and debate the task, producing a plan that is used to guide the policy during training. We dynamically switch between using reinforcement learning and the negotiation-based approach throughout training. This offers an increase in training speed when compared to standard multi-agent reinforcement learning and allows the system to be deployed to physical hardware earlier. As robots negotiate in natural language, we can better understand the behaviour of the robots individually and as a collective. We compare the performance of our approach to multi-agent reinforcement learning and a large language model to show that our hybrid method trains faster at little cost to performance.
CoMAL: Collaborative Multi-Agent Large Language Models for Mixed-Autonomy Traffic
The integration of autonomous vehicles into urban traffic has great potential to improve efficiency by reducing congestion and optimizing traffic flow systematically. In this paper, we introduce CoMAL (Collaborative Multi-Agent LLMs), a framework designed to address the mixed-autonomy traffic problem by collaboration among autonomous vehicles to optimize traffic flow. CoMAL is built upon large language models, operating in an interactive traffic simulation environment. It utilizes a Perception Module to observe surrounding agents and a Memory Module to store strategies for each agent. The overall workflow includes a Collaboration Module that encourages autonomous vehicles to discuss the effective strategy and allocate roles, a reasoning engine to determine optimal behaviors based on assigned roles, and an Execution Module that controls vehicle actions using a hybrid approach combining rule-based models. Experimental results demonstrate that CoMAL achieves superior performance on the Flow benchmark. Additionally, we evaluate the impact of different language models and compare our framework with reinforcement learning approaches. The results highlight the strong cooperative capability of LLM agents and present a promising solution to the mixed-autonomy traffic challenge. The code is available at https://github.com/Hyan-Yao/CoMAL.
Quadrotor Guidance for Window Traversal: A Bearings-Only Approach
This paper focuses on developing a bearings-only measurement-based three-dimensional window traversal guidance method for quadrotor Uninhabited Aerial Vehicles (UAVs). The desired flight path and heading angles of the quadrotor are proposed as functions of the bearing angle information of the four vertices of the window. These angular guidance inputs employ a bearing angle bisector term and an elliptic shaping angle term, which directs the quadrotor towards the centroid of the window. Detailed stability analysis of the resulting kinematics demonstrates that all quadrotor trajectories lead to the centroid of the window along a direction which is normal to the window plane. A qualitative comparison with existing traversal methodologies showcases the superiority of the proposed guidance approach with regard to the nature of information, computations for generating the guidance commands, and flexibility of replanning the traversal path. Realistic simulations considering a six-degree-of-freedom quadrotor model and Monte Carlo studies validate the effectiveness, accuracy, and robustness of the proposed guidance solution. Representative flight validation trials are carried out using an indoor motion capture system.
Perception of Emotions in Human and Robot Faces: Is the Eye Region Enough?
The increased interest in developing next-gen social robots has raised questions about the factors affecting the perception of robot emotions. This study investigates the impact of robot appearances (humanlike, mechanical) and face regions (full-face, eye-region) on human perception of robot emotions. A between-subjects user study (N = 305) was conducted where participants were asked to identify the emotions being displayed in videos of robot faces, as well as a human baseline. Our findings reveal three important insights for effective social robot face design in Human-Robot Interaction (HRI): Firstly, robots equipped with a back-projected, fully animated face - regardless of whether they are more human-like or more mechanical-looking - demonstrate a capacity for emotional expression comparable to that of humans. Secondly, the recognition accuracy of emotional expressions in both humans and robots declines when only the eye region is visible. Lastly, within the constraint of only the eye region being visible, robots with more human-like features significantly enhance emotion recognition.
comment: Accepted for publication at the 16th International Conference on Social Robotics, Odense, Denmark (ICSR 2024)
Transferring Tactile Data Across Sensors ICRA
Tactile perception is essential for human interaction with the environment and is becoming increasingly crucial in robotics. Tactile sensors like the BioTac mimic human fingertips and provide detailed interaction data. Despite its utility in applications like slip detection and object identification, this sensor is now deprecated, making many existing datasets obsolete. This article introduces a novel method for translating data between tactile sensors by exploiting sensor deformation information rather than output signals. We demonstrate the approach by translating BioTac signals into the DIGIT sensor. Our framework consists of three steps: first, converting signal data into corresponding 3D deformation meshes; second, translating these 3D deformation meshes from one sensor to another; and third, generating output images using the converted meshes. Our approach enables the continued use of valuable datasets.
comment: Extended Abstract. Accepted in ICRA@40 (40th Anniversary of the IEEE International Conference on Robotics and Automation) 23-26 September, 2024 Rotterdam, Netherlands
Optimizing Modeling of Continuum Robots: Integration of Lie Group Kinematics and Evolutionary Algorithms
Continuum robots, known for their high flexibility and adaptability, offer immense potential for applications such as medical surgery, confined-space inspections, and wearable devices. However, their non-linear elastic properties and complex kinematics present significant challenges in digital modeling and effective control. This research proposes a novel computational framework that integrates Lie group kinematics with an evolutionary algorithm (EA) to identify optimal control coefficients for specific robot models. Our method starts by generating datasets from physics-based simulations and fractional order control, defining both ideal configurations and models to be optimized. By using EA, we iteratively minimize deviations through two fitness objectives, deviation mean squared error (\(\text{MSE}_1\)) and TCP vector error (\(\text{MSE}_2\)), to align the robot's backbone with the desired configuration. Built on the Computer-Aided Design (CAD) platform Grasshopper, this framework provides real-time visualization, enabling dynamic control of robot configurations. Results show that the proposed method achieves precise alignment of the robot's backbone with minimal computation. This approach not only simplifies the coefficient identification process but also demonstrates the advantages of EA in multi-objective optimization, contributing to efficient modeling and control of continuum robots.
comment: 10 pages, 20 figures
Optimizing Collaborative Robotics since Pre-Deployment via Cyber-Physical Systems' Digital Twins
The collaboration between humans and robots requires a paradigm shift not only in robot perception, reasoning, and action, but also in the design of the robotic cell. This paper proposes an optimization framework for designing collaborative robotics cells using a digital twin during the pre-deployment phase. This approach mitigates the limitations of experience-based sub-optimal designs by means of Bayesian optimization to find the optimal layout after a certain number of iterations. By integrating production KPIs into a black-box optimization framework, the digital twin supports data-driven decision-making, reduces the need for costly prototypes, and ensures continuous improvement thanks to the learning nature of the algorithm. The paper presents a case study with preliminary results that show how this methodology can be applied to obtain safer, more efficient, and adaptable human-robot collaborative environments.
Error Decomposition for Hybrid Localization Systems
Future advanced driver assistance systems and autonomous vehicles rely on accurate localization, which can be divided into three classes: a) viewpoint localization about local references (e.g., via vision-based localization), b) absolute localization about a global reference system (e.g., via satellite navigation), and c) hybrid localization, which presents a combination of the former two. Hybrid localization shares characteristics and strengths of both absolute and viewpoint localization. However, new sources of error, such as inaccurate sensor-setup calibration, complement the potential errors of the respective sub-systems. Therefore, this paper introduces a general approach to analyzing error sources in hybrid localization systems. More specifically, we propose the Kappa-Phi method, which allows for the decomposition of localization errors into individual components, i.e., into a sum of parameterized functions of the measured state (e.g., agent kinematics). The error components can then be leveraged to, e.g., improve localization predictions, correct map data, or calibrate sensor setups. Theoretical derivations and evaluations show that the algorithm presents a promising approach to improve hybrid localization and counter the weaknesses of the system's individual components.
A Tactile Feedback Approach to Path Recovery after High-Speed Impacts for Collision-Resilient Drones
Aerial robots are a well-established solution for exploration, monitoring, and inspection, thanks to their superior maneuverability and agility. However, in many environments of interest, they risk crashing and sustaining damage following collisions. Traditional methods focus on avoiding obstacles entirely to prevent damage, but these approaches can be limiting, particularly in complex environments where collisions may be unavoidable, or on weight- and compute-constrained platforms. This paper presents a novel approach to enhance the robustness and autonomy of drones in such scenarios by developing a path recovery and adjustment method for a high-speed collision-resistant drone equipped with binary contact sensors. The proposed system employs an estimator that explicitly models collisions, using pre-collision velocities and rates to predict post-collision dynamics, thereby improving the drone's state estimation accuracy. Additionally, we introduce a vector-field-based path representation which guarantees convergence to the path. Post-collision, the contact point is incorporated into the vector field as a repulsive potential, enabling the drone to avoid obstacles while naturally converging to the original path. The effectiveness of this method is validated through Monte Carlo simulations and demonstrated on a physical prototype, showing successful path following and adjustment through collisions as well as recovery from collisions at speeds up to 3.7 m/s.
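The combination of a path-following vector field with a repulsive term at a contact point can be sketched in 2D as follows. This is a toy illustration only: the straight path along the x-axis, the gains `k_conv` and `k_rep`, the influence `radius`, and the classic potential `0.5 * k_rep * (1/d - 1/radius)^2` are all assumptions, and the sketch does not reproduce the paper's field or its convergence guarantee.

```python
import math


def path_field(p, k_conv=1.0):
    """Guide along the x-axis: unit forward motion plus a lateral pull toward y = 0."""
    _, y = p
    return (1.0, -k_conv * y)


def repulsive_field(p, contact, k_rep=0.5, radius=1.0):
    """Negative gradient of 0.5 * k_rep * (1/d - 1/radius)^2 inside the radius."""
    dx, dy = p[0] - contact[0], p[1] - contact[1]
    d = math.hypot(dx, dy)
    if d == 0.0 or d >= radius:
        return (0.0, 0.0)
    mag = k_rep * (1.0 / d - 1.0 / radius) / d**2
    return (mag * dx / d, mag * dy / d)  # points away from the contact


def combined_field(p, contacts):
    """Path-following field plus one repulsive term per recorded contact point."""
    vx, vy = path_field(p)
    for c in contacts:
        rx, ry = repulsive_field(p, c)
        vx += rx
        vy += ry
    return (vx, vy)


far_from_contact = combined_field((0.0, 2.0), [(5.0, 5.0)])
near_contact = repulsive_field((0.5, 0.0), (0.0, 0.0))
```

Because the repulsive term vanishes outside its radius, the field far from any contact is unchanged and still steers back to the original path.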
EPIC: A Lightweight LiDAR-Based UAV Exploration Framework for Large-Scale Scenarios
Autonomous exploration is a fundamental problem for various applications of unmanned aerial vehicles (UAVs). Recently, LiDAR-based exploration has gained significant attention due to its ability to generate high-precision point cloud maps of large-scale environments. While the point clouds are inherently informative for navigation, many existing exploration methods still rely on additional, often expensive, environmental representations. This reliance stems from two main reasons: the need for frontier detection or information gain computation, which typically depends on memory-intensive occupancy grid maps, and the high computational complexity of path planning directly on point clouds, primarily due to costly collision checking. To address these limitations, we present EPIC, a lightweight LiDAR-based UAV exploration framework that directly exploits point cloud data to explore large-scale environments. EPIC introduces a novel observation map derived directly from the quality of point clouds, eliminating the need for global occupancy grid maps while preserving comprehensive exploration capabilities. We also propose an incremental topological graph construction method operating directly on point clouds, enabling real-time path planning in large-scale environments. Leveraging these components, we build a hierarchical planning framework that generates agile and energy-efficient trajectories, achieving significantly reduced memory consumption and computation time compared to most existing methods. Extensive simulations and real-world experiments demonstrate that EPIC achieves faster exploration while significantly reducing memory consumption compared to state-of-the-art methods.
A Probabilistic Model for Skill Acquisition with Switching Latent Feedback Controllers
Manipulation tasks often consist of subtasks, each representing a distinct skill. Mastering these skills is essential for robots, as it enhances their autonomy, efficiency, adaptability, and ability to work in their environment. Learning from demonstrations allows robots to rapidly acquire new skills without starting from scratch, with demonstrations typically sequencing skills to achieve tasks. Behaviour cloning approaches to learning from demonstration commonly rely on mixture density network output heads to predict robot actions. In this work, we first reinterpret the mixture density network as a library of feedback controllers (or skills) conditioned on latent states. This arises from the observation that a one-layer linear network is functionally equivalent to a classical feedback controller, with network weights corresponding to controller gains. We use this insight to derive a probabilistic graphical model that combines these elements, describing the skill acquisition process as segmentation in a latent space, where each skill policy functions as a feedback control law in this latent space. Our approach significantly improves not only task success rate, but also robustness to observation noise when trained with human demonstrations. Our physical robot experiments further show that the induced robustness improves model deployment on robots.
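The stated equivalence between a one-layer linear network and a feedback controller can be checked numerically. In the sketch below, a linear layer `u = W x + b` with `W = -K` and `b = K x_ref` produces the same output as the proportional law `u = K (x_ref - x)`; the gain `K`, the reference `x_ref`, and the state `x` are made-up values for illustration.

```python
def linear_layer(W, b, x):
    """One-layer linear network: u = W x + b."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def feedback_controller(K, x_ref, x):
    """Classical proportional feedback law: u = K (x_ref - x)."""
    err = [r - s for r, s in zip(x_ref, x)]
    return [sum(k * e for k, e in zip(row, err)) for row in K]


K = [[2.0, 0.5]]      # hypothetical controller gains (1 input, 2 states)
x_ref = [1.0, 0.0]    # hypothetical set point
x = [0.25, -0.5]      # current state

W = [[-k for k in row] for row in K]               # weights = negated gains
b = feedback_controller(K, x_ref, [0.0, 0.0])      # bias = K x_ref

u_network = linear_layer(W, b, x)
u_controller = feedback_controller(K, x_ref, x)
```

Both expressions evaluate to the same command, so the network weights can be read directly as controller gains and the bias as the gain-weighted set point.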
Learning autonomous driving from aerial imagery IROS 2024
In this work, we consider the problem of learning end-to-end perception to control for ground vehicles solely from aerial imagery. Photogrammetric simulators allow the synthesis of novel views through the transformation of pre-generated assets into novel views. However, they have a large setup cost, require careful collection of data and often human effort to create usable simulators. We use a Neural Radiance Field (NeRF) as an intermediate representation to synthesize novel views from the point of view of a ground vehicle. These novel viewpoints can then be used for several downstream autonomous navigation applications. In this work, we demonstrate the utility of novel view synthesis through the application of training a policy for end-to-end learning from images and depth data. In a traditional real-to-sim-to-real framework, the collected data would be transformed into a visual simulator which could then be used to generate novel views. In contrast, using a NeRF allows a compact representation and the ability to optimize over the parameters of the visual simulator as more data is gathered in the environment. We demonstrate the efficacy of our method in a custom-built mini-city environment through the deployment of imitation policies on robotic cars. We additionally consider the task of place localization and demonstrate that our method is able to relocalize the car in the real world.
comment: Presented at IROS 2024
Optimal DLT-based Solutions for the Perspective-n-Point
We propose a modified normalized direct linear transform (DLT) algorithm for solving the perspective-n-point (PnP) problem with much better behavior than the conventional DLT. The modification consists of analytically weighting the different measurements in the linear system with a negligible increase in computational load. Our approach exhibits clear improvements -- in both performance and runtime -- when compared to popular methods such as EPnP, CPnP, RPnP, and OPnP. Our new non-iterative solution approaches the true optimum found via Gauss-Newton optimization, but at a fraction of the computational cost. Our optimal DLT (oDLT) implementation, as well as the experiments, are released in open source.
comment: 8 pages, 6 figures, 2 tables
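For context, the conventional (unweighted, unnormalized) DLT that the abstract uses as its point of comparison can be sketched with NumPy. This is not the authors' oDLT: it omits their analytic weighting and normalization, and the camera intrinsics, pose, and 3D points below are synthetic values chosen for the demo.

```python
import numpy as np


def dlt_projection_matrix(points_3d, points_2d):
    """Estimate a 3x4 projection matrix (up to scale) from n >= 6 correspondences."""
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        Xh = [X, Y, Z, 1.0]
        # Each correspondence gives two linear equations in the 12 entries of P.
        rows.append([*Xh, 0.0, 0.0, 0.0, 0.0, *[-u * c for c in Xh]])
        rows.append([0.0, 0.0, 0.0, 0.0, *Xh, *[-v * c for c in Xh]])
    A = np.asarray(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)


def project(P, points_3d):
    """Project 3D points with a 3x4 matrix and dehomogenize to pixel coordinates."""
    Xh = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]


# Synthetic, noise-free demo (all values are made up).
rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(8, 3)) + np.array([0.0, 0.0, 5.0])
K_cam = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.1], [-0.2], [0.3]])])
P_true = K_cam @ Rt
uv = project(P_true, pts)

P_est = dlt_projection_matrix(pts, uv)
reproj_error = np.max(np.abs(project(P_est, pts) - uv))
```

In a PnP setting the intrinsics are known, so one would typically work with normalized image coordinates and extract the rotation and translation from the estimated matrix; that step is left out here to keep the sketch short.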
Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents
When assisting people in daily tasks, robots need to accurately interpret visual cues and respond effectively in diverse safety-critical situations, such as sharp objects on the floor. In this context, we present M-CoDAL, a multimodal-dialogue system specifically designed for embodied agents to better understand and communicate in safety-critical situations. The system leverages discourse coherence relations to enhance its contextual understanding and communication abilities. To train this system, we introduce a novel clustering-based active learning mechanism that utilizes an external Large Language Model (LLM) to identify informative instances. Our approach is evaluated using a newly created multimodal dataset comprising 1K safety violations extracted from 2K Reddit images. These violations are annotated using a Large Multimodal Model (LMM) and verified by human annotators. Results with this dataset demonstrate that our approach improves resolution of safety situations, user sentiment, as well as safety of the conversation. Next, we deploy our dialogue system on a Hello Robot Stretch robot and conduct a within-subject user study with real-world participants. In the study, participants role-play two safety scenarios with different levels of severity with the robot and receive interventions from our model and a baseline system powered by OpenAI's ChatGPT. The study results corroborate and extend the findings from automated evaluation, showing that our proposed system is more persuasive and competent in a real-world embodied agent setting.
Skill Generalization with Verbs IROS 2023
It is imperative that robots can understand natural language commands issued by humans. Such commands typically contain verbs that signify what action should be performed on a given object and that are applicable to many objects. We propose a method for generalizing manipulation skills to novel objects using verbs. Our method learns a probabilistic classifier that determines whether a given object trajectory can be described by a specific verb. We show that this classifier accurately generalizes to novel object categories with an average accuracy of 76.69% across 13 object categories and 14 verbs. We then perform policy search over the object kinematics to find an object trajectory that maximizes classifier prediction for a given verb. Our method allows a robot to generate a trajectory for a novel object based on a verb, which can then be used as input to a motion planner. We show that our model can generate trajectories that are usable for executing five verb commands applied to novel instances of two different object categories on a real robot.
comment: 7 pages + 2 pages (references), 6 figures. Accepted at IROS 2023. Code, dataset info and demo videos can be found at: https://rachelma80000.github.io/SkillGenVerbs/
MarineGym: Accelerated Training for Underwater Vehicles with High-Fidelity RL Simulation ICRA
Reinforcement Learning (RL) is a promising solution, allowing Unmanned Underwater Vehicles (UUVs) to learn optimal behaviors through trial and error. However, existing simulators lack efficient integration with RL methods, limiting training scalability and performance. This paper introduces MarineGym, a novel simulation framework designed to enhance RL training efficiency for UUVs by utilizing GPU acceleration. MarineGym offers a 10,000-fold performance improvement over real-time simulation on a single GPU, enabling rapid training of RL algorithms across multiple underwater tasks. Key features include realistic dynamic modeling of UUVs, parallel environment execution, and compatibility with popular RL frameworks like PyTorch and TorchRL. The framework is validated through four distinct tasks: station-keeping, circle tracking, helical tracking, and lemniscate tracking. This framework sets the stage for advancing RL in underwater robotics and facilitating efficient training in complex, dynamic environments.
comment: Accepted by the 40th Anniversary of the IEEE Conference on Robotics and Automation (ICRA@40)
Diff-DAgger: Uncertainty Estimation with Diffusion Policy for Robotic Manipulation
Recently, diffusion policy has shown impressive results in handling multi-modal tasks in robotic manipulation. However, it has fundamental limitations in out-of-distribution failures that persist due to compounding errors and its limited capability to extrapolate. One way to address these limitations is robot-gated DAgger, an interactive imitation learning approach with a robot query system to actively seek expert help during policy rollout. While robot-gated DAgger has high potential for learning at scale, existing methods like Ensemble-DAgger struggle with highly expressive policies: They often misinterpret policy disagreements as uncertainty at multi-modal decision points. To address this problem, we introduce Diff-DAgger, an efficient robot-gated DAgger algorithm that leverages the training objective of diffusion policy. We evaluate Diff-DAgger across different robot tasks including stacking, pushing, and plugging, and show that Diff-DAgger improves the task failure prediction by 37%, the task completion rate by 14%, and reduces the wall-clock time by up to 540%. We hope that this work opens up a path for efficiently incorporating expressive yet data-hungry policies into interactive robot learning settings.
Joint Verification and Refinement of Language Models for Safety-Constrained Planning
Although pre-trained language models can generate executable plans (e.g., programmatic policies) for solving robot tasks, the generated plans may violate task-relevant logical specifications due to the models' black-box nature. A significant gap remains between the language models' outputs and verifiable executions of plans. We develop a method to generate executable plans and formally verify them against task-relevant safety specifications. Given a high-level task description in natural language, the proposed method queries a language model to generate plans in the form of executable robot programs. It then converts the generated plan into an automaton-based representation, allowing formal verification of the automaton against the specifications. We prove that given a set of verified plans, the composition of these plans also satisfies the safety specifications. This proof ensures the safety of complex, multi-component plans, obviating the computational complexity of verifying the composed plan. We then propose an automated fine-tuning process that refines the language model to generate specification-compliant plans without the need for human labeling. The empirical results show a 30 percent improvement in the probability of generating plans that meet task specifications after fine-tuning.
IntelliMove: Enhancing Robotic Planning with Semantic Mapping
Semantic navigation enables robots to understand their environments beyond basic geometry, allowing them to reason about objects, their functions, and their interrelationships. In semantic robotic navigation, creating accurate and semantically enriched maps is fundamental. Planning based on semantic maps not only enhances the robot's planning efficiency and computational speed but also makes the planning more meaningful, supporting a broader range of semantic tasks. In this paper, we introduce two core modules of IntelliMove: IntelliMap, a generic hierarchical semantic topometric map framework developed through an analysis of current technologies' strengths and weaknesses, and Semantic Planning, which utilizes the semantic maps from IntelliMap. We showcase use cases that highlight IntelliMove's adaptability and effectiveness. Through experiments in simulated environments, we further demonstrate IntelliMove's capability in semantic navigation.
LocoMan: Advancing Versatile Quadrupedal Dexterity with Lightweight Loco-Manipulators
Quadrupedal robots have emerged as versatile agents capable of locomoting and manipulating in complex environments. Traditional designs typically rely on the robot's inherent body parts or incorporate top-mounted arms for manipulation tasks. However, these configurations may limit the robot's operational dexterity, efficiency and adaptability, particularly in cluttered or constrained spaces. In this work, we present LocoMan, a dexterous quadrupedal robot with a novel morphology to perform versatile manipulation in diverse constrained environments. By equipping a Unitree Go1 robot with two low-cost and lightweight modular 3-DoF loco-manipulators on its front calves, LocoMan leverages the combined mobility and functionality of the legs and grippers for complex manipulation tasks that require precise 6D positioning of the end effector in a wide workspace. To harness the loco-manipulation capabilities of LocoMan, we introduce a unified control framework that extends the whole-body controller (WBC) to integrate the dynamics of loco-manipulators. Through experiments, we validate that the proposed whole-body controller can accurately and stably follow desired 6D trajectories of the end effector and torso, which, when combined with the large workspace from our design, facilitates a diverse set of challenging dexterous loco-manipulation tasks in confined spaces, such as opening doors, plugging into sockets, picking objects in narrow and low-lying spaces, and bimanual manipulation.
comment: Project page: https://linchangyi1.github.io/LocoMan
Text2Interaction: Establishing Safe and Preferable Human-Robot Interaction
Adjusting robot behavior to human preferences can require intensive human feedback, preventing quick adaptation to new users and changing circumstances. Moreover, current approaches typically treat user preferences as a reward, which requires a manual balance between task success and user satisfaction. To integrate new user preferences in a zero-shot manner, our proposed Text2Interaction framework invokes large language models to generate a task plan, motion preferences as Python code, and parameters of a safety controller. By maximizing the combined probability of task completion and user satisfaction instead of a weighted sum of rewards, we can reliably find plans that fulfill both requirements. We find that 83 % of users working with Text2Interaction agree that it integrates their preferences into the plan of the robot, and 94 % prefer Text2Interaction over the baseline. Our ablation study shows that Text2Interaction aligns better with unseen preferences than other baselines while maintaining a high success rate. Real-world demonstrations and code are made available at sites.google.com/view/text2interaction.
comment: Accepted for the Conference on Robot Learning (CoRL) 2024. Available at: https://openreview.net/forum?id=s0VNSnPeoA
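The Text2Interaction abstract contrasts maximizing a combined probability with maximizing a weighted sum of rewards. The toy comparison below is a minimal sketch of why the two criteria can disagree. The candidate plans, their probabilities, the weight `w`, and the independence assumption behind the product are all hypothetical and are not taken from the paper.

```python
# Hypothetical candidate plans: (name, P(task success), P(user satisfied)).
# All numbers are illustrative assumptions, not results from the paper.
candidates = [
    ("task_focused", 0.99, 0.30),
    ("balanced", 0.62, 0.62),
]


def weighted_sum(p_task, p_user, w=0.5):
    """Weighted-sum score; the weight w must be tuned by hand."""
    return w * p_task + (1.0 - w) * p_user


def joint_probability(p_task, p_user):
    """Probability that both events occur, assuming they are independent."""
    return p_task * p_user


best_by_sum = max(candidates, key=lambda c: weighted_sum(c[1], c[2]))
best_by_joint = max(candidates, key=lambda c: joint_probability(c[1], c[2]))

print("weighted sum picks:", best_by_sum[0])
print("joint probability picks:", best_by_joint[0])
```

In this toy case the weighted sum prefers the plan that nearly always succeeds but rarely satisfies the user, while the product prefers the plan that is reasonably likely to achieve both.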
Long-Term Human Trajectory Prediction using 3D Dynamic Scene Graphs SP
We present a novel approach for long-term human trajectory prediction in indoor human-centric environments, which is essential for long-horizon robot planning in these environments. State-of-the-art human trajectory prediction methods are limited by their focus on collision avoidance and short-term planning, and their inability to model complex interactions of humans with the environment. In contrast, our approach overcomes these limitations by predicting sequences of human interactions with the environment and using this information to guide trajectory predictions over a horizon of up to 60s. We leverage Large Language Models (LLMs) to predict interactions with the environment by conditioning the LLM prediction on rich contextual information about the scene. This information is given as a 3D Dynamic Scene Graph that encodes the geometry, semantics, and traversability of the environment into a hierarchical representation. We then ground these interaction sequences into multi-modal spatio-temporal distributions over human positions using a probabilistic approach based on continuous-time Markov Chains. To evaluate our approach, we introduce a new semi-synthetic dataset of long-term human trajectories in complex indoor environments, which also includes annotations of human-object interactions. We show in thorough experimental evaluations that our approach achieves a 54% lower average negative log-likelihood and a 26.5% lower Best-of-20 displacement error compared to the best non-privileged (i.e., evaluated in a zero-shot fashion on the dataset) baselines for a time horizon of 60s.
comment: 8 pages, 6 figures. Accepted at IEEE Robotics and Automation Letters (RA-L). Code released at: https://github.com/MIT-SPARK/LP2
Learning Social Cost Functions for Human-Aware Path Planning
Achieving social acceptance is one of the main goals of Social Robotic Navigation. Although this topic has received increasing interest in recent years, most of the research has focused on driving the robotic agent along obstacle-free trajectories, planning around estimates of future human motion to respect personal distances and optimize navigation. However, social interactions in everyday life are also dictated by norms that do not strictly depend on movement, such as standing at the end of a queue rather than cutting into it. In this paper, we propose a novel method to recognize common social scenarios and modify a traditional planner's cost function to adapt to them. This solution enables the robot to carry out different social navigation behaviors that would not arise otherwise, maintaining the robustness of traditional navigation. Our approach allows the robot to learn different social norms with a single learned model, rather than having different modules for each task. As a proof of concept, we consider the tasks of queuing and respecting the interaction spaces of groups of people talking to one another, but the method can be extended to other human activities that do not involve motion.
LED: Light Enhanced Depth Estimation at Night
Nighttime camera-based depth estimation is a highly challenging task, especially for autonomous driving applications, where accurate depth perception is essential for ensuring safe navigation. We aim to improve the reliability of perception systems at night, where models trained on daytime data often fail in the absence of precise but costly LiDAR sensors. In this work, we introduce Light Enhanced Depth (LED), a novel cost-effective approach that significantly improves depth estimation in low-light environments by harnessing a pattern projected by high-definition headlights available in modern vehicles. LED leads to significant performance boosts across multiple depth-estimation architectures (encoder-decoder, Adabins, DepthFormer) on both synthetic and real datasets. Furthermore, improved performance beyond the illuminated areas reveals a holistic enhancement in scene understanding. Finally, we release the Nighttime Synthetic Drive Dataset, a new synthetic and photo-realistic nighttime dataset, which comprises 49,990 comprehensively annotated images.
comment: Preprint. Code and dataset available on the project page: https://simondemoreau.github.io/LED/
Trajectory Optimization under Contact Timing Uncertainties
Most interesting problems in robotics (e.g., locomotion and manipulation) are realized through intermittent contact with the environment. Due to perception and modeling errors, assuming an exact time for establishing contact with the environment is unrealistic. On the other hand, handling uncertainties in contact timing is notoriously difficult as it gives rise to either handling uncertain complementarity systems or solving combinatorial optimization problems at run-time. This work presents a novel optimal control formulation to find robust control policies under contact timing uncertainties. Our main novelty lies in casting the stochastic problem as a deterministic optimization over the uncertainty set that ensures candidate pre-contact states satisfy the robustness criterion and optimizes for contact-relevant objectives. This way, we only need to solve a manageable standard nonlinear programming problem without complementarity constraints or combinatorial explosion. Our simulation results on multiple simplified locomotion and manipulation tasks demonstrate the robustness of our uncertainty-aware formulation compared to the nominal optimal control formulation.
comment: 2024 IEEE-RAS International Conference on Humanoid Robots (Humanoids)
Context-aware Mamba-based Reinforcement Learning for social robot navigation
Social robot navigation (SRN) is a relevant problem that involves navigating a pedestrian-rich environment in a socially acceptable manner. It is an essential part of making social robots effective in pedestrian-rich settings. The use cases of such robots could vary from companion robots to warehouse robots to autonomous wheelchairs. In recent years, deep reinforcement learning has been increasingly used in research on social robot navigation. Our work introduces CAMRL (Context-Aware Mamba-based Reinforcement Learning). Mamba is a new deep learning-based State Space Model (SSM) that has achieved results comparable to transformers in sequencing tasks. CAMRL uses Mamba to determine the robot's next action, which maximizes the value of the next state predicted by the neural network, enabling the robot to navigate effectively based on the rewards assigned. We evaluate CAMRL alongside existing solutions (CADRL, LSTM-RL, SARL) using a rigorous testing dataset which involves a variety of densities and environment behaviors based on ORCA and SFM, demonstrating that CAMRL achieves higher success rates, minimizes collisions, and maintains safer distances from pedestrians. This work introduces a new SRN planner, showcasing the potential of deep state space models for robot navigation.
Mission Design for Unmanned Aerial Vehicles using Hybrid Probabilistic Logic Programs
Advanced Air Mobility (AAM) is a growing field that demands a deep understanding of legal, spatial and temporal concepts in navigation. Hence, any implementation of AAM is forced to deal with the inherent uncertainties of human-inhabited spaces. Enabling growth and innovation requires the creation of a system for safe and robust mission design, i.e., the way we formalize intentions and decide their execution as trajectories for the Unmanned Aerial Vehicle (UAV). Although legal frameworks have emerged to govern urban air spaces, their full integration into the decision process of autonomous agents and operators remains an open task. In this work we present ProMis, a system architecture for probabilistic mission design. It links the data available from various static and dynamic data sources with legal text and operator requirements by following principles of formal verification and probabilistic modeling. Hereby, ProMis enables the combination of low-level perception and high-level rules in AAM to infer validity over the UAV's state-space. To this end, we employ Hybrid Probabilistic Logic Programs (HPLP) as a unifying, intermediate representation between perception and action-taking. Furthermore, we present methods to connect ProMis with crowd-sourced map data by generating HPLP atoms that represent spatial relations in a probabilistic fashion. Our claims of the utility and generality of ProMis are supported by experiments on a diverse set of scenarios and a discussion of the computational demands associated with probabilistic missions.
Discrete time model predictive control for humanoid walking with step adjustment
This paper presents a Discrete-Time Model Predictive Controller (MPC) for humanoid walking with online footstep adjustment. The proposed controller utilizes a hierarchical control approach. The high-level controller uses a low-dimensional Linear Inverted Pendulum Model (LIPM) to determine desired foot placement and Center of Mass (CoM) motion, to prevent falls while maintaining the desired velocity. A Task Space Controller (TSC) then tracks the desired motion obtained from the high-level controller, exploiting the whole-body dynamics of the humanoid. Our approach differs from existing MPC methods for walking pattern generation by not relying on a predefined foot-plan or a reference center of pressure (CoP) trajectory. The overall approach is tested in simulation on a torque-controlled humanoid robot. Results show that the proposed control approach generates stable walking and prevents falls under push disturbances.
comment: 6 pages, 17 figures, 1 table
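The abstract above builds its high-level controller on the Linear Inverted Pendulum Model. As a minimal sketch of what a discrete-time LIPM update looks like, the function below propagates the CoM along one axis over a step of length `dt` with the stance foot held at `p`, using the standard closed-form solution of `x'' = (g / z0) * (x - p)`. The function name, the CoM height `z0`, and the numbers in the usage lines are hypothetical. This is only the textbook model, not the authors' MPC formulation.

```python
import math


def lipm_step(x, v, p, dt, z0=0.8, g=9.81):
    """Advance the LIPM CoM state (x, v) by dt with the foot fixed at p.

    z0 (constant CoM height) and g are illustrative assumptions.
    """
    omega = math.sqrt(g / z0)        # natural frequency of the pendulum
    c = math.cosh(omega * dt)
    s = math.sinh(omega * dt)
    x_next = p + (x - p) * c + (v / omega) * s
    v_next = (x - p) * omega * s + v * c
    return x_next, v_next


def orbital_energy(x, v, p, z0=0.8, g=9.81):
    """Quantity conserved by the LIPM while the foot position p is fixed."""
    return 0.5 * v ** 2 - 0.5 * (g / z0) * (x - p) ** 2


# Hypothetical initial state: CoM 5 cm ahead of the foot, moving at 0.3 m/s.
x0, v0, foot = 0.05, 0.3, 0.0
x1, v1 = lipm_step(x0, v0, foot, dt=0.1)
print(x1, v1)
```

Because the CoM diverges from the foot exponentially under this model, a controller that can choose the next foot position `p` at each step has a direct handle on keeping the motion bounded, which is the role footstep adjustment plays in LIPM-based walking.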
Pyramid-Monozone Synergistic Grasping Policy in Dense Clutter
Grasping a diverse range of novel objects in dense clutter poses a great challenge to robotic automation mainly due to the occlusion problem. In this work, we propose the Pyramid-Monozone Synergistic Grasping Policy (PMSGP) that enables robots to effectively handle occlusions during grasping. Specifically, we initially construct the Pyramid Sequencing Policy (PSP) to sequence each object in cluttered scenes into a pyramid structure. By isolating objects layer-by-layer, the grasp detection model is allowed to focus on a single layer during each grasp. Then, we devise the Monozone Sampling Policy (MSP) to sample the grasp candidates in the top layer. In this manner, each grasp targets the topmost object, thereby effectively avoiding most occlusions. We performed more than 7,000 real-world grasps in densely cluttered scenes with 300 novel objects, demonstrating that PMSGP significantly outperforms seven competitive grasping methods. More importantly, we tested the grasping performance of PMSGP in extremely cluttered scenes involving 100 different household goods, and found that PMSGP pushed the grasp success rate to 84.9%. To the best of our knowledge, no previous work has demonstrated similar performance. All grasping videos are available at: https://www.youtube.com/@chenghaoli4532/playlists.
PAPL-SLAM: Principal Axis-Anchored Monocular Point-Line SLAM
In point-line SLAM systems, the utilization of line structural information and the optimization of lines are two significant problems. The former is usually addressed through structural regularities, while the latter typically involves using minimal parameter representations of lines in optimization. However, separating these two steps loses the constraint information that each provides to the other. We anchor lines with similar directions to a principal axis and optimize them with $n+2$ parameters for $n$ lines, solving both problems together. Our method considers scene structural information, which can be easily extended to different world hypotheses while significantly reducing the number of line parameters to be optimized, enabling rapid and accurate mapping and tracking. To further enhance the system's robustness and avoid mismatches, we have modeled the line-axis probabilistic data association and provided the algorithm for axis creation, updating, and optimization. Additionally, considering that most real-world scenes conform to the Atlanta World hypothesis, we provide a structural line detection strategy based on vertical priors and vanishing points. Experimental results and ablation studies on various indoor and outdoor datasets demonstrate the effectiveness of our system.
comment: 8 pages, 4 figures
EC-SLAM: Effectively Constrained Neural RGB-D SLAM with Sparse TSDF Encoding and Global Bundle Adjustment
We introduce EC-SLAM, a real-time dense RGB-D simultaneous localization and mapping (SLAM) system leveraging Neural Radiance Fields (NeRF). While recent NeRF-based SLAM systems have shown promising results, they have yet to fully exploit NeRF's potential to constrain pose optimization. EC-SLAM addresses this by using sparse parametric encodings and Truncated Signed Distance Fields (TSDF) to represent the map, enabling efficient fusion, reducing model parameters, and accelerating convergence. Our system also employs a globally constrained Bundle Adjustment (BA) strategy that capitalizes on NeRF's implicit loop closure correction capability, improving tracking accuracy by reinforcing constraints on keyframes most relevant to the current optimized frame. Furthermore, by integrating a feature-based and uniform sampling strategy that minimizes ineffective constraint points for pose optimization, we reduce the impact of random sampling in NeRF. Extensive evaluations on the Replica, ScanNet, and TUM datasets demonstrate state-of-the-art performance, with precise tracking and reconstruction accuracy achieved alongside real-time operation at up to 21 Hz.
FetchBench: A Simulation Benchmark for Robot Fetching
Fetching, which includes approaching, grasping, and retrieving, is a critical challenge for robot manipulation tasks. Existing methods primarily focus on table-top scenarios, which do not adequately capture the complexities of environments where both grasping and planning are essential. To address this gap, we propose a new benchmark FetchBench, featuring diverse procedural scenes that integrate both grasping and motion planning challenges. Additionally, FetchBench includes a data generation pipeline that collects successful fetch trajectories for use in imitation learning methods. We implement multiple baselines from the traditional sense-plan-act pipeline to end-to-end behavior models. Our empirical analysis reveals that these methods achieve a maximum success rate of only 20%, indicating substantial room for improvement. Additionally, we identify key bottlenecks within the sense-plan-act pipeline and make recommendations based on the systematic analysis.
Preference-Based Planning in Stochastic Environments: From Partially-Ordered Temporal Goals to Most Preferred Policies
Human preferences are not always represented via complete linear orders: It is natural to employ partially-ordered preferences for expressing incomparable outcomes. In this work, we consider decision-making and probabilistic planning in stochastic systems modeled as Markov decision processes (MDPs), given a partially ordered preference over a set of temporally extended goals. Specifically, each temporally extended goal is expressed using a formula in Linear Temporal Logic on Finite Traces (LTL$_f$). To plan with the partially ordered preference, we introduce order theory to map a preference over temporal goals to a preference over policies for the MDP. Accordingly, a most preferred policy under a stochastic ordering induces a stochastic nondominated probability distribution over the finite paths in the MDP. To synthesize a most preferred policy, our technical approach includes two key steps. In the first step, we develop a procedure to transform a partially ordered preference over temporal goals into a computational model, called preference automaton, which is a semi-automaton with a partial order over acceptance conditions. In the second step, we prove that finding a most preferred policy is equivalent to computing a Pareto-optimal policy in a multi-objective MDP that is constructed from the original MDP, the preference automaton, and the chosen stochastic ordering relation. Throughout the paper, we employ running examples to illustrate the proposed preference specification and solution approaches. We demonstrate the efficacy of our algorithm using these examples, providing detailed analysis, and then discuss several potential future directions.
comment: arXiv admin note: substantial text overlap with arXiv:2209.12267
A Convex Formulation of Frictional Contact for the Material Point Method and Rigid Bodies
In this paper, we introduce a novel convex formulation that seamlessly integrates the Material Point Method (MPM) with articulated rigid body dynamics in frictional contact scenarios. We extend the linear corotational hyperelastic model into the realm of elastoplasticity and include an efficient return mapping algorithm. This approach is particularly effective for MPM simulations involving significant deformation and topology changes, while preserving the convexity of the optimization problem. Our method ensures global convergence, enabling the use of large simulation time steps without compromising robustness. We have validated our approach through rigorous testing and performance evaluations, highlighting its superior capabilities in managing complex simulations relevant to robotics. Compared to previous MPM-based robotic simulators, our method significantly improves the stability of contact resolution - a critical factor in robot manipulation tasks. We make our method available in the open-source robotics toolkit, Drake. The supplemental video is available at https://youtu.be/5jrQtF5D0DA
comment: The supplemental video is available at https://youtu.be/5jrQtF5D0DA
An Experimental Study of Model-based Control for Planar Handed Shearing Auxetics Robots
Parallel robots based on Handed Shearing Auxetics (HSAs) can implement complex motions using standard electric motors while maintaining the complete softness of the structure, thanks to specifically designed architected metamaterials. However, their control is especially challenging due to varying and coupled stiffness, shearing, non-affine terms in the actuation model, and underactuation. In this paper, we present a model-based control strategy for planar HSA robots enabling regulation in task space. We formulate equations of motion, show that they admit a collocated form, and design a P-satI-D feedback controller with compensation for elastic and gravitational forces. We experimentally identify and verify the proposed control strategy in closed loop.
comment: 12 pages, 10 figures
Variational Distillation of Diffusion Policies into Mixture of Experts
This work introduces Variational Diffusion Distillation (VDD), a novel method that distills denoising diffusion policies into Mixtures of Experts (MoE) through variational inference. Diffusion Models are the current state-of-the-art in generative modeling due to their exceptional ability to accurately learn and represent complex, multi-modal distributions. This ability allows Diffusion Models to replicate the inherent diversity in human behavior, making them the preferred models in behavior learning such as Learning from Human Demonstrations (LfD). However, diffusion models come with some drawbacks, including the intractability of likelihoods and long inference times due to their iterative sampling process. The inference times, in particular, pose a significant challenge to real-time applications such as robot control. In contrast, MoEs effectively address the aforementioned issues while retaining the ability to represent complex distributions but are notoriously difficult to train. VDD is the first method that distills pre-trained diffusion models into MoE models, and hence, combines the expressiveness of Diffusion Models with the benefits of Mixture Models. Specifically, VDD leverages a decompositional upper bound of the variational objective that allows the training of each expert separately, resulting in a robust optimization scheme for MoEs. Across nine complex behavior learning tasks, VDD is shown to: i) accurately distill complex distributions learned by the diffusion model, ii) outperform existing state-of-the-art distillation methods, and iii) surpass conventional methods for training MoE.
comment: Accepted by the 38th Annual Conference on Neural Information Processing Systems,
Deep Radar Inverse Sensor Models for Dynamic Occupancy Grid Maps
To implement autonomous driving, one essential step is to model the vehicle environment based on the sensor inputs. Radars, with their well-known advantages, became a popular option to infer the occupancy state of grid cells surrounding the vehicle. To tackle data sparsity and noise of radar detections, we propose a deep learning-based Inverse Sensor Model (ISM) to learn the mapping from sparse radar detections to polar measurement grids. Improved lidar-based measurement grids are used as reference. The learned radar measurement grids, combined with radar Doppler velocity measurements, are further used to generate a Dynamic Grid Map (DGM). Experiments in real-world highway scenarios show that our approach outperforms the hand-crafted geometric ISMs. In comparison to state-of-the-art deep learning methods, our approach is the first one to learn a single-frame measurement grid in the polar scheme from radars with a limited Field Of View (FOV). The learning framework makes the learned ISM independent of the radar mounting. This enables us to flexibly use one or more radar sensors without network retraining and without requirements on 360° sensor coverage.
A Model for Multi-Agent Autonomy That Uses Opinion Dynamics and Multi-Objective Behavior Optimization ICRA
This paper reports a new hierarchical architecture for modeling autonomous multi-robot systems (MRSs): a nonlinear dynamical opinion process is used to model high-level group choice, and multi-objective behavior optimization is used to model individual decisions. Using previously reported theoretical results, we show it is possible to design the behavior of the MRS by the selection of a relatively small set of parameters. The resulting behavior - both collective actions and individual actions - can be understood intuitively. The approach is entirely decentralized and the communication cost scales with the number of group options, not agents. We demonstrated the effectiveness of this approach using a hypothetical 'explore-exploit-migrate' scenario in a two-hour field demonstration with eight unmanned surface vessels (USVs). The results from our preliminary field experiment show the collective behavior is robust even with time-varying network topology and agent dropouts.
comment: v1) 7 pages, 7 figures. v2) To appear at the 2024 IEEE International Conference on Robotics and Automation (ICRA) in Yokohama, Japan. v3) Fixed typos and added publication info
Adaptive bias for dissensus in nonlinear opinion dynamics with application to evolutionary division of labor games
This paper addresses the problem of adaptively controlling the bias parameter in nonlinear opinion dynamics (NOD) to allocate agents into groups of arbitrary sizes for the purpose of maximizing collective rewards. In previous work, an algorithm based on the coupling of NOD with a multi-objective behavior optimization was successfully deployed as part of a multi-robot system in an autonomous task allocation field experiment. Motivated by the field results, in this paper we propose and analyze a new task allocation model that synthesizes NOD with an evolutionary game framework. We prove sufficient conditions under which it is possible to control the opinion state in the group to a desired allocation of agents between two tasks through an adaptive bias using decentralized feedback. We then verify the theoretical results with a simulation study of a collaborative evolutionary division of labor game.
comment: v1) To appear at the 2024 IEEE Conference on Decision and Control (CDC) in Milan, Italy. 8 Pages, 5 Figures. v2) Fixed typo
Multiagent Systems
On the Benefits of Robot Platooning for Navigating Crowded Environments
This paper studies how groups of robots can effectively navigate through a crowd of agents. It quantifies the performance of platooning and less constrained, greedy strategies, and the extent to which these strategies disrupt the crowd agents. Three scenarios are considered: (i) passive crowds, (ii) counter-flow crowds, and (iii) perpendicular-flow crowds. Through simulations consisting of up to 200 robots, we show that for navigating passive and counter-flow crowds, the platooning strategy is less disruptive and more effective in dense crowds than the greedy strategy, whereas for navigating perpendicular-flow crowds, the greedy strategy outperforms the platooning strategy in either aspect. Moreover, we propose an adaptive strategy that can switch between platooning and greedy behavioral states, and demonstrate that it combines the strengths of both strategies in all the scenarios considered.
comment: 14 pages, 7 figures, to be published in DARS 2024
A Model Checker for Natural Strategic Ability
In the last two decades, Alternating-time Temporal Logic (ATL) has proved very useful in modeling strategic reasoning for Multi-Agent Systems (MAS). However, this logic struggles to capture the bounded rationality inherent in human decision-making processes. To overcome these limitations, Natural Alternating-time Temporal Logic (NatATL) has been recently introduced. As an extension of ATL, NatATL incorporates bounded memory constraints into agents' strategies, which allows it to reflect human cognitive limitations. In this paper, we present a model checker tool for NatATL specifications - both for memoryless strategies and strategies with recall - integrated into VITAMIN, an open-source model checker designed specifically for MAS verification. By embedding NatATL into VITAMIN, we transform theoretical advancements into a practical verification framework, enabling comprehensive analysis and validation of strategic reasoning in complex multi-agent environments. Our novel tool paves the way for applications in areas such as explainable AI and human-in-the-loop systems, highlighting NatATL's substantial potential.
A Survey of Multi-Agent Deep Reinforcement Learning with Communication
Communication is an effective mechanism for coordinating the behaviors of multiple agents, broadening their views of the environment, and supporting their collaboration. In the field of multi-agent deep reinforcement learning (MADRL), agents can improve the overall learning performance and achieve their objectives through communication. Agents can communicate various types of messages, either to all agents or to specific agent groups, or conditioned on specific constraints. Despite the growing body of research work in MADRL with communication (Comm-MADRL), there is a lack of a systematic and structural approach to distinguish and classify existing Comm-MADRL approaches. In this paper, we survey recent works in the Comm-MADRL field and consider various aspects of communication that can play a role in designing and developing multi-agent reinforcement learning systems. With these aspects in mind, we propose 9 dimensions along which Comm-MADRL approaches can be analyzed, developed, and compared. By projecting existing works into the multi-dimensional space, we discover interesting trends. We also propose some novel directions for designing future Comm-MADRL systems through exploring possible combinations of the dimensions.
comment: 34 pages, 5 figures, 13 tables; published in Autonomous Agents and Multi-Agent Systems
A Model for Multi-Agent Autonomy That Uses Opinion Dynamics and Multi-Objective Behavior Optimization ICRA
This paper reports a new hierarchical architecture for modeling autonomous multi-robot systems (MRSs): a nonlinear dynamical opinion process is used to model high-level group choice, and multi-objective behavior optimization is used to model individual decisions. Using previously reported theoretical results, we show it is possible to design the behavior of the MRS by the selection of a relatively small set of parameters. The resulting behavior - both collective actions and individual actions - can be understood intuitively. The approach is entirely decentralized and the communication cost scales with the number of group options, not agents. We demonstrated the effectiveness of this approach using a hypothetical 'explore-exploit-migrate' scenario in a two-hour field demonstration with eight unmanned surface vessels (USVs). The results from our preliminary field experiment show the collective behavior is robust even with time-varying network topology and agent dropouts.
comment: v1) 7 pages, 7 figures. v2) To appear at the 2024 IEEE International Conference on Robotics and Automation (ICRA) in Yokohama, Japan. v3) Fixed typos and added publication info
Adaptive bias for dissensus in nonlinear opinion dynamics with application to evolutionary division of labor games
This paper addresses the problem of adaptively controlling the bias parameter in nonlinear opinion dynamics (NOD) to allocate agents into groups of arbitrary sizes for the purpose of maximizing collective rewards. In previous work, an algorithm based on the coupling of NOD with a multi-objective behavior optimization was successfully deployed as part of a multi-robot system in an autonomous task allocation field experiment. Motivated by the field results, in this paper we propose and analyze a new task allocation model that synthesizes NOD with an evolutionary game framework. We prove sufficient conditions under which it is possible to control the opinion state in the group to a desired allocation of agents between two tasks through an adaptive bias using decentralized feedback. We then verify the theoretical results with a simulation study of a collaborative evolutionary division of labor game.
comment: v1) To appear at the 2024 IEEE Conference on Decision and Control (CDC) in Milan, Italy. 8 Pages, 5 Figures. v2) Fixed typo
Systems and Control (CS)
IoT-Based Water Quality Monitoring System in Philippine Off-Grid Communities
Contaminated and polluted water poses significant threats to human health, necessitating vigilant monitoring of water sources for potential contamination. This paper introduces a low-cost Internet of Things (IoT)-based water quality monitoring system designed to address water quality challenges in rural communities, as demonstrated through a case study conducted in the Philippines. The system consists of two core components. The hardware component of the system, built on Arduino technology and featuring real-time data transmission, focuses on monitoring pH levels, turbidity, and temperature via sensors. The system is equipped to transmit data to a cloud database and send informative messages to mobile numbers, updating users on the status of water supplies. The application component acts as a user interface for accessing and managing data collected by the sensors. The successful deployment of this Water Quality Monitoring (WQM) system not only helps community leaders and health workers monitor water sources but also underscores its potential to empower communities in safeguarding their water sources, thereby contributing to the advancement of clean and safe water access.
comment: Proceedings of the 2024 9th International Conference on Business and Industrial Research, May 2024, Bangkok, Thailand
Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion
Autonomous surgical robots have demonstrated significant potential to standardize surgical outcomes, driving innovations that enhance safety and consistency regardless of individual surgeon experience. Deep anterior lamellar keratoplasty (DALK), a partial thickness corneal transplant surgery aimed at replacing the anterior part of the cornea above Descemet's membrane (DM), would greatly benefit from an autonomous surgical approach, as it relies heavily on surgeon skill and has high perforation rates. In this study, we propose a novel autonomous surgical robotic system (AUTO-DALK) based on a customized neural network capable of precise needle control and consistent big bubble demarcation on cadaver and live rabbit models. We demonstrate the feasibility of an AI-based image-guided vertical drilling approach for big bubble generation, in contrast to the conventional horizontal needle approach. Our system integrates an optical coherence tomography (OCT) fiber optic distal sensor into the eye-mountable micro robotic system, which automatically segments OCT M-mode depth signals to identify corneal layers using a custom deep learning algorithm. It enables the robot to autonomously guide the needle to targeted tissue layers via a depth-controlled feedback loop. We compared autonomous needle insertion performance and resulting pneumo-dissection using AUTO-DALK against 1) freehand insertion, 2) OCT sensor guided manual insertion, and 3) teleoperated robotic insertion, reporting significant improvements in insertion depth, pneumo-dissection depth, task completion time, and big bubble formation. Ex vivo and in vivo results indicate that the AI-driven AUTO-DALK system is a promising solution to standardize pneumo-dissection outcomes for partial thickness keratoplasty.
Domain Adaptive Safety Filters via Deep Operator Learning
Learning-based approaches for constructing Control Barrier Functions (CBFs) are increasingly being explored for safety-critical control systems. However, these methods typically require complete retraining when applied to unseen environments, limiting their adaptability. To address this, we propose a self-supervised deep operator learning framework that learns the mapping from environmental parameters to the corresponding CBF, rather than learning the CBF directly. Our approach leverages the residual of a parametric Partial Differential Equation (PDE), where the solution defines a parametric CBF approximating the maximal control invariant set. This framework accommodates complex safety constraints, higher relative degrees, and actuation limits. We demonstrate the effectiveness of the method through numerical experiments on navigation tasks involving dynamic obstacles.
comment: 63rd IEEE Conference on Decision and Control (CDC)
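To make the notion of a CBF-based safety filter in the abstract above concrete, here is a minimal textbook sketch. It is not the operator-learning framework of the paper. It assumes a hypothetical 1D single integrator x' = u with the hand-written barrier h(x) = x_max - x, and the names `cbf_filter`, `alpha`, and `x_max` are illustrative choices. The condition h' + alpha * h >= 0 reduces to u <= alpha * (x_max - x), so the minimally invasive filter is a simple clamp.

```python
def cbf_filter(u_nom: float, x: float, x_max: float = 1.0, alpha: float = 2.0) -> float:
    """Closed-form safety filter for x' = u with barrier h(x) = x_max - x.

    The CBF condition h'(x) + alpha * h(x) >= 0 becomes u <= alpha * (x_max - x).
    Minimizing (u - u_nom)^2 under that single bound is a clamp from above.
    All parameter values are assumptions made for illustration.
    """
    return min(u_nom, alpha * (x_max - x))


def simulate(x0: float = 0.0, u_nom: float = 1.0, dt: float = 0.01, steps: int = 1000):
    """Forward-Euler rollout of the filtered system; returns the visited states."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        x = x + dt * cbf_filter(u_nom, x)
        xs.append(x)
    return xs
```

In this toy setting the nominal input pushes the state toward the boundary, and the filter slows it down so that the state approaches x_max without crossing it.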
Performance bounds for multi-vehicle networks with local integrators
In this work, we consider the problem of coordinating a collection of $n$th-order integrator systems. The coordination is achieved through the novel serial-consensus design, which can be seen as a method for achieving a stable closed-loop while only using local relative measurements. Earlier work has shown that second-order serial consensus can stabilize a collection of double integrators with scalable performance conditions, independent of the number of agents and topology. In this paper, we generalize these performance results to an arbitrary order $n\geq 1$. The derived performance bound depends on the condition number, measured in the vector-induced maximum matrix norm, of a general diagonalizing matrix. We provide an exact characterization of how a minimal condition number can be achieved. Third-order serial consensus is illustrated through a case study of PI-controlled vehicular formation, where the added integrators are used to mitigate the effect of unmeasured load disturbances. The theoretical results are illustrated through examples.
comment: (6 pages, 3 figures, Submitted to L-CSS and the 2025 American Control Conference)
Parametric Digital Twins for Preserving Historic Buildings: A Case Study at Löfstad Castle in Östergötland, Sweden
This study showcases the digitalization of Löfstad Castle in Sweden to contribute to preserving its heritage values. The castle and its collections are deteriorating due to an inappropriate indoor climate. To address this, thirteen cloud-connected sensor boxes, equipped with 84 sensors, were installed throughout the main building, from the basement to the attic, to continuously monitor various indoor environmental parameters. The extensive multi-parametric data collected form the basis for creating a parametric digital twin of the building. The digital twin and detailed data analytics offer a deeper understanding of indoor climate and guide the adoption of appropriate heating and ventilation strategies. The results revealed the need to address high humidity problems in the basement and on the ground floor, for example by installing vapor barriers. Opportunities for adopting energy-efficient heating and ventilation strategies on the upper floors were also highlighted. The digitalization solution and findings are not only applicable to Löfstad Castle but also provide valuable guidance for the conservation of other historic buildings facing similar challenges.
comment: This work has been submitted to the IEEE for possible publication
Elements of disinformation theory: cyber engagement via increasing adversary information consumption
We consider the case where an adversary is conducting a surveillance campaign against a networked control system (NCS), and take the perspective of a defender/control system operator who has successfully isolated the cyber intruder. To better understand the adversary's intentions and to drive up their operating costs, the defender directs the adversary towards a "honeypot" that emulates a real control system without actual connections to a physical plant. We propose a strategy for adversary engagement within the "honey" control system to increase the adversary's costs of information processing. We assume that, based on an understanding of the adversary's control theoretic goals, cyber threat intelligence (CTI) provides the defender knowledge of the adversary's preferences for information acquisition. We use this knowledge to spoof sensor readings to maximize the amount of information the adversary consumes while making it (information theoretically) difficult for the adversary to detect that they are being spoofed. We discuss the case of imperfect versus perfect threat intelligence and perform a numerical comparison.
comment: 8 pages, 5 figures, to appear in the Proceedings of the 2024 IEEE MILCOM Workshop on Threat Informed Defense Technologies
Wireless Human-Machine Collaboration in Industry 5.0
Wireless Human-Machine Collaboration (WHMC) represents a critical advancement for Industry 5.0, enabling seamless interaction between humans and machines across geographically distributed systems. As the WHMC systems become increasingly important for achieving complex collaborative control tasks, ensuring their stability is essential for practical deployment and long-term operation. Stability analysis certifies how the closed-loop system will behave under model randomness, which is essential for systems operating with wireless communications. However, the fundamental stability analysis of the WHMC systems remains an unexplored challenge due to the intricate interplay between the stochastic nature of wireless communications, dynamic human operations, and the inherent complexities of control system dynamics. This paper establishes a fundamental WHMC model incorporating dual wireless loops for machine and human control. Our framework accounts for practical factors such as short-packet transmissions, fading channels, and advanced HARQ schemes. We model human control lag as a Markov process, which is crucial for capturing the stochastic nature of human interactions. Building on this model, we propose a stochastic cycle-cost-based approach to derive a stability condition for the WHMC system, expressed in terms of wireless channel statistics, human dynamics, and control parameters. Our findings are validated through extensive numerical simulations and a proof-of-concept experiment, where we developed and tested a novel wireless collaborative cart-pole control system. The results confirm the effectiveness of our approach and provide a robust framework for future research on WHMC systems in more complex environments.
comment: Paper accepted by IEEE Transactions on Automatic Control
Robustness to Model Approximation, Learning, and Sample Complexity in Wasserstein Regular MDPs
We study the robustness property of discrete-time stochastic optimal control for Wasserstein model approximation under various performance criteria. Specifically, we study the performance loss when applying an optimal policy designed for an approximate model to the true dynamics compared with the optimal cost for the true model under the sup-norm-induced metric, and relate this to the Wasserstein-1 distance between the approximate and true transition kernel, under both discounted cost and average cost criteria. A primary motivation for this analysis is empirical model estimation, where Wasserstein convergence holds under mild conditions but stronger convergence criteria, such as total variation, may not. We discuss the application of the results to the disturbance estimation problem, where sample complexity bounds on mismatch loss are given. A further application regarding the continuity of invariant probability measures with respect to transition kernels is also discussed.
Deep Learning Based Solar Cell Recognition for Optical Wireless Power Transfer
Optical wireless power transfer (OWPT) is a technology that wirelessly transmits light energy from an optical transmitter to an optical receiver, usually a solar cell. To achieve the highest transmission efficiency, the solar cell receiver should be accurately aligned with the optical transmitter. Hitherto, only a few works have addressed solar cell recognition in the presence of complex backgrounds. In this paper, we employ a deep learning approach based on Yolov5-Lite for solar cell recognition, because it is lightweight, fast, and easy to deploy on hardware. Our tests show a high accuracy of the employed deep learning model, with the highest F1 score of 91% and mAP of 94.8%. Therefore, this deep learning model is highly promising for use in OWPT systems to precisely align optical transmitters and solar cell receivers.
comment: In Proceedings of The International Council on Electrical Engineering (ICEE) Conference 2024
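For readers unfamiliar with the F1 score reported in the abstract above, the following sketch shows the standard definition as the harmonic mean of precision and recall. The detection counts in the usage note are hypothetical and are not taken from the paper; the function names are illustrative.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With hypothetical counts of 90 true positives, 10 false positives, and 10 false negatives, `precision_recall(90, 10, 10)` gives a precision and recall of 0.9 each, and `f1_score` of those values is also 0.9.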
Enhancing In-vehicle Multiple Object Tracking Systems with Embeddable Ising Machines
Tracking multiple objects, a cognitive function needed in autonomous mobile vehicles, comprises object detection and temporal association. Machine learning has recently brought great progress in computing the similarity matrix between previously recognized objects and the objects detected in the current video frame, but less progress has been made on the assignment problem that finally determines the temporal association, which is a combinatorial optimization problem. Here we show an in-vehicle multiple object tracking system with a flexible assignment function for tracking through multiple long-term occlusion events. To solve the flexible assignment problem, formulated as a nondeterministic polynomial time-hard (NP-hard) problem, the system relies on an embeddable Ising machine based on a quantum-inspired algorithm called simulated bifurcation. Using a vehicle-mountable computing platform, we demonstrate real-time system-wide throughput (23 frames per second on average) with the enhanced functionality.
comment: 18 pages, 7 figures, 2 tables
Grid-Forming Control of Modular Dynamic Virtual Power Plants
This article explores a flexible and coordinated control design for an aggregation of heterogeneous distributed energy resources (DERs) in a dynamic virtual power plant (DVPP). The control design aims to provide a desired aggregate grid-forming (GFM) response based on the coordination of power contributions between different DERs. Compared to existing DVPP designs with an AC-coupled AC-output configuration, a more generic modular DVPP design is proposed in this article, which comprises four types of basic DVPP modules, involving AC- or DC-coupling and AC- or DC-output, adequately accommodating diverse DER integration setups, such as AC, DC, AC/DC hybrid microgrids and renewable power plants. The control design is first developed for the four basic modules by the aggregation of DERs and the disaggregation of the control objectives, and then extended to modular DVPPs through a systematic top-down approach. The control performance is comprehensively validated through simulation. The modular DVPP design offers scalable and standardizable advanced grid interfaces (AGIs) for building and operating AC/DC hybrid power grids.
Differential Predictive Control of Residential Building HVACs for Maximizing Renewable Local Consumption and Supporting Fast Voltage Control
High penetration of distributed energy resources in distribution systems, such as rooftop solar PVs, has caused voltage fluctuations that are much faster than typical voltage control devices can react to, leading to increased operation cost and reduced equipment life. Residential buildings consume about 35% of the electricity in the U.S. and are co-located with rooftop solar PV. Thus, they present an opportunity to mitigate these fluctuations locally, while benefiting both the grid and building owners. Previous works on DER-aware localized building energy management mostly focus on commercial buildings and analyze impacts either on buildings or on the grid. To fill these gaps, this paper proposes a distributed, differential predictive control scheme for residential HVAC systems for maximizing renewable local consumption. In addition, a detailed controller-building-grid co-simulation platform is developed and utilized for analyzing the potential impacts of the proposed control scheme on both the buildings and the distribution system. Our studies show that the proposed method can provide benefits to both the buildings' owners and the distribution system by reducing energy draw from the grid by 12%, voltage violations and fast fluctuations by 20%, and the number of tap changes in voltage regulators by 14%.
Frequency Control and Disturbance Containment Using Grid-Forming Embedded Storage Networks
The paper discusses fast frequency control in bulk power systems using embedded networks of grid-forming energy storage resources. Differing from their traditional roles of regulating reserves, the storage resources in this work operate as fast-acting grid assets shaping transient dynamics. The storage resources in the network are autonomously controlled using local measurements for distributed frequency support during disturbance events. Further, the grid-forming inverter systems interfacing with the storage resources are augmented with fast-acting safety controls designed to contain frequency transients within a prescribed tolerance band. The control action, derived from the storage network, improves the frequency nadirs in the system and prevents the severity of a disturbance from propagating far from the source. The paper also presents sensitivity studies to evaluate the impacts of storage capacity and inverter controller parameters on the dynamic performance of frequency control and disturbance localization. The performance of the safety-constrained grid-forming control is also compared with the more common grid-following control. The results are illustrated through case studies on an IEEE test system.
comment: accepted at the IEEE PES Electrical Energy Storage Applications and Technologies Conference (EESAT)
Coordinated Frequency Regulation in Grid-Forming Storage Network via Safety-Consensus
Inverter-based storages are poised to play a prominent role in future power grids with massive renewable generation. Grid-forming inverters (GFMs) are emerging as a dominant technology with synchronous generators (SG)-like characteristics through primary control loops. Advanced secondary control schemes, e.g., consensus algorithms, allow GFM-interfaced storage units to participate in frequency regulations and restore nominal frequency following grid disturbances. However, it is imperative to ensure transient frequency excursions do not violate critical safety limits while the grid transitions from pre- to post-disturbance operating point. This paper presents a hierarchical safety-enforced consensus method -- combining a device-layer (decentralized) transient safety filter with a secondary-layer (distributed) consensus coordination -- to achieve three distinct objectives: limiting transient frequency excursions to safe limits, minimizing frequency deviations from nominal, and ensuring coordinated power sharing among GFM-storage units. The proposed hierarchical (two-layered) safety-consensus technique is illustrated using a GFM-interfaced storage network on an IEEE 68-bus system under multiple grid transient scenarios.
comment: accepted for presentation at the IEEE Electrical Energy Storage Applications and Technologies Conference (EESAT)
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
On-device control agents, especially on mobile devices, are responsible for operating mobile devices to fulfill users' requests, enabling seamless and intuitive interactions. Integrating Multimodal Large Language Models (MLLMs) into these agents enhances their ability to understand and execute complex commands, thereby improving user experience. However, fine-tuning MLLMs for on-device control presents significant challenges due to limited data availability and inefficient online training processes. This paper introduces DistRL, a novel framework designed to enhance the efficiency of online RL fine-tuning for mobile device control agents. DistRL employs centralized training and decentralized data acquisition to ensure efficient fine-tuning in the context of dynamic online interactions. Additionally, the framework is backed by our tailor-made RL algorithm, which effectively balances exploration with the prioritized utilization of collected data to ensure stable and robust training. Our experiments show that, on average, DistRL delivers a 3X improvement in training efficiency and enables training data collection 2.4X faster than the leading synchronous multi-machine methods. Notably, after training, DistRL achieves a 20% relative improvement in success rate compared to state-of-the-art methods on general Android tasks from an open benchmark, significantly outperforming existing approaches while maintaining the same training time. These results validate DistRL as a scalable and efficient solution, offering substantial improvements in both training efficiency and agent performance for real-world, in-the-wild device control tasks.
comment: Paper and Appendix, 24 pages
On the Regret of Recursive Methods for Discrete-Time Adaptive Control with Matched Uncertainty
Continuous-time adaptive controllers for systems with a matched uncertainty often comprise an online parameter estimator and a corresponding parameterized controller to cancel the uncertainty. However, such methods are often impossible to implement directly, as they depend on an unobserved estimation error. We consider the equivalent discrete-time setting with a causal information structure, and propose a novel, online proximal point method-based adaptive controller, that under a sufficient excitation (SE) condition is asymptotically stable and achieves finite regret, scaling only with the time required to fulfill the SE. We show the same also for the widely-used recursive least squares with exponential forgetting controller under a stronger persistence of excitation condition.
comment: Accepted at the 63rd IEEE Conference on Decision and Control (CDC) 2024
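The abstract above mentions recursive least squares (RLS) with exponential forgetting. As a rough illustration of that standard estimator only (not the paper's controller or its regret analysis), the sketch below updates a parameter estimate for a linear regression model y = phi^T theta. The function name, the forgetting factor `lam = 0.98`, and the list-based matrix representation are assumptions chosen to keep the example dependency-free.

```python
def rls_forgetting_step(theta, P, phi, y, lam=0.98):
    """One RLS update with exponential forgetting factor lam in (0, 1].

    theta: current parameter estimate (list of n floats)
    P:     current symmetric n x n matrix (list of lists), scaled inverse information
    phi:   regressor vector (list of n floats)
    y:     scalar measurement, modeled as phi^T theta_true
    """
    n = len(theta)
    # P * phi; since P is symmetric, phi^T P is the transpose of this vector.
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    # Prediction error of the current estimate on the new sample.
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta_new = [theta[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain * phi^T P) / lam; dividing by lam discounts old data.
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new
```

A typical use is to start from `theta = [0.0] * n` and a large diagonal `P`, then call the step once per sample. With noise-free data and sufficiently varied regressors, the estimate approaches the true parameters.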
Augmented Intelligence in Smart Intersections: Local Digital Twins-Assisted Hybrid Autonomous Driving
Vehicle-road collaboration is a promising approach for enhancing the safety and efficiency of autonomous driving by extending the intelligence of onboard systems to smart roadside infrastructures. The introduction of digital twins (DTs), particularly local DTs (LDTs) at the edge, in smart mobility presents a new embodiment of augmented intelligence, which could enhance information exchange and extract human driving expertise to improve onboard intelligence. This paper presents a novel LDT-assisted hybrid autonomous driving system for improving safety and efficiency in traffic intersections. By leveraging roadside units (RSUs) equipped with sensory and computing capabilities, the proposed system continuously monitors traffic, extracts human driving knowledge, and generates intersection-specific local driving agents through an offline reinforcement learning (RL) framework. When connected and automated vehicles (CAVs) pass through RSU-equipped intersections, RSUs can provide local agents to support safe and efficient driving in local areas. Meanwhile, they provide real-time cooperative perception (CP) to broaden onboard sensory horizons. The proposed LDT-assisted hybrid system is implemented with state-of-the-art products, e.g., CAVs and RSUs, and technologies, e.g., millimeter-wave (mmWave) communications. Hardware-in-the-loop (HiL) simulations and proof-of-concept (PoC) tests validate system performance from two standpoints: (i) The peak latency for CP and local agent downloading are 8.51 ms and 146 ms, respectively, aligning with 3GPP requirements for vehicle-to-everything (V2X) and model transfer use cases. Moreover, (ii) local driving agents can improve safety measures by 10% and reduce travel time by 15% compared with conventional onboard systems. The implemented prototype also demonstrates reliable real-time performance, fulfilling the targets of the proposed system design.
comment: 14 pages, 9 figures
Distributed Optimization with Finite Bit Adaptive Quantization for Efficient Communication and Precision Enhancement
In realistic distributed optimization scenarios, individual nodes possess only partial information and communicate over bandwidth-constrained channels. For this reason, the development of efficient distributed algorithms is essential. In this paper, we address the challenge of unconstrained distributed optimization. In our scenario, each node's local function exhibits strong convexity with Lipschitz continuous gradients. The exchange of information between nodes occurs through $3$-bit bandwidth-limited channels (i.e., nodes exchange messages represented by only $3$ bits). Our proposed algorithm respects the network's bandwidth constraints by leveraging zoom-in and zoom-out operations to adjust quantizer parameters dynamically. We show that during our algorithm's operation nodes are able to converge to the exact optimal solution. Furthermore, we show that our algorithm achieves a linear convergence rate to the optimal solution. We conclude the paper with simulations that highlight our algorithm's unique characteristics.
comment: arXiv admin note: text overlap with arXiv:2309.04588
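The zoom-in and zoom-out idea in the abstract above can be illustrated with a generic adaptive uniform quantizer. This is a sketch under stated assumptions, not the paper's algorithm: it tracks a single fixed scalar rather than solving a distributed optimization problem, and the doubling/halving rule and all function names are hypothetical. Each round transmits one of 8 codes (3 bits); the receiver widens the range when the code saturates and narrows it around the decoded value otherwise.

```python
LEVELS = 8  # 2**3 codes, i.e. a 3-bit message per round


def quantize_3bit(value, center, half_range):
    """Return (code, saturated) for a uniform quantizer on [center - r, center + r]."""
    step = 2 * half_range / LEVELS
    raw = int((value - center + half_range) // step)
    code = min(max(raw, 0), LEVELS - 1)
    return code, raw != code


def dequantize_3bit(code, center, half_range):
    """Midpoint of the quantization cell selected by code."""
    step = 2 * half_range / LEVELS
    return center - half_range + (code + 0.5) * step


def track(value, center=0.0, half_range=1.0, rounds=20):
    """Refine an estimate of value using one 3-bit message per round."""
    for _ in range(rounds):
        code, saturated = quantize_3bit(value, center, half_range)
        if saturated:
            half_range *= 2.0  # zoom out: value lies outside the current range
        else:
            center = dequantize_3bit(code, center, half_range)
            half_range /= 2.0  # zoom in: value is within half_range / 8 of center
    return center, half_range
```

Zooming in is safe in this sketch because a non-saturated code places the value within one eighth of the old half-range of the new center, which is well inside the halved range. The estimate therefore sharpens geometrically even though every message carries only 3 bits.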
Context-aware Mamba-based Reinforcement Learning for social robot navigation
Social robot navigation (SRN) is a relevant problem that involves navigating a pedestrian-rich environment in a socially acceptable manner. It is an essential part of making social robots effective in pedestrian-rich settings. The use cases of such robots could vary from companion robots to warehouse robots to autonomous wheelchairs. In recent years, deep reinforcement learning has been increasingly used in research on social robot navigation. Our work introduces CAMRL (Context-Aware Mamba-based Reinforcement Learning). Mamba is a new deep learning-based State Space Model (SSM) that has achieved results comparable to transformers in sequencing tasks. CAMRL uses Mamba to determine the robot's next action, which maximizes the value of the next state predicted by the neural network, enabling the robot to navigate effectively based on the rewards assigned. We evaluate CAMRL alongside existing solutions (CADRL, LSTM-RL, SARL) using a rigorous testing dataset that involves a variety of densities and environment behaviors based on ORCA and SFM, demonstrating that CAMRL achieves higher success rates, minimizes collisions, and maintains safer distances from pedestrians. This work introduces a new SRN planner, showcasing the potential of deep state space models for robot navigation.
On the Solution Uniqueness of Data-Driven Modeling of Flexible Loads (with Supplementary Material)
This letter first explores the solution uniqueness of the data-driven modeling of price-responsive flexible loads (PFL). The PFL on the demand side is critical in modern power systems. An accurate PFL model is fundamental for system operations. However, whether the PFL model can be uniquely and correctly identified from operational data remains unclear. To address this, we analyze the structural and practical identifiability of the PFL model, deriving the dataset condition that guarantees the solution uniqueness. Besides, we point out the practical implications of the results. Numerical tests validate this work.
Simple controller design to achieve iso-damping robustness: Non-iterative data-driven approach based on fractional-order reference model
This study proposes a simple controller design approach to achieve a class of robustness, the so-called iso-damping property. The proposed approach can be executed using only one-shot input/output data. An accurate mathematical model of a controlled plant is not required. The model-reference control problem is defined to achieve the desired closed-loop specifications, including the iso-damping, and the reference model is designed on the basis of fractional-order calculus. The optimization problem for the model-reference control is formulated using the one-shot input/output data while considering the bounded-input bounded-output (BIBO) stability from a bounded reference input to a bounded output. The iso-damping robust controller is obtained by solving the optimization problem. The representative advantages of the proposed approach over the conventional methods are the simplicity, practicality, and reliability from the viewpoint of the unnecessity of the plant model and explicit consideration of the BIBO stability from a bounded reference input to a bounded output. Numerical examples demonstrate the validity of the proposed approach.
comment: This work has been submitted to the IEEE for possible publication
Efficient pseudometrics for data-driven comparisons of nonlinear dynamical systems
Computationally efficient solutions for pseudometrics quantifying deviation from topological conjugacy between dynamical systems are presented. Deviation from conjugacy is quantified in a Pareto optimal sense that accounts for spectral properties of Koopman operators as well as trajectory geometry. Theoretical justification is provided for computing such pseudometrics in Koopman eigenfunction space rather than observable space. Furthermore, it is shown that deriving the pseudometrics from unitary transformations is necessary to recover a value of zero if two systems are topologically conjugate. Therefore, the pseudometrics for quantifying deviation from conjugacy are based on analytical solutions for unitary transformations in Koopman eigenfunction space. Finally, geometric considerations for the deviation from conjugacy Pareto optimality problem are used to develop scalar pseudometrics that account for all possible solutions given just two Pareto points. The approach is demonstrated on two example problems: the first is a simple benchmarking problem and the second is an engineering example comparing the dynamics of morphological computation of biological nonlinear muscle actuators to simplified `man-made' or bio-inspired approaches. The benefits of considering operator-based and trajectory-geometry-based dissimilarity measures in a unified and consistent formalism were demonstrated. Overall, the deviation from conjugacy pseudometrics provide practical advantages in terms of efficiency and scalability, while maintaining theoretical consistency.
comment: Inclusion of results to go along with theory sections; more complete version of the paper
Adaptive bias for dissensus in nonlinear opinion dynamics with application to evolutionary division of labor games
This paper addresses the problem of adaptively controlling the bias parameter in nonlinear opinion dynamics (NOD) to allocate agents into groups of arbitrary sizes for the purpose of maximizing collective rewards. In previous work, an algorithm based on the coupling of NOD with a multi-objective behavior optimization was successfully deployed as part of a multi-robot system in an autonomous task allocation field experiment. Motivated by the field results, in this paper we propose and analyze a new task allocation model that synthesizes NOD with an evolutionary game framework. We prove sufficient conditions under which it is possible to control the opinion state in the group to a desired allocation of agents between two tasks through an adaptive bias using decentralized feedback. We then verify the theoretical results with a simulation study of a collaborative evolutionary division of labor game.
comment: v1) To appear at the 2024 IEEE Conference on Decision and Control (CDC) in Milan, Italy. 8 Pages, 5 Figures. v2) Fixed typo
Systems and Control (EESS)
IoT-Based Water Quality Monitoring System in Philippine Off-Grid Communities
Contaminated and polluted water poses significant threats to human health, necessitating vigilant monitoring of water sources for potential contamination. This paper introduces a low-cost Internet of Things (IoT)-based water quality monitoring system designed to address water quality challenges in rural communities, as demonstrated through a case study conducted in the Philippines. The system consists of two core components. The hardware component of the system, built on Arduino technology and featuring real-time data transmission, focuses on monitoring pH levels, turbidity, and temperature via sensors. The system is equipped to transmit data to a cloud database and send informative messages to mobile numbers, updating users on the status of water supplies. The application component acts as a user interface for accessing and managing data collected by the sensors. The successful deployment of this Water Quality Monitoring (WQM) system not only helps community leaders and health workers monitor water sources but also underscores its potential to empower communities in safeguarding their water sources, thereby contributing to the advancement of clean and safe water access.
comment: Proceedings of the 2024 9th International Conference on Business and Industrial Research, May 2024, Bangkok, Thailand
Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion
Autonomous surgical robots have demonstrated significant potential to standardize surgical outcomes, driving innovations that enhance safety and consistency regardless of individual surgeon experience. Deep anterior lamellar keratoplasty (DALK), a partial thickness corneal transplant surgery aimed at replacing the anterior part of the cornea above Descemet's membrane (DM), would greatly benefit from an autonomous surgical approach, as it relies heavily on surgeon skill and has high perforation rates. In this study, we propose a novel autonomous surgical robotic system (AUTO-DALK) based on a customized neural network capable of precise needle control and consistent big bubble demarcation on cadaver and live rabbit models. We demonstrate the feasibility of an AI-based image-guided vertical drilling approach for big bubble generation, in contrast to the conventional horizontal needle approach. Our system integrates an optical coherence tomography (OCT) fiber optic distal sensor into the eye-mountable micro robotic system, which automatically segments OCT M-mode depth signals to identify corneal layers using a custom deep learning algorithm. It enables the robot to autonomously guide the needle to targeted tissue layers via a depth-controlled feedback loop. We compared autonomous needle insertion performance and resulting pneumo-dissection using AUTO-DALK against 1) freehand insertion, 2) OCT sensor guided manual insertion, and 3) teleoperated robotic insertion, reporting significant improvements in insertion depth, pneumo-dissection depth, task completion time, and big bubble formation. Ex vivo and in vivo results indicate that the AI-driven AUTO-DALK system is a promising solution to standardize pneumo-dissection outcomes for partial thickness keratoplasty.
Domain Adaptive Safety Filters via Deep Operator Learning
Learning-based approaches for constructing Control Barrier Functions (CBFs) are increasingly being explored for safety-critical control systems. However, these methods typically require complete retraining when applied to unseen environments, limiting their adaptability. To address this, we propose a self-supervised deep operator learning framework that learns the mapping from environmental parameters to the corresponding CBF, rather than learning the CBF directly. Our approach leverages the residual of a parametric Partial Differential Equation (PDE), where the solution defines a parametric CBF approximating the maximal control invariant set. This framework accommodates complex safety constraints, higher relative degrees, and actuation limits. We demonstrate the effectiveness of the method through numerical experiments on navigation tasks involving dynamic obstacles.
comment: 63rd IEEE Conference on Decision and Control (CDC)
Performance bounds for multi-vehicle networks with local integrators
In this work, we consider the problem of coordinating a collection of $n$th-order integrator systems. The coordination is achieved through the novel serial-consensus design, which can be seen as a method for achieving a stable closed-loop while only using local relative measurements. Earlier work has shown that second-order serial consensus can stabilize a collection of double integrators with scalable performance conditions, independent of the number of agents and topology. In this paper, we generalize these performance results to an arbitrary order $n\geq 1$. The derived performance bound depends on the condition number, measured in the vector-induced maximum matrix norm, of a general diagonalizing matrix. We provide an exact characterization of how a minimal condition number can be achieved. Third-order serial consensus is illustrated through a case study of PI-controlled vehicular formation, where the added integrators are used to mitigate the effect of unmeasured load disturbances. The theoretical results are illustrated through examples.
comment: (6 pages, 3 figures, Submitted to L-CSS and the 2025 American Control Conference)
Parametric Digital Twins for Preserving Historic Buildings: A Case Study at Löfstad Castle in Östergötland, Sweden
This study showcases the digitalization of Löfstad Castle in Sweden to contribute to preserving its heritage values. The castle and its collections are deteriorating due to an inappropriate indoor climate. To address this, thirteen cloud-connected sensor boxes, equipped with 84 sensors, were installed throughout the main building, from the basement to the attic, to continuously monitor various indoor environmental parameters. The collected extensive multi-parametric data form the basis for creating a parametric digital twin of the building. The digital twin and detailed data analytics offer a deeper understanding of indoor climate and guide the adoption of appropriate heating and ventilation strategies. The results revealed the need to address high humidity problems in the basement and on the ground floor, such as installing vapor barriers. Opportunities for adopting energy-efficient heating and ventilation strategies on the upper floors were also highlighted. The digitalization solution and findings are not only applicable to Löfstad Castle but also provide valuable guidance for the conservation of other historic buildings facing similar challenges.
comment: This work has been submitted to the IEEE for possible publication
Elements of disinformation theory: cyber engagement via increasing adversary information consumption
We consider the case where an adversary is conducting a surveillance campaign against a networked control system (NCS), and take the perspective of a defender/control system operator who has successfully isolated the cyber intruder. To better understand the adversary's intentions and to drive up their operating costs, the defender directs the adversary towards a "honeypot" that emulates a real control system but has no actual connections to a physical plant. We propose a strategy for adversary engagement within the "honey" control system to increase the adversary's costs of information processing. We assume that, based on an understanding of the adversary's control theoretic goals, cyber threat intelligence (CTI) provides the defender knowledge of the adversary's preferences for information acquisition. We use this knowledge to spoof sensor readings to maximize the amount of information the adversary consumes while making it (information theoretically) difficult for the adversary to detect that they are being spoofed. We discuss the case of imperfect versus perfect threat intelligence and perform a numerical comparison.
comment: 8 pages, 5 figures, to appear in the Proceedings of the 2024 IEEE MILCOM Workshop on Threat Informed Defense Technologies
Wireless Human-Machine Collaboration in Industry 5.0
Wireless Human-Machine Collaboration (WHMC) represents a critical advancement for Industry 5.0, enabling seamless interaction between humans and machines across geographically distributed systems. As the WHMC systems become increasingly important for achieving complex collaborative control tasks, ensuring their stability is essential for practical deployment and long-term operation. Stability analysis certifies how the closed-loop system will behave under model randomness, which is essential for systems operating with wireless communications. However, the fundamental stability analysis of the WHMC systems remains an unexplored challenge due to the intricate interplay between the stochastic nature of wireless communications, dynamic human operations, and the inherent complexities of control system dynamics. This paper establishes a fundamental WHMC model incorporating dual wireless loops for machine and human control. Our framework accounts for practical factors such as short-packet transmissions, fading channels, and advanced HARQ schemes. We model human control lag as a Markov process, which is crucial for capturing the stochastic nature of human interactions. Building on this model, we propose a stochastic cycle-cost-based approach to derive a stability condition for the WHMC system, expressed in terms of wireless channel statistics, human dynamics, and control parameters. Our findings are validated through extensive numerical simulations and a proof-of-concept experiment, where we developed and tested a novel wireless collaborative cart-pole control system. The results confirm the effectiveness of our approach and provide a robust framework for future research on WHMC systems in more complex environments.
comment: Paper accepted by IEEE Transactions on Automatic Control
Robustness to Model Approximation, Learning, and Sample Complexity in Wasserstein Regular MDPs
We study the robustness property of discrete-time stochastic optimal control for Wasserstein model approximation under various performance criteria. Specifically, we study the performance loss when applying an optimal policy designed for an approximate model to the true dynamics compared with the optimal cost for the true model under the sup-norm-induced metric, and relate this to the Wasserstein-1 distance between the approximate and true transition kernels, under both discounted cost and average cost criteria. A primary motivation of this analysis is empirical model estimation, where Wasserstein convergence holds under mild conditions but stronger convergence criteria, such as total variation, may not. We discuss the application of the results to the disturbance estimation problem, where sample complexity bounds on mismatch loss are given. A further application regarding the continuity of invariant probability measures with respect to transition kernels is also discussed.
Deep Learning Based Solar Cell Recognition for Optical Wireless Power Transfer
Optical wireless power transfer (OWPT) is a technology that wirelessly transmits light energy from an optical transmitter to an optical receiver, usually a solar cell. In order to achieve the highest transmission efficiency, the solar cell receiver should be accurately aligned with the optical transmitter. Hitherto, only a few works have addressed solar cell recognition in the presence of complex backgrounds. In this paper, we employ a deep learning approach based on Yolov5-Lite for solar cell recognition, because it is lightweight, fast, and easy to deploy on hardware. Our tests show a high accuracy of the employed deep learning model, with a highest F1 score of 91% and mAP of 94.8%. Therefore, this deep learning model is highly promising for use in OWPT systems to precisely align optical transmitters and solar cell receivers.
comment: In Proceedings of The International Council on Electrical Engineering (ICEE) Conference 2024
Enhancing In-vehicle Multiple Object Tracking Systems with Embeddable Ising Machines
The cognitive function of tracking multiple objects, needed in autonomous mobile vehicles, comprises object detection and temporal association. While machine learning has recently brought great progress in constructing the similarity matrix between previously recognized objects and the objects detected in the current video frame, less progress has been made on the assignment problem that finally determines the temporal association, which is a combinatorial optimization problem. Here we show an in-vehicle multiple object tracking system with a flexible assignment function for tracking through multiple long-term occlusion events. To solve the flexible assignment problem formulated as a nondeterministic polynomial time-hard problem, the system relies on an embeddable Ising machine based on a quantum-inspired algorithm called simulated bifurcation. Using a vehicle-mountable computing platform, we demonstrate a realtime system-wide throughput (23 frames per second on average) with the enhanced functionality.
comment: 18 pages, 7 figures, 2 tables
Grid-Forming Control of Modular Dynamic Virtual Power Plants
This article explores a flexible and coordinated control design for an aggregation of heterogeneous distributed energy resources (DERs) in a dynamic virtual power plant (DVPP). The control design aims to provide a desired aggregate grid-forming (GFM) response based on the coordination of power contributions between different DERs. Compared to existing DVPP designs with an AC-coupled AC-output configuration, a more generic modular DVPP design is proposed in this article, which comprises four types of basic DVPP modules, involving AC- or DC-coupling and AC- or DC-output, adequately accommodating diverse DER integration setups, such as AC, DC, AC/DC hybrid microgrids and renewable power plants. The control design is first developed for the four basic modules by the aggregation of DERs and the disaggregation of the control objectives, and then extended to modular DVPPs through a systematic top-down approach. The control performance is comprehensively validated through simulation. The modular DVPP design offers scalable and standardizable advanced grid interfaces (AGIs) for building and operating AC/DC hybrid power grids.
Differential Predictive Control of Residential Building HVACs for Maximizing Renewable Local Consumption and Supporting Fast Voltage Control
High penetration of distributed energy resources in distribution systems, such as rooftop solar PVs, has caused voltage fluctuations which are much faster than typical voltage control devices can react to, leading to increased operation cost and reduced equipment life. Residential buildings consume about 35% of the electricity in the U.S. and are co-located with rooftop solar PV. Thus, they present an opportunity to mitigate these fluctuations locally, while benefiting both the grid and building owners. Previous works on DER-aware localized building energy management mostly focus on commercial buildings and analyze impacts either on buildings or on the grid. To fill the gaps, this paper proposes a distributed, differential predictive control scheme for residential HVAC systems for maximizing renewable local consumption. In addition, a detailed controller-building-grid co-simulation platform is developed and utilized for analyzing the potential impacts of the proposed control scheme on both the buildings and distribution system. Our studies show that the proposed method can provide benefits to both the buildings' owners and the distribution system by reducing energy draw from the grid by 12%, voltage violations and fast fluctuations by 20%, and the number of tap changes in voltage regulators by 14%.
Frequency Control and Disturbance Containment Using Grid-Forming Embedded Storage Networks
The paper discusses fast frequency control in bulk power systems using embedded networks of grid-forming energy storage resources. Differing from their traditional roles of regulating reserves, the storage resources in this work operate as fast-acting grid assets shaping transient dynamics. The storage resources in the network are autonomously controlled using local measurements for distributed frequency support during disturbance events. Further, the grid-forming inverter systems interfacing with the storage resources are augmented with fast-acting safety controls designed to contain frequency transients within a prescribed tolerance band. The control action, derived from the storage network, improves the frequency nadirs in the system and prevents the severity of a disturbance from propagating far from the source. The paper also presents sensitivity studies to evaluate the impacts of storage capacity and inverter controller parameters on the dynamic performance of frequency control and disturbance localization. The performance of the safety-constrained grid-forming control is also compared with the more common grid-following control. The results are illustrated through case studies on an IEEE test system.
comment: accepted at the IEEE PES Electrical Energy Storage Applications and Technologies Conference (EESAT)
Coordinated Frequency Regulation in Grid-Forming Storage Network via Safety-Consensus
Inverter-based storages are poised to play a prominent role in future power grids with massive renewable generation. Grid-forming inverters (GFMs) are emerging as a dominant technology with synchronous generators (SG)-like characteristics through primary control loops. Advanced secondary control schemes, e.g., consensus algorithms, allow GFM-interfaced storage units to participate in frequency regulations and restore nominal frequency following grid disturbances. However, it is imperative to ensure transient frequency excursions do not violate critical safety limits while the grid transitions from pre- to post-disturbance operating point. This paper presents a hierarchical safety-enforced consensus method -- combining a device-layer (decentralized) transient safety filter with a secondary-layer (distributed) consensus coordination -- to achieve three distinct objectives: limiting transient frequency excursions to safe limits, minimizing frequency deviations from nominal, and ensuring coordinated power sharing among GFM-storage units. The proposed hierarchical (two-layered) safety-consensus technique is illustrated using a GFM-interfaced storage network on an IEEE 68-bus system under multiple grid transient scenarios.
comment: accepted for presentation at the IEEE Electrical Energy Storage Applications and Technologies Conference (EESAT)
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
On-device control agents, especially on mobile devices, are responsible for operating mobile devices to fulfill users' requests, enabling seamless and intuitive interactions. Integrating Multimodal Large Language Models (MLLMs) into these agents enhances their ability to understand and execute complex commands, thereby improving user experience. However, fine-tuning MLLMs for on-device control presents significant challenges due to limited data availability and inefficient online training processes. This paper introduces DistRL, a novel framework designed to enhance the efficiency of online RL fine-tuning for mobile device control agents. DistRL employs centralized training and decentralized data acquisition to ensure efficient fine-tuning in the context of dynamic online interactions. Additionally, the framework is backed by our tailor-made RL algorithm, which effectively balances exploration with the prioritized utilization of collected data to ensure stable and robust training. Our experiments show that, on average, DistRL delivers a 3X improvement in training efficiency and enables training data collection 2.4X faster than the leading synchronous multi-machine methods. Notably, after training, DistRL achieves a 20% relative improvement in success rate compared to state-of-the-art methods on general Android tasks from an open benchmark, significantly outperforming existing approaches while maintaining the same training time. These results validate DistRL as a scalable and efficient solution, offering substantial improvements in both training efficiency and agent performance for real-world, in-the-wild device control tasks.
comment: Paper and Appendix, 24 pages
On the Regret of Recursive Methods for Discrete-Time Adaptive Control with Matched Uncertainty
Continuous-time adaptive controllers for systems with a matched uncertainty often comprise an online parameter estimator and a corresponding parameterized controller to cancel the uncertainty. However, such methods are often impossible to implement directly, as they depend on an unobserved estimation error. We consider the equivalent discrete-time setting with a causal information structure, and propose a novel, online proximal point method-based adaptive controller, that under a sufficient excitation (SE) condition is asymptotically stable and achieves finite regret, scaling only with the time required to fulfill the SE. We show the same also for the widely-used recursive least squares with exponential forgetting controller under a stronger persistence of excitation condition.
comment: Accepted at the 63rd IEEE Conference on Decision and Control (CDC) 2024
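The recursive least squares (RLS) estimator with exponential forgetting mentioned in the abstract above can be illustrated with the standard textbook update. The following is a minimal sketch, not the paper's controller or its regret analysis: the function name `rls_step`, the forgetting factor `lam=0.98`, and the linear regression model `y = phi^T theta` are all assumptions made for illustration.

```python
def rls_step(theta, P, phi, y, lam=0.98):
    """One textbook RLS update with exponential forgetting factor lam.

    theta: current parameter estimate (list of floats)
    P:     current symmetric covariance-like matrix (list of lists)
    phi:   regressor vector for this step
    y:     measured output, modeled as phi^T theta_true (assumed model)
    """
    n = len(theta)
    # P @ phi
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    # Scalar normalizer lam + phi^T P phi
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    # Prediction error with the current estimate
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta_new = [theta[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain * phi^T P) / lam; P is symmetric, so phi^T P = (P phi)^T
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new
```

With noiseless data and regressors that keep exciting every parameter direction, repeated calls drive `theta` toward the true parameters; dividing by `lam` discounts old data, which is why some form of persistent excitation is needed to keep `P` bounded.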
Augmented Intelligence in Smart Intersections: Local Digital Twins-Assisted Hybrid Autonomous Driving
Vehicle-road collaboration is a promising approach for enhancing the safety and efficiency of autonomous driving by extending the intelligence of onboard systems to smart roadside infrastructures. The introduction of digital twins (DTs), particularly local DTs (LDTs) at the edge, in smart mobility presents a new embodiment of augmented intelligence, which could enhance information exchange and extract human driving expertise to improve onboard intelligence. This paper presents a novel LDT-assisted hybrid autonomous driving system for improving safety and efficiency in traffic intersections. By leveraging roadside units (RSUs) equipped with sensory and computing capabilities, the proposed system continuously monitors traffic, extracts human driving knowledge, and generates intersection-specific local driving agents through an offline reinforcement learning (RL) framework. When connected and automated vehicles (CAVs) pass through RSU-equipped intersections, RSUs can provide local agents to support safe and efficient driving in local areas. Meanwhile, they provide real-time cooperative perception (CP) to broaden onboard sensory horizons. The proposed LDT-assisted hybrid system is implemented with state-of-the-art products, e.g., CAVs and RSUs, and technologies, e.g., millimeter-wave (mmWave) communications. Hardware-in-the-loop (HiL) simulations and proof-of-concept (PoC) tests validate system performance from two standpoints: (i) The peak latency for CP and local agent downloading are 8.51 ms and 146 ms, respectively, aligning with 3GPP requirements for vehicle-to-everything (V2X) and model transfer use cases. Moreover, (ii) local driving agents can improve safety measures by 10% and reduce travel time by 15% compared with conventional onboard systems. The implemented prototype also demonstrates reliable real-time performance, fulfilling the targets of the proposed system design.
comment: 14 pages, 9 figures
Distributed Optimization with Finite Bit Adaptive Quantization for Efficient Communication and Precision Enhancement
In realistic distributed optimization scenarios, individual nodes possess only partial information and communicate over bandwidth-constrained channels. For this reason, the development of efficient distributed algorithms is essential. In this paper, we address the challenge of unconstrained distributed optimization. In our scenario, each node's local function exhibits strong convexity with Lipschitz continuous gradients. The exchange of information between nodes occurs through $3$-bit bandwidth-limited channels (i.e., nodes exchange messages represented by only $3$ bits). Our proposed algorithm respects the network's bandwidth constraints by leveraging zoom-in and zoom-out operations to adjust quantizer parameters dynamically. We show that during our algorithm's operation nodes are able to converge to the exact optimal solution. Furthermore, we show that our algorithm achieves a linear convergence rate to the optimal solution. We conclude the paper with simulations that highlight our algorithm's unique characteristics.
comment: arXiv admin note: text overlap with arXiv:2309.04588
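To make the idea of a 3-bit quantizer with zoom-in and zoom-out operations concrete, here is a minimal sketch of a uniform quantizer whose range can be rescaled. It is not the algorithm from the paper: the function names `encode`, `decode`, and `zoom`, the uniform mid-point reconstruction, and the scaling factor of 2 are all hypothetical choices made for illustration.

```python
def encode(x, center, half_range, bits=3):
    """Map x to one of 2**bits cell indices covering [center - r, center + r]."""
    levels = 2 ** bits
    step = 2.0 * half_range / levels
    idx = int((x - (center - half_range)) // step)
    # Values outside the range saturate at the first or last cell
    return min(max(idx, 0), levels - 1)


def decode(idx, center, half_range, bits=3):
    """Reconstruct the mid-point of the cell identified by idx."""
    levels = 2 ** bits
    step = 2.0 * half_range / levels
    return center - half_range + (idx + 0.5) * step


def zoom(half_range, saturated, factor=2.0):
    """Zoom out (widen the range) on saturation, otherwise zoom in (refine)."""
    return half_range * factor if saturated else half_range / factor
```

Only the 3-bit index needs to be transmitted, provided sender and receiver update `center` and `half_range` by the same rule. Zooming in shrinks the cell width, and hence the quantization error, as the iterates settle.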
Context-aware Mamba-based Reinforcement Learning for social robot navigation
Social robot navigation (SRN) is a relevant problem that involves navigating a pedestrian-rich environment in a socially acceptable manner. It is an essential part of making social robots effective in pedestrian-rich settings. The use cases of such robots could vary from companion robots to warehouse robots to autonomous wheelchairs. In recent years, deep reinforcement learning has been increasingly used in research on social robot navigation. Our work introduces CAMRL (Context-Aware Mamba-based Reinforcement Learning). Mamba is a new deep learning-based State Space Model (SSM) that has achieved results comparable to transformers in sequencing tasks. CAMRL uses Mamba to determine the robot's next action, which maximizes the value of the next state predicted by the neural network, enabling the robot to navigate effectively based on the rewards assigned. We evaluate CAMRL alongside existing solutions (CADRL, LSTM-RL, SARL) using a rigorous testing dataset which involves a variety of densities and environment behaviors based on ORCA and SFM, demonstrating that CAMRL achieves higher success rates, minimizes collisions, and maintains safer distances from pedestrians. This work introduces a new SRN planner, showcasing the potential of deep state space models for robot navigation.
On the Solution Uniqueness of Data-Driven Modeling of Flexible Loads (with Supplementary Material)
This letter first explores the solution uniqueness of the data-driven modeling of price-responsive flexible loads (PFL). The PFL on the demand side is critical in modern power systems. An accurate PFL model is fundamental for system operations. However, whether the PFL model can be uniquely and correctly identified from operational data remains unclear. To address this, we analyze the structural and practical identifiability of the PFL model, deriving the dataset condition that guarantees the solution uniqueness. Besides, we point out the practical implications of the results. Numerical tests validate this work.
Simple controller design to achieve iso-damping robustness: Non-iterative data-driven approach based on fractional-order reference model
This study proposes a simple controller design approach to achieve a class of robustness, the so-called iso-damping property. The proposed approach can be executed using only one-shot input/output data. An accurate mathematical model of a controlled plant is not required. The model-reference control problem is defined to achieve the desired closed-loop specifications, including the iso-damping, and the reference model is designed on the basis of fractional-order calculus. The optimization problem for the model-reference control is formulated using the one-shot input/output data while considering the bounded-input bounded-output (BIBO) stability from a bounded reference input to a bounded output. The iso-damping robust controller is obtained by solving the optimization problem. The representative advantages of the proposed approach over the conventional methods are the simplicity, practicality, and reliability from the viewpoint of the unnecessity of the plant model and explicit consideration of the BIBO stability from a bounded reference input to a bounded output. Numerical examples demonstrate the validity of the proposed approach.
comment: This work has been submitted to the IEEE for possible publication
Efficient pseudometrics for data-driven comparisons of nonlinear dynamical systems
Computationally efficient solutions for pseudometrics quantifying deviation from topological conjugacy between dynamical systems are presented. Deviation from conjugacy is quantified in a Pareto optimal sense that accounts for spectral properties of Koopman operators as well as trajectory geometry. Theoretical justification is provided for computing such pseudometrics in Koopman eigenfunction space rather than observable space. Furthermore, it is shown that deriving the pseudometrics from unitary transformations is necessary to recover a value of zero if two systems are topologically conjugate. Therefore, the pseudometrics for quantifying deviation from conjugacy are based on analytical solutions for unitary transformations in Koopman eigenfunction space. Finally, geometric considerations for the deviation from conjugacy Pareto optimality problem are used to develop scalar pseudometrics that account for all possible solutions given just two Pareto points. The approach is demonstrated on two example problems: the first is a simple benchmarking problem, and the second is an engineering example comparing the dynamics of morphological computation of biological nonlinear muscle actuators to simplified `man-made' or bio-inspired approaches. The benefits of considering operator-based and trajectory geometry-based dissimilarity measures in a unified and consistent formalism are demonstrated. Overall, the deviation from conjugacy pseudometrics provide practical advantages in terms of efficiency and scalability, while maintaining theoretical consistency.
comment: Inclusion of results to go along with theory sections; more complete version of the paper
Adaptive bias for dissensus in nonlinear opinion dynamics with application to evolutionary division of labor games
This paper addresses the problem of adaptively controlling the bias parameter in nonlinear opinion dynamics (NOD) to allocate agents into groups of arbitrary sizes for the purpose of maximizing collective rewards. In previous work, an algorithm based on the coupling of NOD with a multi-objective behavior optimization was successfully deployed as part of a multi-robot system in an autonomous task allocation field experiment. Motivated by the field results, in this paper we propose and analyze a new task allocation model that synthesizes NOD with an evolutionary game framework. We prove sufficient conditions under which it is possible to control the opinion state in the group to a desired allocation of agents between two tasks through an adaptive bias using decentralized feedback. We then verify the theoretical results with a simulation study of a collaborative evolutionary division of labor game.
comment: v1) To appear at the 2024 IEEE Conference on Decision and Control (CDC) in Milan, Italy. 8 Pages, 5 Figures. v2) Fixed typo
Robotics
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
3D visual grounding is crucial for robots, requiring integration of natural language and 3D scene understanding. Traditional methods depending on supervised learning with 3D point clouds are limited by scarce datasets. Recently zero-shot methods leveraging LLMs have been proposed to address the data issue. While effective, these methods only use object-centric information, limiting their ability to handle complex queries. In this work, we present VLM-Grounder, a novel framework using vision-language models (VLMs) for zero-shot 3D visual grounding based solely on 2D images. VLM-Grounder dynamically stitches image sequences, employs a grounding and feedback scheme to find the target object, and uses a multi-view ensemble projection to accurately estimate 3D bounding boxes. Experiments on ScanRefer and Nr3D datasets show VLM-Grounder outperforms previous zero-shot methods, achieving 51.6% Acc@0.25 on ScanRefer and 48.0% Acc on Nr3D, without relying on 3D geometry or object priors. Codes are available at https://github.com/OpenRobotLab/VLM-Grounder .
comment: CoRL 2024 Camera Ready. 25 pages. A novel zero-shot 3D visual grounding framework based solely on 2D images
Differentiable Robot Rendering
Vision foundation models trained on massive amounts of visual data have shown unprecedented reasoning and planning skills in open-world settings. A key challenge in applying them to robotic tasks is the modality gap between visual data and action data. We introduce differentiable robot rendering, a method allowing the visual appearance of a robot body to be directly differentiable with respect to its control parameters. Our model integrates a kinematics-aware deformable model and Gaussian Splatting and is compatible with any robot form factor and degrees of freedom. We demonstrate its capability and usage in applications including reconstruction of robot poses from images and controlling robots through vision language models. Quantitative and qualitative results show that our differentiable rendering model provides effective gradients for robotic control directly from pixels, setting the foundation for future applications of vision foundation models in robotics.
comment: Project Page: https://drrobot.cs.columbia.edu/
Adaptive Subsampling and Learned Model Improve Spatiotemporal Resolution of Tactile Skin
High-speed tactile arrays are essential for real-time robotic control in unstructured environments, but high pixel counts limit readout rates of most large tactile arrays to below 100Hz. We introduce ACTS - adaptive compressive tactile subsampling - a method that efficiently samples tactile matrices and reconstructs interactions using sparse recovery and a learned tactile dictionary. Tested on a 1024-pixel sensor array (32x32), ACTS increased frame rates by 18X compared to raster scanning, with minimal error. For the first time in large-area tactile skin, we demonstrate rapid object classification within 20ms of contact, high-speed projectile detection, ricochet angle estimation, and deformation tracking through enhanced spatiotemporal resolution. Our method can be implemented in firmware, upgrading existing low-cost, flexible, and robust tactile arrays into high-resolution systems for large-area spatiotemporal touch sensing.
comment: 40 pages, 8 main figures, 12 supplemental figures, Videos can be accessed at https://tinyurl.com/TactileSubsampling
ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
Reward shaping is a critical component in reinforcement learning (RL), particularly for complex tasks where sparse rewards can hinder learning. While shaping rewards have been introduced to provide additional guidance, selecting effective shaping functions remains challenging and computationally expensive. This paper introduces Online Reward Selection and Policy Optimization (ORSO), a novel approach that frames shaping reward selection as an online model selection problem. ORSO employs principled exploration strategies to automatically identify promising shaping reward functions without human intervention, balancing exploration and exploitation with provable regret guarantees. We demonstrate ORSO's effectiveness across various continuous control tasks using the Isaac Gym simulator. Compared to traditional methods that fully evaluate each shaping reward function, ORSO significantly improves sample efficiency, reduces computational time, and consistently identifies high-quality reward functions that produce policies comparable to those generated by domain experts through hand-engineered rewards.
comment: preprint, 35 pages, 23 figures
Towards a Factor Graph-Based Method using Angular Rates for Full Magnetometer Calibration and Gyroscope Bias Estimation IROS
MEMS Attitude Heading Reference Systems are widely employed to determine a system's attitude, but sensor measurement biases limit their accuracy. This paper introduces a novel factor graph-based method called MAgnetometer and GYroscope Calibration (MAGYC). MAGYC leverages three-axis angular rate measurements from an angular rate gyroscope to enhance calibration for batch and online applications. Our approach imposes less restrictive conditions for instrument movements required for calibration, eliminates the need for knowledge of the local magnetic field or instrument attitude, and facilitates integration into factor graph algorithms within Smoothing and Mapping frameworks. We evaluate the proposed methods through numerical simulations and in-field experimental assessments using a sensor installed on an underwater vehicle. Ultimately, our proposed methods reduced the underwater vehicle's heading error standard deviation from 6.21 to 0.57 degrees for a standard seafloor mapping survey.
comment: 7 pages, 4 figures, submitted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation
Reinforcement learning (RL) often necessitates a meticulous Markov Decision Process (MDP) design tailored to each task. This work aims to address this challenge by proposing a systematic approach to behavior synthesis and control for multi-contact loco-manipulation tasks, such as navigating spring-loaded doors and manipulating heavy dishwashers. We define a task-independent MDP to train RL policies using only a single demonstration per task generated from a model-based trajectory optimizer. Our approach incorporates an adaptive phase dynamics formulation to robustly track the demonstrations while accommodating dynamic uncertainties and external disturbances. We compare our method against prior motion imitation RL works and show that the learned policies achieve higher success rates across all considered tasks. These policies learn recovery maneuvers that are not present in the demonstration, such as re-grasping objects during execution or dealing with slippages. Finally, we successfully transfer the policies to a real robot, demonstrating the practical viability of our approach.
comment: J. P. Sleiman and M. Mittal contributed equally. Accepted for CoRL 2024 (Oral). Project website: https://leggedrobotics.github.io/guided-rl-locoma/
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Large, general-purpose robotic policies trained on diverse demonstration datasets have been shown to be remarkably effective both for controlling a variety of robots in a range of different scenes, and for acquiring broad repertoires of manipulation skills. However, the data that such policies are trained on is generally of mixed quality -- not only are human-collected demonstrations unlikely to perform the task perfectly, but the larger the dataset is, the harder it is to curate only the highest quality examples. It also remains unclear how optimal data from one embodiment is for training on another embodiment. In this paper, we present a general and broadly applicable approach that enhances the performance of such generalist robot policies at deployment time by re-ranking their actions according to a value function learned via offline RL. This approach, which we call Value-Guided Policy Steering (V-GPS), is compatible with a wide range of different generalist policies, without needing to fine-tune or even access the weights of the policy. We show that the same value function can improve the performance of five different state-of-the-art policies with different architectures, even though they were trained on distinct datasets, attaining consistent performance improvement on multiple robotic platforms across a total of 12 tasks. Code and videos can be found at: https://nakamotoo.github.io/V-GPS
comment: Conference on Robot Learning (CoRL) 2024. Project Page: https://nakamotoo.github.io/V-GPS
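The deployment-time re-ranking described above can be sketched in a few lines. This is only an illustration under stated assumptions: `sample_action` stands for any black-box generalist policy that can be sampled, `q_value` for a value function learned offline, and the greedy argmax is one possible way to use the scores; none of these names come from the paper.

```python
def value_guided_action(sample_action, q_value, observation, num_samples=8):
    """Sample several candidate actions and return the one the value function ranks highest."""
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    # The policy is only queried for samples; its weights are never touched.
    candidates = [sample_action(observation) for _ in range(num_samples)]
    scores = [q_value(observation, action) for action in candidates]
    best = max(range(num_samples), key=scores.__getitem__)
    return candidates[best]
```

Because the policy is used only through sampling, the same value function can in principle be paired with different policies without fine-tuning them.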
CLIMB: Language-Guided Continual Learning for Task Planning with Iterative Model Building
Intelligent and reliable task planning is a core capability for generalized robotics, requiring a descriptive domain representation that sufficiently models all object and state information for the scene. We present CLIMB, a continual learning framework for robot task planning that leverages foundation models and execution feedback to guide domain model construction. CLIMB can build a model from a natural language description, learn non-obvious predicates while solving tasks, and store that information for future problems. We demonstrate the ability of CLIMB to improve performance in common planning environments compared to baseline methods. We also develop the BlocksWorld++ domain, a simulated environment with an easily usable real counterpart, together with a curriculum of tasks with progressing difficulty for evaluating continual learning. Additional details and demonstrations for this system can be found at https://plan-with-climb.github.io/ .
comment: 6 pages, 6 figures
Interacting humans and robots can improve sensory prediction by adapting their viscoelasticity
To manipulate objects or dance together, humans and robots exchange energy and haptic information. While the exchange of energy in human-robot interaction has been extensively investigated, the underlying exchange of haptic information is not well understood. Here, we develop a computational model of the mechanical and sensory interactions between agents that can tune their viscoelasticity while considering their sensory and motor noise. The resulting stochastic-optimal-information-and-effort (SOIE) controller predicts how the exchange of haptic information and the performance can be improved by adjusting viscoelasticity. This controller was first implemented on a robot-robot experiment with a tracking task which showed its superior performance when compared to either stiff or compliant control. Importantly, the optimal controller also predicts how connected humans alter their muscle activation to improve haptic communication, with differentiated viscoelasticity adjustment to their own sensing noise and haptic perturbations. A human-robot experiment then illustrated the applicability of this optimal control strategy for robots, yielding improved tracking performance and effective haptic communication as the robot adjusted its viscoelasticity according to its own and the user's noise characteristics. The proposed SOIE controller may thus be used to improve haptic communication and collaboration of humans and robots.
Jailbreaking LLM-Controlled Robots
The recent introduction of large language models (LLMs) has revolutionized the field of robotics by enabling contextual reasoning and intuitive human-robot interaction in domains as varied as manipulation, locomotion, and self-driving vehicles. When viewed as a stand-alone technology, LLMs are known to be vulnerable to jailbreaking attacks, wherein malicious prompters elicit harmful text by bypassing LLM safety guardrails. To assess the risks of deploying LLMs in robotics, in this paper, we introduce RoboPAIR, the first algorithm designed to jailbreak LLM-controlled robots. Unlike existing, textual attacks on LLM chatbots, RoboPAIR elicits harmful physical actions from LLM-controlled robots, a phenomenon we experimentally demonstrate in three scenarios: (i) a white-box setting, wherein the attacker has full access to the NVIDIA Dolphins self-driving LLM, (ii) a gray-box setting, wherein the attacker has partial access to a Clearpath Robotics Jackal UGV robot equipped with a GPT-4o planner, and (iii) a black-box setting, wherein the attacker has only query access to the GPT-3.5-integrated Unitree Robotics Go2 robot dog. In each scenario and across three new datasets of harmful robotic actions, we demonstrate that RoboPAIR, as well as several static baselines, finds jailbreaks quickly and effectively, often achieving 100% attack success rates. Our results reveal, for the first time, that the risks of jailbroken LLMs extend far beyond text generation, given the distinct possibility that jailbroken robots could cause physical damage in the real world. Indeed, our results on the Unitree Go2 represent the first successful jailbreak of a deployed commercial robotic system. Addressing this emerging vulnerability is critical for ensuring the safe deployment of LLMs in robotics. Additional media is available at: https://robopair.org
Automatic Navigation and Voice Cloning Technology Deployment on a Humanoid Robot
Mobile robots have shown immense potential and are expected to be widely used in the service industry. The importance of automatic navigation and voice cloning cannot be overstated as they enable functional robots to provide high-quality services. The objective of this work is to develop a control algorithm for the automatic navigation of a humanoid mobile robot called Cruzr, which is a service robot manufactured by Ubtech. Initially, a virtual environment is constructed in the simulation software Gazebo using Simultaneous Localization And Mapping (SLAM), and global path planning is carried out by means of local path tracking. The two-wheel differential chassis kinematics model is employed to ensure autonomous dynamic obstacle avoidance for the robot chassis. Furthermore, the mapping and trajectory generation algorithms developed in the simulation environment are successfully implemented on the real robot Cruzr. The performance of automatic navigation is compared between the Dynamic Window Approach (DWA) and Model Predictive Control (MPC) algorithms. Additionally, a mobile application for voice cloning is created based on a Hidden Markov Model, and the proposed Chatbot is also tested and deployed on Cruzr.
comment: 7 pages, 6 figures
Preference Aligned Diffusion Planner for Quadrupedal Locomotion Control
Diffusion models demonstrate superior performance in capturing complex distributions from large-scale datasets, providing a promising solution for quadrupedal locomotion control. However, offline policies are sensitive to Out-of-Distribution (OOD) states due to the limited state coverage in the datasets. In this work, we propose a two-stage learning framework combining offline learning and online preference alignment for legged locomotion control. Through the offline stage, the diffusion planner learns the joint distribution of state-action sequences from expert datasets without using reward labels. Subsequently, we perform online interaction in the simulation environment based on the trained offline planner, which significantly addresses the OOD issues and improves the robustness. Specifically, we propose a novel weak preference labeling method without the ground-truth reward or human preferences. The proposed method exhibits superior stability and velocity tracking accuracy in pacing, trotting, and bounding gaits under both slow- and high-speed scenarios and can perform zero-shot transfer to the real Unitree Go1 robots. The project website for this paper is at https://shangjaven.github.io/preference-aligned-diffusion-legged/.
SPF-EMPC Planner: A real-time multi-robot trajectory planner for complex environments with uncertainties
In practical applications, the unpredictable movement of obstacles and the imprecise state observation of robots introduce significant uncertainties for the swarm of robots, especially in cluster environments. However, existing methods struggle to achieve safe navigation when uncertainties, complex environmental structures, and robot swarms must all be considered. This paper introduces an extended state model predictive control planner with a safe probability field to address the multi-robot navigation problem in complex, dynamic, and uncertain environments. Initially, the safe probability field offers an innovative approach to model the uncertainty of external dynamic obstacles, combining it with an unconstrained optimization method to generate safe trajectories for multiple robots online. Subsequently, the extended state model predictive controller can accurately track these generated trajectories while considering the robots' inherent model constraints and state uncertainty, thus ensuring the practical feasibility of the planned trajectories. Simulation experiments show a success rate four times higher than that of state-of-the-art algorithms. Physical experiments demonstrate the method's ability to operate in real-time, enabling safe navigation for multiple robots in uncertain environments.
DualQuat-LOAM: LiDAR Odometry and Mapping parametrized on Dual Quaternions
This paper reports on a novel method for LiDAR odometry estimation, which completely parameterizes the system with dual quaternions. To accomplish this, the features derived from the point cloud, including edges, surfaces, and Stable Triangle Descriptor (STD), along with the optimization problem, are expressed in the dual quaternion set. This approach enables the direct combination of translation and orientation errors via dual quaternion operations, greatly enhancing pose estimation, as demonstrated in comparative experiments against other state-of-the-art methods. Our approach reduced drift error compared to other LiDAR-only-odometry methods, especially in scenarios with sharp curves and aggressive movements with large angular displacement. DualQuat-LOAM is benchmarked against several public datasets. On the KITTI dataset, it achieves translation and rotation errors of 0.79% and 0.0039°/m, respectively, with an average run time of 53 ms.
CERES: Critical-Event Reconstruction via Temporal Scene Graph Completion
This paper proposes a method for on-demand scenario generation in simulation, grounded on real-world data. Evaluating the behaviour of Autonomous Vehicles (AVs) in both safety-critical and regular scenarios is essential for assessing their robustness before real-world deployment. By integrating scenarios derived from real-world datasets into the simulation, we enhance the plausibility and validity of testing sets. This work introduces a novel approach that employs temporal scene graphs to capture evolving spatiotemporal relationships among scene entities from a real-world dataset, enabling the generation of dynamic scenarios in simulation through Graph Neural Networks (GNNs). User-defined action and criticality conditioning are used to ensure flexible, tailored scenario creation. Our model significantly outperforms the benchmarks in accurately predicting links corresponding to the requested scenarios. We further evaluate the validity and compatibility of our generated scenarios in an off-the-shelf simulator.
comment: 7 pages, 8 figures
State Estimation Transformers for Agile Legged Locomotion IROS 2024
We propose a state estimation method that can accurately predict the robot's privileged states to push the limits of quadruped robots in executing advanced skills such as jumping in the wild. In particular, we present the State Estimation Transformers (SET), an architecture that casts the state estimation problem as conditional sequence modeling. SET outputs the robot states that are hard to obtain directly in the real world, such as the body height and velocities, by leveraging a causally masked Transformer. By conditioning an autoregressive model on the robot's past states, our SET model can predict these privileged observations accurately even in highly dynamic locomotions. We evaluate our methods on three tasks -- running jumping, running backflipping, and running sideslipping -- on a low-cost quadruped robot, Cyberdog2. Results show that SET can outperform other methods in estimation accuracy and transferability in the simulation as well as success rates of jumping and triggering a recovery controller in the real world, suggesting the superiority of such a Transformer-based explicit state estimator in highly dynamic locomotion tasks.
comment: Accepted by IROS 2024
Novelty-based Sample Reuse for Continuous Robotics Control
In reinforcement learning, agents collect state information and rewards through environmental interactions, which are essential for policy refinement. This process is notably time-consuming, especially in complex robotic simulations and real-world applications. Traditional algorithms usually re-engage with the environment after processing a single batch of samples, thereby failing to fully capitalize on historical data. However, frequently observed states, with reliable value estimates, require minimal updates; in contrast, rarely observed states necessitate more intensive updates to achieve accurate value estimates. To address uneven sample utilization, we propose Novelty-guided Sample Reuse (NSR). NSR provides extra updates for infrequent, novel states and skips additional updates for frequent states, maximizing sample use before interacting with the environment again. Our experiments show that NSR improves the convergence rate and success rate of algorithms without significantly increasing time consumption. Our code is publicly available at https://github.com/ppksigs/NSR-DDPG-HER.
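A rough sketch of novelty-dependent sample reuse follows. The abstract does not say how novelty is measured, so this example assumes a simple visit count over discretized states; `NoveltyCounter`, `num_updates`, and all thresholds are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


class NoveltyCounter:
    """Counts visits to discretized states (an assumed, count-based novelty measure)."""

    def __init__(self, bin_width=0.1):
        self.bin_width = bin_width
        self.counts = Counter()

    def observe(self, state):
        """Record one visit to `state` and return its visit count so far."""
        key = tuple(int(math.floor(x / self.bin_width)) for x in state)
        self.counts[key] += 1
        return self.counts[key]


def num_updates(visit_count, max_extra=3, frequent_after=5):
    """One base update always; extra updates only while the state is still rare."""
    extra = max_extra if visit_count < frequent_after else 0
    return 1 + extra
```

In a training loop, the returned number would decide how many gradient steps to run on a sample before the agent interacts with the environment again.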
Interactive Navigation with Adaptive Non-prehensile Mobile Manipulation
This paper introduces a framework for interactive navigation through adaptive non-prehensile mobile manipulation. A key challenge in this process is handling objects with unknown dynamics, which are difficult to infer from visual observation. To address this, we propose an adaptive dynamics model for common movable indoor objects via learned SE(2) dynamics representations. This model is integrated into Model Predictive Path Integral (MPPI) control to guide the robot's interactions. Additionally, the learned dynamics help inform decision-making when navigating around objects that cannot be manipulated. Our approach is validated in both simulation and real-world scenarios, demonstrating its ability to accurately represent object dynamics and effectively manipulate various objects. We further highlight its success in the Navigation Among Movable Objects (NAMO) task by deploying the proposed framework on a dynamically balancing mobile robot, Shmoobot. Project website: https://cmushmoobot.github.io/AdaptivePushing/.
comment: 7 pages, 8 figures
RAMPA: Robotic Augmented Reality for Machine Programming and Automation
As robotics continues to enter various sectors beyond traditional industrial applications, the need for intuitive robot training and interaction systems becomes increasingly important. This paper introduces Robotic Augmented Reality for Machine Programming (RAMPA), a system that utilizes the capabilities of state-of-the-art and commercially available AR headsets, e.g., Meta Quest 3, to facilitate the application of Programming from Demonstration (PfD) approaches on industrial robotic arms, such as Universal Robots UR10. Our approach enables in-situ data recording, visualization, and fine-tuning of skill demonstrations directly within the user's physical environment. RAMPA addresses critical challenges of PfD, such as safety concerns, programming barriers, and the inefficiency of collecting demonstrations on the actual hardware. The performance of our system is evaluated against the traditional method of kinesthetic control in teaching three different robotic manipulation tasks and analyzed with quantitative metrics, measuring task performance and completion time, trajectory smoothness, system usability, user experience, and task load using standardized surveys. Our findings indicate a substantial advancement in how robotic tasks are taught and refined, promising improvements in operational safety, efficiency, and user engagement in robotic programming.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
BestMan: A Modular Mobile Manipulator Platform for Embodied AI with Unified Simulation-Hardware APIs
Embodied Artificial Intelligence (Embodied AI) emphasizes agents' ability to perceive, understand, and act in physical environments. Simulation platforms play a crucial role in advancing this field by enabling the validation and optimization of algorithms. However, existing platforms face challenges such as multilevel technical integration complexity, insufficient modularity, interface heterogeneity, and adaptation to diverse hardware. We present BestMan, a simulation platform based on PyBullet, designed to address these issues. BestMan introduces an integrated multilevel skill chain for seamless coordination across perception, planning, and control; a highly modular architecture for flexible algorithm integration; unified interfaces for smooth simulation-to-reality transfer; and a hardware-agnostic approach for adapting to various mobile manipulator configurations. These features collectively simplify development and enhance platform expandability, making BestMan a valuable tool for Embodied AI research.
Arc-Length-Based Warping for Robot Skill Synthesis from Multiple Demonstrations
In robotics, Learning from Demonstration (LfD) aims to transfer skills to robots by using multiple demonstrations of the same task. These demonstrations are recorded and processed to extract a consistent skill representation. This process typically requires temporal alignment through techniques such as Dynamic Time Warping (DTW). In this paper, we introduce a novel algorithm, named Spatial Sampling (SS), specifically designed for robot trajectories, that enables time-independent alignment of the trajectories by providing an arc-length parametrization of the signals. This approach eliminates the need for temporal alignment, enhancing the accuracy and robustness of skill representation. Specifically, we show that large time shifts in the demonstrated trajectories can introduce uncertainties in the synthesis of the final trajectory, which alignment in the arc-length domain can drastically reduce, in comparison with various state-of-the-art time-based signal alignment algorithms. To this end, we built a custom publicly available dataset of robot recordings to test real-world trajectories.
comment: 8 pages, 8 figures
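Arc-length parametrization can be illustrated with a small sketch that resamples a trajectory at equal distances along its path, so that two demonstrations of the same path with different timing map to the same samples. This is a generic piecewise-linear resampler written for illustration, not the paper's Spatial Sampling algorithm; the function name is hypothetical.

```python
import bisect
import math


def resample_by_arc_length(points, num_samples):
    """Return `num_samples` points spaced uniformly in arc length along a polyline."""
    if num_samples < 2 or len(points) < 2:
        raise ValueError("need at least two points and two samples")
    # Cumulative arc length, skipping repeated points (pauses in the demonstration).
    kept = [tuple(points[0])]
    arc = [0.0]
    for p in points[1:]:
        d = math.dist(kept[-1], p)
        if d > 0.0:
            kept.append(tuple(p))
            arc.append(arc[-1] + d)
    if len(kept) < 2:
        raise ValueError("trajectory has zero length")
    total = arc[-1]
    resampled = []
    for k in range(num_samples):
        s = total * k / (num_samples - 1)
        # Index of the segment end whose arc-length interval contains s.
        j = min(bisect.bisect_right(arc, s), len(arc) - 1)
        w = (s - arc[j - 1]) / (arc[j] - arc[j - 1])
        resampled.append(tuple(a + w * (b - a) for a, b in zip(kept[j - 1], kept[j])))
    return resampled
```

Since the samples depend only on geometry, pauses or speed changes in a demonstration do not shift them, which is the property that removes the need for temporal alignment.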
TRLO: An Efficient LiDAR Odometry with 3D Dynamic Object Tracking and Removal
Simultaneous state estimation and mapping is an essential capability for mobile robots working in dynamic urban environments. The majority of existing SLAM solutions rely heavily on a primarily static assumption. However, due to the presence of moving vehicles and pedestrians, this assumption does not always hold, leading to decreased localization accuracy and distorted maps. To address this challenge, we propose TRLO, a dynamic LiDAR odometry that efficiently improves the accuracy of state estimation and generates a cleaner point cloud map. To efficiently detect dynamic objects in the surrounding environment, a deep learning-based method is applied, generating detection bounding boxes. We then design a 3D multi-object tracker based on an Unscented Kalman Filter (UKF) and a nearest neighbor (NN) strategy to reliably identify and remove dynamic objects. Subsequently, a fast two-stage iterative nearest point solver is employed to solve the state estimation using the cleaned static point cloud. Note that a novel hash-based keyframe database management scheme is proposed for fast keyframe search. Furthermore, all the detected object bounding boxes are leveraged to impose a posture consistency constraint to further refine the final state estimation. Extensive evaluations and ablation studies conducted on the KITTI and UrbanLoco datasets demonstrate that our approach not only achieves more accurate state estimation but also generates cleaner maps, compared with baselines.
comment: 8 pages, 5 figures
Power in Numbers: Primitive Algorithm for Swarm Robot Navigation in Unknown Environments
Recently, the navigation of mobile robots in unknown environments has become a particularly significant research topic. Previous studies have primarily employed real-time environmental mapping using cameras and LiDAR, along with self-localization and path generation based on those maps. Additionally, there is research on Sim-to-Real transfer, where robots acquire behaviors through pre-trained reinforcement learning and apply these learned actions in real-world navigation. However, accurately and precisely observing and modeling unknown environments that change unpredictably over time is an extremely complex endeavor. This study proposes a simple navigation algorithm for traversing unknown environments that exploits the number of robots in a swarm. The proposed algorithm assumes that each robot has only the simple function of sensing the direction of the goal and the relative positions of the surrounding robots. The robots can navigate an unknown environment by simply continuing towards the goal while bypassing surrounding robots. The robots do not need to sense the environment, determine whether they or other robots are stuck, or perform complicated inter-robot communication. We mathematically validate the proposed navigation algorithm, present numerical simulations based on the potential field method, and conduct experimental demonstrations using developed robots based on the sound fields for navigation.
comment: 11 pages, 22 figures
ALOHA Unleashed: A Simple Recipe for Robot Dexterity
Recent work has shown promising results for learning end-to-end robot policies using imitation learning. In this work we address the question of how far we can push imitation learning for challenging dexterous manipulation tasks. We show that a simple recipe of large-scale data collection on the ALOHA 2 platform, combined with expressive models such as Diffusion Policies, can be effective in learning challenging bimanual manipulation tasks involving deformable objects and complex contact-rich dynamics. We demonstrate our recipe on 5 challenging real-world and 3 simulated tasks and show improved performance over state-of-the-art baselines. The project website and videos can be found at aloha-unleashed.github.io.
Just Add Force for Contact-Rich Robot Policies
Robot trajectories used for learning end-to-end robot policies typically contain end-effector and gripper position, workspace images, and language. Policies learned from such trajectories are unsuitable for delicate grasping, which requires tightly coupled and precise gripper force and gripper position. We collect and make publicly available 130 trajectories with force feedback of successful grasps on 30 unique objects. Our current-based method for sensing force, albeit noisy, is gripper-agnostic and requires no additional hardware. We train and evaluate two diffusion policies: one with the collected force feedback (forceful) and one without (position-only). We find that forceful policies are superior to position-only policies for delicate grasping and are able to generalize to unseen delicate objects, while reducing grasp policy latency by nearly 4x relative to LLM-based methods. With our promising results on limited data, we hope to signal to others to consider investing in collecting force and other such tactile information in new datasets, enabling more robust, contact-rich manipulation in future robot foundation models. Our data, code, models, and videos are viewable at https://justaddforce.github.io/.
Self Supervised Deep Learning for Robot Grasping
Learning-based robot grasping currently relies on labeled data. This approach has two major disadvantages. Firstly, labeling data for grasp points and angles is a strenuous process, so the dataset remains limited. Secondly, human labeling is prone to bias due to semantics. To solve these problems, we propose a simpler self-supervised robotic setup that will train a Convolutional Neural Network (CNN). The robot will label and collect the data during the training process. The idea is to make a robot that is less costly, small, and easily maintainable in a lab setup. The robot will be trained on a large dataset for several hundred hours, and then the trained neural network can be mapped onto a larger grasping robot.
Latent Weight Diffusion: Generating Policies from Trajectories
With the increasing availability of open-source robotic data, imitation learning has emerged as a viable approach for both robot manipulation and locomotion. Currently, large generalized policies are trained to predict controls or trajectories using diffusion models, which have the desirable property of learning multimodal action distributions. However, generalizability comes with a cost - namely, larger model size and slower inference. Further, there is a known trade-off between performance and action horizon for Diffusion Policy (i.e., diffusing trajectories): fewer diffusion queries accumulate greater trajectory tracking errors. Thus, it is common practice to run these models at high inference frequency, subject to robot computational constraints. To address these limitations, we propose Latent Weight Diffusion (LWD), a method that uses diffusion to learn a distribution over policies for robotic tasks, rather than over trajectories. Our approach encodes demonstration trajectories into a latent space and then decodes them into policies using a hypernetwork. We employ a diffusion denoising model within this latent space to learn its distribution. We demonstrate that LWD can reconstruct the behaviors of the original policies that generated the trajectory dataset. LWD offers the benefits of considerably smaller policy networks during inference and requires fewer diffusion model queries. When tested on the Metaworld MT10 benchmark, LWD achieves a higher success rate compared to a vanilla multi-task policy, while using models up to ~18x smaller during inference. Additionally, since LWD generates closed-loop policies, we show that it outperforms Diffusion Policy in long action horizon settings, with reduced diffusion queries during rollout.
Vision-Language-Action Model and Diffusion Policy Switching Enables Dexterous Control of an Anthropomorphic Hand
To advance autonomous dexterous manipulation, we propose a hybrid control method that combines the relative advantages of a fine-tuned Vision-Language-Action (VLA) model and diffusion models. The VLA model provides language-commanded high-level planning, which is highly generalizable, while the diffusion model handles low-level interactions, offering the precision and robustness required for specific objects and environments. By incorporating a switching signal into the training data, we enable event-based transitions between these two models for a pick-and-place task where the target object and placement location are commanded through language. This approach is deployed on our anthropomorphic ADAPT Hand 2, a 13DoF robotic hand, which incorporates compliance through series elastic actuation, providing resilience during interactions. This demonstrates the first use of a multi-fingered hand controlled with a VLA model. We demonstrate that this model-switching approach results in a success rate of over 80%, compared to under 40% when only using a VLA model, enabled by accurate near-object arm motion by the VLA model and a multi-modal grasping motion with error recovery abilities from the diffusion model.
Whisker-Inspired Tactile Sensing: A Sim2Real Approach for Precise Underwater Contact Tracking
Aquatic mammals, such as pinnipeds, utilize their whiskers to detect and discriminate objects and analyze water movements, inspiring the development of robotic whiskers for sensing contacts, surfaces, and water flows. We present the design and application of underwater whisker sensors based on Fiber Bragg Grating (FBG) technology. These passive whiskers are mounted along the robot's exterior to sense its surroundings through light, non-intrusive contacts. For contact tracking, we employ a sim-to-real learning framework, which involves extensive data collection in simulation followed by a sim-to-real calibration process to transfer the model trained in simulation to the real world. Experiments with whiskers immersed in water indicate that our approach can track contact points with an accuracy of $<2$ mm, without requiring precise robot proprioception. We demonstrate that the approach also generalizes to unseen objects.
RecoveryChaining: Learning Local Recovery Policies for Robust Manipulation
Model-based planners and controllers are commonly used to solve complex manipulation problems as they can efficiently optimize diverse objectives and generalize to long horizon tasks. However, they are limited by the fidelity of their model, which oftentimes leads to failures during deployment. To enable a robot to recover from such failures, we propose to use hierarchical reinforcement learning to learn a separate recovery policy. The recovery policy is triggered when a failure is detected based on sensory observations and seeks to take the robot to a state from which it can complete the task using the nominal model-based controllers. Our approach, called RecoveryChaining, uses a hybrid action space, where the model-based controllers are provided as additional nominal options, which allows the recovery policy to decide how to recover, when to switch to a nominal controller, and which controller to switch to, even with sparse rewards. We evaluate our approach in three multi-step manipulation tasks with sparse rewards, where it learns significantly more robust recovery policies than those learned by baselines. Finally, we successfully transfer recovery policies learned in simulation to a physical robot to demonstrate the feasibility of sim-to-real transfer with our method.
comment: 8 pages, 9 figures
MarineFormer: A Transformer-based Navigation Policy Model for Collision Avoidance in Marine Environment
In this work, we investigate the problem of Unmanned Surface Vehicle (USV) navigation in a dense marine environment with a high-intensity current flow. The complexities arising from static and dynamic obstacles and the disturbance forces caused by current flow render existing navigation protocols inadequate for ensuring safety and avoiding collisions at sea. To learn a safe and efficient robot policy, we propose a novel methodology that leverages attention mechanisms to capture heterogeneous interactions of the agents with the static and moving obstacles and the flow disturbances from the environment in space and time. In particular, we refine a temporal function with MarineFormer, a Transformer navigation policy for spatially variable Marine environment, trained end-to-end with reinforcement learning (RL). MarineFormer uses foundational spatio-temporal graph attention with transformer architecture to process spatial attention and temporal sequences in an environment that simulates a 2D turbulent marine condition. We propose architectural modifications that improve the stability and learning speed of the recurrent models. The flow velocity estimation, which can be derived from flow simulations or sensors, is incorporated into a model-free RL framework to prevent the robot from entering into high-intensity current flow regions including intense vortices, while potentially leveraging the flow to assist in transportation. The investigated 2D marine environment encompasses flow singularities, including vortices, sinks, and sources, representing fundamental planar flow patterns associated with flood or maritime thunderstorms. Our proposed method is trained with a new reward model to deal with static and dynamic obstacles and disturbances from the current flow.
Goal Inference from Open-Ended Dialog
We present an online method for embodied agents to learn and accomplish diverse user goals. Offline methods like RLHF can represent various goals but require large datasets; our approach achieves similar flexibility with online efficiency. We extract natural language goal representations from conversations with Large Language Models (LLMs). We prompt an LLM to role-play as a human with different goals and use the corresponding likelihoods to run Bayesian inference over potential goals. As a result, our method can represent uncertainty over complex goals based on unrestricted dialog. We evaluate our method in grocery shopping and home robot assistance domains using a text-based interface and AI2Thor simulation, respectively. Results show our method outperforms ablation baselines that lack either explicit goal representation or probabilistic inference.
comment: 6 pages + 2 page (references and appendix)
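The Bayesian inference over goals described in this abstract can be illustrated with a minimal sketch. It assumes that some model supplies a log-likelihood of the observed user utterance under each candidate goal; the goal names, the numbers, and the function `update_goal_posterior` are hypothetical and are not taken from the paper.

```python
import math


def update_goal_posterior(prior, log_likelihoods):
    """One Bayesian update over candidate goals: posterior ∝ prior × likelihood.

    prior: dict mapping goal -> probability (assumed strictly positive).
    log_likelihoods: dict mapping goal -> log p(utterance | goal).
    """
    # Combine in log space so very small likelihoods do not underflow.
    log_post = {g: math.log(prior[g]) + log_likelihoods[g] for g in prior}
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post.values())
    unnorm = {g: math.exp(v - peak) for g, v in log_post.items()}
    total = sum(unnorm.values())
    return {g: u / total for g, u in unnorm.items()}


# Hypothetical goals and likelihoods, for illustration only.
prior = {"vegetarian dinner": 0.5, "quick snack": 0.5}
loglik = {"vegetarian dinner": math.log(0.8), "quick snack": math.log(0.2)}
posterior = update_goal_posterior(prior, loglik)
```

Repeating the update after each dialog turn, with the previous posterior as the new prior, keeps a running estimate of uncertainty over goals.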
RAMPA: Robotic Augmented Reality for Machine Programming and Automation
As robotics continues to enter various sectors beyond traditional industrial applications, the need for intuitive robot training and interaction systems becomes increasingly important. This paper introduces Robotic Augmented Reality for Machine Programming (RAMPA), a system that utilizes the capabilities of state-of-the-art and commercially available AR headsets, e.g., Meta Quest 3, to facilitate the application of Programming from Demonstration (PfD) approaches on industrial robotic arms, such as Universal Robots UR10. Our approach enables in-situ data recording, visualization, and fine-tuning of skill demonstrations directly within the user's physical environment. RAMPA addresses critical challenges of PfD, such as safety concerns, programming barriers, and the inefficiency of collecting demonstrations on the actual hardware. The performance of our system is evaluated against the traditional method of kinesthetic control in teaching three different robotic manipulation tasks and analyzed with quantitative metrics, measuring task performance and completion time, trajectory smoothness, system usability, user experience, and task load using standardized surveys. Our findings indicate a substantial advancement in how robotic tasks are taught and refined, promising improvements in operational safety, efficiency, and user engagement in robotic programming.
comment: This work has been submitted to the IEEE for possible publication
A Data-driven Contact Estimation Method for Wheeled-Biped Robots
Contact estimation is a key ability for limbed robots, where making and breaking contacts has a direct impact on state estimation and balance control. Existing approaches typically rely on gait-cycle priors or designated contact sensors. We design a contact estimator that is suitable for the emerging wheeled-biped robot types that do not have these features. To this end, we propose a Bayes filter in which update steps are learned from real-robot torque measurements while prediction steps rely on inertial measurements. We evaluate this approach in extensive real-robot and simulation experiments. Our method achieves better performance while being considerably more sample efficient than a comparable deep-learning baseline.
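The predict/update structure of a Bayes filter over a binary contact state can be sketched as follows. This is a simplified illustration, not the paper's estimator: the fixed transition probabilities stand in for the inertial-measurement-based prediction, and the hand-set likelihoods stand in for the update model learned from torque measurements. All names and numbers are hypothetical.

```python
def predict(p_contact, p_stay=0.95, p_make=0.05):
    """Prediction step: push the contact belief through a two-state transition model.

    p_stay: assumed probability that an existing contact persists.
    p_make: assumed probability that a new contact is made.
    """
    return p_stay * p_contact + p_make * (1.0 - p_contact)


def update(p_contact, lik_contact, lik_no_contact):
    """Update step: Bayes' rule with the measurement likelihood under each state."""
    numerator = lik_contact * p_contact
    evidence = numerator + lik_no_contact * (1.0 - p_contact)
    return numerator / evidence


belief = 0.5                      # uninformed initial belief of being in contact
belief = predict(belief)          # symmetric transitions leave 0.5 unchanged
belief = update(belief, lik_contact=0.9, lik_no_contact=0.1)
```

Alternating the two steps at each control tick yields a running contact probability that can be thresholded for state estimation.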
Self-Supervised Learning For Robust Robotic Grasping In Dynamic Environment
Dynamic environments pose challenges for robotic grasping, including unpredictable object motion and interference with the grasp. In such conditions, traditional supervised and reinforcement learning approaches are ill-suited because they rely on large amounts of labelled data and a predefined reward signal. In this paper, we introduce a self-supervised learning (SSL) framework that uses RGB-D sensor data and proprioceptive data from robot hands to allow robots to learn and improve their grasping strategies in real time. The SSL framework overcomes the limitations of fixed labelling by adapting to changes in object behavior, improving performance in dynamic situations. The proposed method was tested in simulations and real-world trials, improving grasp success rates by 15% over existing methods, especially in dynamic scenarios. Tests of adaptation time also confirmed that the system adapts faster, making it applicable to real-world uses such as industrial automation and service robotics. In future work, the approach will be extended to more complex tasks, such as multi-object manipulation and operation in cluttered environments, to cover a broader range of robotic tasks.
comment: This work is submitted to IEEE journals and conferences and copyright may be transferred to IEEE
3D Guidance Law for Flexible Target Enclosing with Inherent Safety
In this paper, we address the problem of enclosing an arbitrarily moving target in three dimensions by a single pursuer while ensuring the pursuer's safety by preventing collisions with the target. The proposed guidance strategy steers the pursuer to a safe region of space surrounding and excluding the target, allowing it to maintain a certain distance from the latter while offering greater flexibility in positioning and converging to any orbit within this safe zone. We leverage the concept of the Lyapunov Barrier Function as a powerful tool to constrain the distance between the pursuer and the target within asymmetric bounds, thereby ensuring the pursuer's safety within the predefined region. Further, we demonstrate the effectiveness of the proposed guidance law in managing arbitrarily maneuvering targets and other uncertainties (such as vehicle/autopilot dynamics and external disturbances) by enabling the pursuer to consistently achieve stable global enclosing behaviors by switching between stable enclosing trajectories within the safe region whenever necessary, even in response to aggressive target maneuvers. To attest to the merits of our work, we conduct experimental tests with various plant models, including a high-fidelity quadrotor model within Software-in-the-loop (SITL) simulations, encompassing various challenging target maneuver scenarios and requiring only relative information for successful execution.
comment: Supplementary video at https://youtu.be/UU704o_966s
MASQ: Multi-Agent Reinforcement Learning for Single Quadruped Robot Locomotion
This paper proposes a novel method to improve locomotion learning for a single quadruped robot using multi-agent deep reinforcement learning (MARL). Many existing methods use single-agent reinforcement learning for an individual robot or MARL for the cooperative task in multi-robot systems. Unlike existing methods, this paper proposes using MARL for the locomotion learning of a single quadruped robot. We develop a learning structure called Multi-Agent Reinforcement Learning for Single Quadruped Robot Locomotion (MASQ), considering each leg as an agent to explore the action space of the quadruped robot, sharing a global critic, and learning collaboratively. Experimental results indicate that MASQ not only speeds up learning convergence but also enhances robustness in real-world settings, suggesting that applying MASQ to single robots such as quadrupeds could surpass traditional single-robot reinforcement learning approaches. Our study provides insightful guidance on integrating MARL with single-robot locomotion learning.
Motion Accuracy and Computational Effort in QP-based Robot Control
Quadratic Programs (QPs) have become a mature technology for the control of robots of all kinds, including humanoid robots. One aspect has been largely overlooked, however, which is the accuracy with which these QPs should be solved. QP solvers aim at providing solutions accurate up to floating point precision ($\approx10^{-8}$). Considering physical quantities expressed in SI or similar units (meters, radians, etc.), such precision seems completely unrelated to both task requirements and hardware capacity. Typically, humanoid robots never achieve, nor are capable of achieving sub-millimeter precision in manipulation tasks. With this observation in mind, our objectives in this paper are two-fold: first examine how the QP solution accuracy impacts the resulting robot motion accuracy, then evaluate how a reduced solution accuracy requirement can be leveraged to reduce the corresponding computational effort. Experiments with a dynamic simulation of RHPS-1 humanoid robot indicate that computational effort can be divided by more than 27 while maintaining the desired motion accuracy.
comment: Submitted to 2024 IEEE-RAS International Conference on Humanoid Robots (Humanoids)
t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving
Given the wide adoption of multimodal sensors (e.g., camera, lidar, radar) by autonomous vehicles (AVs), deep analytics to fuse their outputs for a robust perception become imperative. However, existing fusion methods often make two assumptions that rarely hold in practice: i) similar data distributions for all inputs and ii) constant availability for all sensors. Because, for example, lidars have various resolutions and failures of radars may occur, such variability often results in significant performance degradation in fusion. To this end, we present t-READi, an adaptive inference system that accommodates the variability of multimodal sensory data and thus enables robust and efficient perception. t-READi identifies variation-sensitive yet structure-specific model parameters; it then adapts only these parameters while keeping the rest intact. t-READi also leverages a cross-modality contrastive learning method to compensate for the loss from missing modalities. Both functions are implemented to maintain compatibility with existing multimodal deep fusion methods. Extensive experiments demonstrate that, compared with the status quo approaches, t-READi not only improves the average inference accuracy by more than 6% but also reduces the inference latency by almost 15x at the cost of only 5% extra memory overhead in the worst case under realistic data and modal variations.
comment: 14 pages, 16 figures
Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation
Enabling mobile robots to perform long-term tasks in dynamic real-world environments is a formidable challenge, especially when the environment changes frequently due to human-robot interactions or the robot's own actions. Traditional methods typically assume static scenes, which limits their applicability in the continuously changing real world. To overcome these limitations, we present DovSG, a novel mobile manipulation framework that leverages dynamic open-vocabulary 3D scene graphs and a language-guided task planning module for long-term task execution. DovSG takes RGB-D sequences as input and utilizes vision-language models (VLMs) for object detection to obtain high-level object semantic features. Based on the segmented objects, a structured 3D scene graph is generated for low-level spatial relationships. Furthermore, an efficient mechanism for locally updating the scene graph allows the robot to adjust parts of the graph dynamically during interactions without the need for full scene reconstruction. This mechanism is particularly valuable in dynamic environments, enabling the robot to continually adapt to scene changes and effectively support the execution of long-term tasks. We validated our system in real-world environments with varying degrees of manual modifications, demonstrating its effectiveness and superior performance in long-term tasks. Our project page is available at: https://BJHYZJ.github.io/DoviSG.
comment: 8 pages, 5 figures
Music to Dance as Language Translation using Sequence Models
Synthesising appropriate choreographies from music remains an open problem. We introduce MDLT, a novel approach that frames the choreography generation problem as a translation task. Our method leverages an existing data set to learn to translate sequences of audio into corresponding dance poses. We present two variants of MDLT: one utilising the Transformer architecture and the other employing the Mamba architecture. We train our method on AIST++ and PhantomDance data sets to teach a robotic arm to dance, but our method can be applied to a full humanoid robot. Evaluation metrics, including Average Joint Error and Fréchet Inception Distance, consistently demonstrate that, when given a piece of music, MDLT excels at producing realistic and high-quality choreography. The code can be found at github.com/meowatthemoon/MDLT.
CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics NeurIPS 2024
Recent years have seen significant advancements in humanoid control, largely due to the availability of large-scale motion capture data and the application of reinforcement learning methodologies. However, many real-world tasks, such as moving large and heavy furniture, require multi-character collaboration. Given the scarcity of data on multi-character collaboration and the efficiency challenges associated with multi-agent learning, these tasks cannot be straightforwardly addressed using training paradigms designed for single-agent scenarios. In this paper, we introduce Cooperative Human-Object Interaction (CooHOI), a novel framework that addresses multi-character object transport through a two-phase learning paradigm: individual skill acquisition and subsequent transfer. Initially, a single agent learns to perform tasks using the Adversarial Motion Priors (AMP) framework. Following this, the agent learns to collaborate with others by considering the shared dynamics of the manipulated object during parallel training using Multi-Agent Proximal Policy Optimization (MAPPO). When one agent interacts with the object, resulting in specific object dynamics changes, the other agents learn to respond appropriately, thereby achieving implicit communication and coordination between teammates. Unlike previous approaches that relied on tracking-based methods for multi-character HOI, CooHOI is inherently efficient, does not depend on motion capture data of multi-character interactions, and can be seamlessly extended to include more participants and a wide range of object types.
comment: Project website: https://gao-jiawei.com/Research/CooHOI/. NeurIPS 2024 Spotlight
Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity
We present an embodied AI system which receives open-ended natural language instructions from a human, and controls two arms to collaboratively accomplish potentially long-horizon tasks over a large workspace. Our system is modular: it deploys state-of-the-art Large Language Models for task planning, Vision-Language models for semantic perception, and Point Cloud transformers for grasping. With semantic and physical safety in mind, these modules are interfaced with a real-time trajectory optimizer and a compliant tracking controller to enable human-robot proximity. We demonstrate performance for the following tasks: bi-arm sorting, bottle opening, and trash disposal tasks. These are done zero-shot where the models used have not been trained with any real world data from this bi-arm robot, scenes or workspace. Composing both learning- and non-learning-based components in a modular fashion with interpretable inputs and outputs allows the user to easily debug points of failures and fragilities. One may also in-place swap modules to improve the robustness of the overall platform, for instance with imitation-learned policies. https://sites.google.com/corp/view/safe-robots
KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance
Online Imitation Learning struggles with the gap between extensive online exploration space and limited expert trajectories, hindering efficient exploration due to inaccurate reward estimation. Inspired by the findings from cognitive neuroscience, we hypothesize that an agent could estimate precise task-aware reward for efficient online exploration, through decomposing the target task into the objectives of "what to do" and the mechanisms of "how to do". In this work, we introduce the hybrid Key-state guided Online Imitation (KOI) learning method, which leverages the integration of semantic and motion key states as guidance for reward estimation. Initially, we utilize visual-language models to extract semantic key states from expert trajectory, indicating the objectives of "what to do". Within the intervals between semantic key states, optical flow is employed to capture motion key states to understand the mechanisms of "how to do". By integrating a thorough grasp of hybrid key states, we refine the trajectory-matching reward computation, accelerating online imitation learning with task-aware exploration. We evaluate not only the success rate of the tasks in the Meta-World and LIBERO environments, but also the trend of variance during online imitation learning, proving that our method is more sample efficient. We also conduct real-world robotic manipulation experiments to validate the efficacy of our method, demonstrating the practical applicability of our KOI method. Videos and code are available at https://gewu-lab.github.io/Keystate_Online_Imitation/.
comment: Accepted by CoRL 2024
Trust or Bust: Ensuring Trustworthiness in Autonomous Weapon Systems
The integration of Autonomous Weapon Systems (AWS) into military operations presents both significant opportunities and challenges. This paper explores the multifaceted nature of trust in AWS, emphasising the necessity of establishing reliable and transparent systems to mitigate risks associated with bias, operational failures, and accountability. Despite advancements in Artificial Intelligence (AI), the trustworthiness of these systems, especially in high-stakes military applications, remains a critical issue. Through a systematic review of existing literature, this research identifies gaps in the understanding of trust dynamics during the development and deployment phases of AWS. It advocates for a collaborative approach that includes technologists, ethicists, and military strategists to address these ongoing challenges. The findings underscore the importance of Human-Machine teaming and enhancing system intelligibility to ensure accountability and adherence to International Humanitarian Law. Ultimately, this paper aims to contribute to the ongoing discourse on the ethical implications of AWS and the imperative for trustworthy AI in defense contexts.
comment: Accepted as a workshop paper at MILCOM 2024, 8 pages
Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics
We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator's behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). Despite being trained only on language, we show that these Transformers excel at translating tokenised visual keypoint observations into action trajectories, performing on par or better than state-of-the-art imitation learning (diffusion policies) in the low-data regime on a suite of real-world, everyday tasks. Rather than operating in the language domain as is typical, KAT leverages text-based Transformers to operate in the vision and action domains to learn general patterns in demonstration data for highly efficient imitation learning, indicating promising new avenues for repurposing natural language models for embodied tasks. Videos are available at https://www.robot-learning.uk/keypoint-action-tokens.
comment: Published at Robotics: Science and Systems (RSS) 2024
Learning a Stable, Safe, Distributed Feedback Controller for a Heterogeneous Platoon of Autonomous Vehicles
Platooning of autonomous vehicles has the potential to increase safety and fuel efficiency on highways. The goal of platooning is to have each vehicle drive at a specified speed (set by the leader) while maintaining a safe distance from its neighbors. Many prior works have analyzed various controllers for platooning, most commonly linear feedback and distributed model predictive controllers. In this work, we introduce an algorithm for learning a stable, safe, distributed controller for a heterogeneous platoon. Our algorithm relies on recent developments in learning neural network stability certificates. We train a controller for autonomous platooning in simulation and evaluate its performance on hardware with a platoon of four F1Tenth vehicles. We then perform further analysis in simulation with a platoon of 100 vehicles. Experimental results demonstrate the practicality of the algorithm and the learned controller by comparing the performance of the neural network controller to linear feedback and distributed model predictive controllers.
comment: Accepted to the International Symposium of Robotics Research (ISRR) 2024
Open-Structure: Structural Benchmark Dataset for SLAM Algorithms
This paper presents Open-Structure, a novel benchmark dataset for evaluating visual odometry and SLAM methods. Compared to existing public datasets that primarily offer raw images, Open-Structure provides direct access to point and line measurements, correspondences, structural associations, and co-visibility factor graphs, which can be fed to various stages of SLAM pipelines to mitigate the impact of data preprocessing modules in ablation experiments. The dataset comprises two distinct types of sequences from the perspective of scenarios. The first type maintains reasonable observation and occlusion relationships, as these critical elements are extracted from public image-based sequences using our dataset generator. In contrast, the second type consists of carefully designed simulation sequences that enhance dataset diversity by introducing a wide range of trajectories and observations. Furthermore, a baseline is proposed using our dataset to evaluate widely used modules, including camera pose tracking, parametrization, and factor graph optimization, within SLAM systems. By evaluating these state-of-the-art algorithms across different scenarios, we discern each module's strengths and weaknesses in the context of camera tracking and optimization processes. The Open-Structure dataset and baseline system are openly accessible on the website https://open-structure.github.io, encouraging further research and development in the field of SLAM.
Multiagent Systems
Transformer Guided Coevolution: Improved Team Formation in Multiagent Adversarial Games
We consider the problem of team formation within multiagent adversarial games. We propose BERTeam, a novel algorithm that uses a transformer-based deep neural network with Masked Language Model training to select the best team of players from a trained population. We integrate this with coevolutionary deep reinforcement learning, which trains a diverse set of individual players to choose teams from. We test our algorithm in the multiagent adversarial game Marine Capture-The-Flag, and we find that BERTeam learns non-trivial team compositions that perform well against unseen opponents. For this game, we find that BERTeam outperforms MCAA, an algorithm that similarly optimizes team formation.
Rapid and Automated Alloy Design with Graph Neural Network-Powered LLM-Driven Multi-Agent Systems
A multi-agent AI model is used to automate the discovery of new metallic alloys, integrating multimodal data and external knowledge including insights from physics via atomistic simulations. Our multi-agent system features three key components: (a) a suite of LLMs responsible for tasks such as reasoning and planning, (b) a group of AI agents with distinct roles and expertise that dynamically collaborate, and (c) a newly developed graph neural network (GNN) model for rapid retrieval of key physical properties. A set of LLM-driven AI agents collaborate to automate the exploration of the vast design space of MPEAs, guided by predictions from the GNN. We focus on the NbMoTa family of body-centered cubic (bcc) alloys, modeled using an ML-based interatomic potential, and target two key properties: the Peierls barrier and solute/screw dislocation interaction energy. Our GNN model accurately predicts these atomic-scale properties, providing a faster alternative to costly brute-force calculations and reducing the computational burden on multi-agent systems for physics retrieval. This AI system revolutionizes materials discovery by reducing reliance on human expertise and overcoming the limitations of direct all-atom simulations. By synergizing the predictive power of GNNs with the dynamic collaboration of LLM-based agents, the system autonomously navigates vast alloy design spaces, identifying trends in atomic-scale material properties and predicting macro-scale mechanical strength, as demonstrated by several computational experiments. This approach accelerates the discovery of advanced alloys and holds promise for broader applications in other complex systems, marking a significant step forward in automated materials design.
MobA: A Two-Level Agent System for Efficient Mobile Task Automation
Current mobile assistants are limited by dependence on system APIs or struggle with complex user instructions and diverse interfaces due to restricted comprehension and decision-making abilities. To address these challenges, we propose MobA, a novel Mobile phone Agent powered by multimodal large language models that enhances comprehension and planning capabilities through a sophisticated two-level agent architecture. The high-level Global Agent (GA) is responsible for understanding user commands, tracking history memories, and planning tasks. The low-level Local Agent (LA) predicts detailed actions in the form of function calls, guided by sub-tasks and memory from the GA. Integrating a Reflection Module allows for efficient task completion and enables the system to handle previously unseen complex tasks. MobA demonstrates significant improvements in task execution efficiency and completion rate in real-life evaluations, underscoring the potential of MLLM-empowered mobile assistants.
comment: 27 pages, 6 figures, and 5 tables. We will release our source code in a few days
EFX Exists for Three Types of Agents
In this paper, we study the problem of finding an envy-free allocation of indivisible goods among multiple agents. EFX, which stands for envy-freeness up to any good, is a well-studied relaxation of the envy-free allocation problem and has been shown to exist for specific scenarios. For instance, EFX is known to exist when there are only three agents [Chaudhury et al, EC 2020], and for any number of agents when there are only two types of valuations [Mahara, Discret. Appl. Math 2023]. We show that EFX allocations exist for any number of agents when there are at most three types of additive valuations.
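The EFX condition itself is easy to state in code. Under additive valuations, an allocation is EFX if no agent envies another agent's bundle once any single good is removed from it. The checker below is a minimal sketch of that definition only (it verifies a given allocation and does not construct one); the function name `is_efx` and the example valuations are illustrative assumptions, and some formulations restrict the removed good to those the envious agent values positively.

```python
def is_efx(valuations, allocation):
    """Return True if `allocation` is envy-free up to any good (EFX).

    valuations[i][g]: additive value of good g to agent i.
    allocation[i]: list of goods held by agent i.
    """
    n = len(allocation)
    for i in range(n):
        own_value = sum(valuations[i][g] for g in allocation[i])
        for j in range(n):
            if i == j:
                continue
            other_value = sum(valuations[i][g] for g in allocation[j])
            # Agent i must not envy bundle j after removing ANY single good.
            for g in allocation[j]:
                if own_value < other_value - valuations[i][g]:
                    return False
    return True


# Two agents with identical (hypothetical) valuations over three goods.
valuations = [[5, 3, 1], [5, 3, 1]]
balanced = is_efx(valuations, [[0], [1, 2]])    # values 5 vs 4
lopsided = is_efx(valuations, [[0, 1], [2]])    # values 8 vs 1
```

In the balanced split, the second agent's envy disappears once good 0 is removed, so the allocation is EFX; in the lopsided split, the second agent still envies the first after either removal.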
Byzantine-Resilient Output Optimization of Multiagent via Self-Triggered Hybrid Detection Approach
How to achieve precise distributed optimization despite unknown attacks, especially Byzantine attacks, is one of the critical challenges for multiagent systems. This paper addresses distributed resilient optimization for linear heterogeneous multi-agent systems faced with adversarial threats. We establish a framework aimed at realizing resilient optimization for continuous-time systems by incorporating a novel self-triggered hybrid detection approach. The proposed hybrid detection approach is able to identify attacks on neighbors using both error thresholds and triggering intervals, thereby optimizing the balance between effective attack detection and the reduction of excessive communication triggers. Using an edge-based adaptive self-triggered approach, each agent can receive its neighbors' information and determine whether this information is valid. If any neighbor proves invalid, each normal agent will isolate that neighbor by disconnecting communication along that specific edge. Importantly, our adaptive algorithm guarantees the accuracy of the optimization solution even when an agent is isolated by its neighbors.
See Behind Walls in Real-time Using Aerial Drones and Augmented Reality
This work presents ARD2, a framework that enables real-time through-wall surveillance using two aerial drones and an augmented reality (AR) device. ARD2 consists of two main steps: target direction estimation and contour reconstruction. In the first stage, ARD2 leverages geometric relationships between the drones, the user, and the target to project the target's direction onto the user's AR display. In the second stage, images from the drones are synthesized to reconstruct the target's contour, allowing the user to visualize the target behind walls. Experimental results demonstrate the system's accuracy in both direction estimation and contour reconstruction.
comment: 6 pages
Computational Social Choice: Parameterized Complexity and Challenges
We survey two key problems in Computational Social Choice, Multi-Winner Determination and Hedonic Games, with a special focus on their parameterized complexity, and propose some research challenges in the field.
comment: Submitted to Computer Science Review
Enhancing LLM Trading Performance with Fact-Subjectivity Aware Reasoning
While many studies show that more advanced LLMs perform better on tasks such as math and coding, we notice that in cryptocurrency trading, stronger LLMs often perform worse than weaker LLMs. To study how this counter-intuitive phenomenon occurs, we examine the LLM reasoning processes behind trading decisions. We find that separating the reasoning process into factual and subjective components can lead to higher profits. Building on this insight, we introduce a multi-agent framework, FS-ReasoningAgent, which enables LLMs to recognize and learn from both factual and subjective reasoning. Extensive experiments demonstrate that this framework enhances LLM trading performance in cryptocurrency markets. Additionally, an ablation study reveals that relying on subjective news tends to generate higher returns in bull markets, whereas focusing on factual information yields better results in bear markets. Our code and data are available at https://anonymous.4open.science/r/FS-ReasoningAgent-B55F/.
3D Guidance Law for Flexible Target Enclosing with Inherent Safety
In this paper, we address the problem of enclosing an arbitrarily moving target in three dimensions by a single pursuer while ensuring the pursuer's safety by preventing collisions with the target. The proposed guidance strategy steers the pursuer to a safe region of space surrounding and excluding the target, allowing it to maintain a certain distance from the latter while offering greater flexibility in positioning and converging to any orbit within this safe zone. We leverage the concept of the Lyapunov Barrier Function as a powerful tool to constrain the distance between the pursuer and the target within asymmetric bounds, thereby ensuring the pursuer's safety within the predefined region. Further, we demonstrate the effectiveness of the proposed guidance law in managing arbitrarily maneuvering targets and other uncertainties (such as vehicle/autopilot dynamics and external disturbances) by enabling the pursuer to consistently achieve stable global enclosing behaviors by switching between stable enclosing trajectories within the safe region whenever necessary, even in response to aggressive target maneuvers. To attest to the merits of our work, we conduct experimental tests with various plant models, including a high-fidelity quadrotor model within Software-in-the-loop (SITL) simulations, encompassing various challenging target maneuver scenarios and requiring only relative information for successful execution.
comment: Supplementary video at https://youtu.be/UU704o_966s
Multi-Agent Target Assignment and Path Finding for Intelligent Warehouse: A Cooperative Multi-Agent Deep Reinforcement Learning Perspective
Multi-agent target assignment and path planning (TAPF) are two key problems in intelligent warehouses. However, most literature addresses only one of these two problems. In this study, we propose a method to simultaneously solve target assignment and path planning from the perspective of cooperative multi-agent deep reinforcement learning (RL). To the best of our knowledge, this is the first work to model the TAPF problem for intelligent warehouses as a cooperative multi-agent deep RL problem, and the first to simultaneously address TAPF based on multi-agent deep RL. Furthermore, previous literature rarely considers the physical dynamics of agents. In this study, the physical dynamics of the agents are considered. Experimental results show that our method performs well in various task settings, which means that the target assignment is solved reasonably well and the planned paths are nearly the shortest. Moreover, our method is more time-efficient than baselines.
Autonomous Agents for Collaborative Task under Information Asymmetry NeurIPS 2024
Large Language Model Multi-Agent Systems (LLM-MAS) have achieved great progress in solving complex tasks. Agents within such a system communicate to collaboratively solve tasks, under the premise of shared information. However, when agents' collaborations are leveraged to perform multi-person tasks, a new challenge arises due to information asymmetry, since each agent can only access the information of its human user. Previous MAS struggle to complete tasks under this condition. To address this, we propose a new MAS paradigm termed iAgents, which denotes Informative Multi-Agent Systems. In iAgents, the human social network is mirrored in the agent network, where agents proactively exchange human information necessary for task resolution, thereby overcoming information asymmetry. iAgents employs a novel agent reasoning mechanism, InfoNav, to navigate agents' communication toward effective information exchange. Together with InfoNav, iAgents organizes human information in a mixed memory to provide agents with accurate and comprehensive information for exchange. Additionally, we introduce InformativeBench, the first benchmark tailored for evaluating LLM agents' task-solving ability under information asymmetry. Experimental results show that iAgents can collaborate within a social network of 140 individuals and 588 relationships, autonomously communicate over 30 turns, and retrieve information from nearly 70,000 messages to complete tasks within 3 minutes.
comment: 32 pages, 12 figures, 6 tables, accepted by NeurIPS 2024, see detail at https://thinkwee.top/iagents
Aegis: An Advanced LLM-Based Multi-Agent for Intelligent Functional Safety Engineering
Functional safety is a critical aspect of automotive engineering, encompassing all phases of a vehicle's lifecycle, including design, development, production, operation, and decommissioning. This domain involves highly knowledge-intensive tasks. This paper introduces Aegis: An Advanced LLM-Based Multi-Agent for Intelligent Functional Safety Engineering. Aegis is specifically designed to support complex functional safety tasks within the automotive sector. It is tailored to perform Hazard Analysis and Risk Assessment (HARA), document Functional Safety Requirements (FSR), and plan test cases for Automatic Emergency Braking (AEB) systems. The most advanced version, Aegis-Max, leverages Retrieval-Augmented Generation (RAG) and reflective mechanisms to enhance its capability in managing complex, knowledge-intensive tasks. Additionally, targeted prompt refinement by professional functional safety practitioners can significantly optimize Aegis's performance in the functional safety domain. This paper demonstrates the potential of Aegis to improve the efficiency and effectiveness of functional safety processes in automotive engineering.
Learning a Stable, Safe, Distributed Feedback Controller for a Heterogeneous Platoon of Autonomous Vehicles
Platooning of autonomous vehicles has the potential to increase safety and fuel efficiency on highways. The goal of platooning is to have each vehicle drive at a specified speed (set by the leader) while maintaining a safe distance from its neighbors. Many prior works have analyzed various controllers for platooning, most commonly linear feedback and distributed model predictive controllers. In this work, we introduce an algorithm for learning a stable, safe, distributed controller for a heterogeneous platoon. Our algorithm relies on recent developments in learning neural network stability certificates. We train a controller for autonomous platooning in simulation and evaluate its performance on hardware with a platoon of four F1Tenth vehicles. We then perform further analysis in simulation with a platoon of 100 vehicles. Experimental results demonstrate the practicality of the algorithm and the learned controller by comparing the performance of the neural network controller to linear feedback and distributed model predictive controllers.
comment: Accepted to the International Symposium of Robotics Research (ISRR) 2024
Systems and Control (CS)
Adaptive Subsampling and Learned Model Improve Spatiotemporal Resolution of Tactile Skin
High-speed tactile arrays are essential for real-time robotic control in unstructured environments, but high pixel counts limit readout rates of most large tactile arrays to below 100 Hz. We introduce ACTS (adaptive compressive tactile subsampling), a method that efficiently samples tactile matrices and reconstructs interactions using sparse recovery and a learned tactile dictionary. Tested on a 1024-pixel sensor array (32x32), ACTS increased frame rates by 18X compared to raster scanning, with minimal error. For the first time in large-area tactile skin, we demonstrate rapid object classification within 20 ms of contact, high-speed projectile detection, ricochet angle estimation, and deformation tracking through enhanced spatiotemporal resolution. Our method can be implemented in firmware, upgrading existing low-cost, flexible, and robust tactile arrays into high-resolution systems for large-area spatiotemporal touch sensing.
comment: 40 pages, 8 main figures, 12 supplemental figures, Videos can be accessed at https://tinyurl.com/TactileSubsampling
Assessing the Optimistic Bias in the Natural Inflow Forecasts: A Call for Model Monitoring in Brazil
Hydroelectricity accounted for roughly 66% of the total generation in Brazil in 2023 and addressed most of the intermittency of wind and solar generation. Thus, one of the most important steps in the operation planning of this country is the forecast of the natural inflow energy (NIE) time series, an approximation of the energetic value of the water inflows. To manage water resources over time, the Brazilian system operator performs long-term forecasts for the NIE to assess the water values through long-term hydrothermal planning models, which are then used to define the short-term merit order in day-ahead scheduling. Therefore, monitoring optimistic bias in NIE forecasts is crucial to prevent an optimistic view of future system conditions and subsequent riskier storage policies. In this article, we investigate and showcase strong evidence of an optimistic bias in the official NIE forecasts, with predicted values consistently exceeding the observations over the past 12 years in the two main subsystems (Southeast and Northeast). Rolling window out-of-sample tests conducted with real data demonstrate that the official forecast model exhibits a statistically significant bias of 6%, 13%, 18%, and 23% for 1, 6, 12, and 24 steps ahead in the Southeast subsystem, and 19%, 57%, 80%, and 108% in the Northeast.
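A minimal sketch of how a relative forecast bias of the kind quoted above could be computed for a single horizon. It assumes bias is defined as the mean forecast error divided by the mean observation; the abstract does not state the exact definition, and the function name and numbers below are hypothetical.

```python
def relative_bias(forecasts, observations):
    """Mean forecast error as a fraction of the mean observation.

    Positive values mean the forecasts exceed the observations on average,
    i.e. an optimistic bias for inflow forecasts. This definition is an
    assumption made for illustration.
    """
    if len(forecasts) != len(observations) or not forecasts:
        raise ValueError("need two equally sized, non-empty series")
    mean_error = sum(f - o for f, o in zip(forecasts, observations)) / len(forecasts)
    mean_obs = sum(observations) / len(observations)
    return mean_error / mean_obs


# Made-up illustrative numbers, not data from the paper.
forecasts = [110.0, 105.0, 120.0, 115.0]
observations = [100.0, 100.0, 100.0, 100.0]
bias = relative_bias(forecasts, observations)
print(f"relative bias: {bias:.1%}")
```

In a rolling-window study, the same function would be applied separately to the forecast/observation pairs collected for each horizon (1, 6, 12, and 24 steps ahead).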
Linear-Threshold Network Models for Describing and Analyzing Brain Dynamics
Over the past two decades, an increasing array of control-theoretic methods have been used to study the brain as a complex dynamical system and better understand its structure-function relationship. This article provides an overview of one such family of methods, based on the linear-threshold rate (LTR) dynamics, which arises when modeling the spiking activity of neuronal populations and their impact on each other. LTR dynamics exhibit a wide range of behaviors based on network topologies and inputs, including mono- and multi-stability, limit cycles, and chaos, allowing them to be used to model many complex brain processes involving fast and slow inhibition, multiple time and spatial scales, different types of neural behavior, and higher-order interactions. Here we investigate how the versatility of LTR dynamics paired with concepts and tools from systems and control can provide a computational theory for explaining the dynamical mechanisms enabling different brain processes. Specifically, we illustrate stability and stabilization properties of LTR dynamics and how they are related to goal-driven selective attention, multistability and its relationship with declarative memory, and bifurcations and oscillations and their role in modeling seizure dynamics in epilepsy. We conclude with a discussion on additional properties of LTR dynamics and an outlook on other brain processes for which they might play a similar role.
comment: 62 pages, 16 Figures
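As a rough illustration of linear-threshold rate dynamics, the sketch below integrates a two-node network with forward Euler. It assumes the commonly used form tau * dx/dt = -x + max(0, W x + u); the abstract does not state the equation, and the weights, inputs, and function name are hypothetical.

```python
def simulate_ltr(W, u, x0, tau=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of tau * dx/dt = -x + max(0, W x + u).

    The model form is an assumption for illustration; W is a list of rows.
    """
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        # Net synaptic drive into each node before thresholding.
        drive = [sum(W[i][j] * x[j] for j in range(n)) + u[i] for i in range(n)]
        x = [x[i] + dt / tau * (-x[i] + max(0.0, drive[i])) for i in range(n)]
    return x


# Two mutually inhibiting nodes with equal constant input (hypothetical values).
W = [[0.0, -0.5], [-0.5, 0.0]]
u = [1.0, 1.0]
x_final = simulate_ltr(W, u, x0=[0.0, 0.0])
print(x_final)  # both rates settle near the equilibrium x = 1 - 0.5 * x, i.e. 2/3
```

With weak mutual inhibition like this the network has a single stable equilibrium; stronger inhibition or different inputs can produce the multistable or oscillatory regimes the abstract mentions.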
Real Eventual Exponential Positivity of Complex-valued Laplacians: Applications to Consensus in Multi-agent Systems
In this paper, we explore the property of eventual exponential positivity (EEP) in complex matrices. We show that this property holds for the real part of the matrix exponential for a certain class of complex matrices. Next, we present the relation between the spectral properties of the Laplacian matrix of an unsigned digraph with complex edge-weights and the property of real EEP. Finally, we show that the Laplacian flow system of a network is stable when the negated Laplacian admits real EEP. Numerical examples are presented to demonstrate the results.
Design of Unitless Normalized Measure of Nonlinearity for State Estimation
The paper deals with measures of nonlinearity. In state estimation, they are utilized i) to select a suitable state estimation algorithm by assessing the nonlinearity of a system model, ii) to adapt the estimation algorithm structure or parameters, or iii) to indicate the possible effect of strong nonlinearity that leads to estimate credibility loss. This paper summarizes the state of the art of nonlinearity measures, focusing on the mean-square-error-based measure of nonlinearity. Its weak point related to unit selection is illustrated, and based on this, requirements for a new measure of nonlinearity are formulated. A new nonlinearity measure that is both unitless and normalized is designed. Its properties are demonstrated using numerical tracking experiments.
comment: Submitted to FUSION 2024 conference
Methodologies for offshore wind power plants stability analysis
The development of larger Offshore Wind Power Plants (OWPPs) is moving towards multi-vendor setups, ultimately aiming to establish Energy hubs. These structures are characterized by installations from different vendors sharing the same connection or closely interconnected points. Control interactions among Wind Turbine (WT) converters and power systems have been detected, and this critical phenomenon can significantly impact the dynamic stability of the system. Various stability analysis methods have been proposed to analyze the interactions between OWPPs at the Point-of-Connection (PoC) and the power system. However, stability studies rarely consider the complex offshore transmission system behind the PoC. Generally, the overall OWPP is blamed for the instability. However, since it is a complex system, it is important to understand which part of the OWPP behind the PoC is causing the problem or is likely to become unstable under certain conditions. Therefore, this paper provides a detailed overview of the advantages and limitations of the current system screening indexes used to design the OWPP, and the stability analysis methods. Each method is discussed, and the appropriate methods, depending on OWPP structure, are evaluated and discussed. The analysis indicates that a combination of time domain and frequency domain methods is necessary for enhancing the definition of stability boundaries.
comment: 15 pages, 9 figures, 4 tables, journal article
Performance Analysis of a Photovoltaic System with Thermoelectric Generator and Phase Change Material; An Experimental Approach
This study explores the integration of thermoelectric generators (TEGs) and phase change materials (PCMs) to enhance the efficiency of photovoltaic (PV) panels in high-temperature conditions. An AP-PM-20 Polycrystalline PV panel, SP-1848-27145 Bismuth Telluride TEG, and paraffin wax PCM in an aluminum container were used. Four configurations were tested: standalone PV, PV-PCM, PV-TEG-PCM, and PV-PCM-TEG, under identical conditions from 10:30 AM to 6:00 PM at 25-minute intervals. Data on PV and TEG voltage, current, and solar irradiance were collected and analyzed. The results show significant performance improvements: the PV-PCM configuration boosted power output by 68.04%, while PV-PCM-TEG and PV-TEG-PCM configurations improved efficiency by 43.06% and 37.51%, respectively. Efficiency gains relative to the standalone PV system were 33.33% for PV-PCM, 25.76% for PV-PCM-TEG, and 21.21% for PV-TEG-PCM, demonstrating the effectiveness of PCMs and TEGs in enhancing PV performance.
comment: This work was presented at the African International Conference on Clean Energy and Energy Storage, 2024
Byzantine-Resilient Output Optimization of Multiagent via Self-Triggered Hybrid Detection Approach
How to achieve precise distributed optimization despite unknown attacks, especially Byzantine attacks, is one of the critical challenges for multiagent systems. This paper addresses distributed resilient optimization for linear heterogeneous multi-agent systems faced with adversarial threats. We establish a framework aimed at realizing resilient optimization for continuous-time systems by incorporating a novel self-triggered hybrid detection approach. The proposed hybrid detection approach is able to identify attacks on neighbors using both error thresholds and triggering intervals, thereby optimizing the balance between effective attack detection and the reduction of excessive communication triggers. Using an edge-based adaptive self-triggered approach, each agent can receive its neighbors' information and determine whether this information is valid. If any neighbor proves invalid, each normal agent will isolate that neighbor by disconnecting communication along that specific edge. Importantly, our adaptive algorithm guarantees the accuracy of the optimization solution even when an agent is isolated by its neighbors.
Cooperative Visual Convex Area Coverage using a Tessellation-free Strategy
The objective of this article is to develop a control strategy for the coverage of a convex region by a fleet of Mobile Aerial Agents (MAAs). Each MAA is equipped with a downward-facing camera that senses a convex portion of the area, while its flight altitude is constrained. Rather than relying on typical Voronoi-like tessellations of the area to be covered, a scheme focusing on the assignment to each MAA of certain parts of the mosaic of the current covered area is proposed. A gradient ascent algorithm is then employed to monotonically increase the area covered by the MAA fleet. Simulation studies are offered to illustrate the effectiveness of the proposed scheme.
comment: In proceedings of the 56th Conference on Decision and Control (CDC), 2017. 6 pages, 9 figures, code available at https://git.sr.ht/~sotirisp/uav-coverage. arXiv admin note: substantial text overlap with arXiv:1612.02067
Dynamic Input Mapping Inversion for Algebraic Loop-Free Control in Hydraulic Actuators
The application of nonlinear control schemes to electro-hydraulic actuators often requires several alterations in the design of the controllers during their implementation. This is to overcome the challenges that frequently arise from the inherent complexity of such control algorithms owing to model nonlinearities. Moreover, advanced control solutions for this type of system often introduce input algebraic loops and chatter, which considerably degrade the tracking performance. This study presents a nonlinear control architecture for hydraulic actuators that comprises low-complexity modules, based on well-established designs that facilitate robust high performance in tracking without introducing the aforementioned limitations. Specifically, the proposed solution consists of two variants of a position controller for the hydraulic cylinder and a dynamic input-mapping inversion module to avoid algebraic loops in the control input. The stability of the closed-loop system is analysed using arguments from Lyapunov theory for cascaded non-autonomous nonlinear systems. The effectiveness of the proposed solution is evaluated on a high-fidelity simulator of a wind turbine pitch system. Appropriate quantitative metrics are finally defined to evaluate the closed-loop system performance in comparison to state-of-the-art nonlinear design.
Railway LiDAR semantic segmentation based on intelligent semi-automated data annotation
Automated vehicles rely on an accurate and robust perception of the environment. Like automated cars, highly automated trains require environmental perception. Although there is a lot of research based on either camera or LiDAR sensors in the automotive domain, very few contributions for this task exist yet for automated trains. Additionally, no public dataset or described approach for 3D LiDAR semantic segmentation in the railway environment exists yet. Thus, we propose an approach for point-wise 3D semantic segmentation based on the 2DPass network architecture using scans and images jointly. In addition, we present a semi-automated intelligent data annotation approach, which we use to efficiently and accurately label the required dataset recorded on a railway track in Germany. To improve performance despite a still small number of labeled scans, we apply an active learning approach to intelligently select scans for the training dataset. Our contributions are threefold: We annotate rail data including camera and LiDAR data from the railway environment, transfer-label the raw LiDAR point clouds using an image segmentation network, and train a state-of-the-art 3D LiDAR semantic segmentation network efficiently leveraging active learning. The trained network achieves good segmentation results with a mean IoU of 71.48% over 9 classes.
comment: This article has been accepted for publication in the IEEE VTC Fall 2024
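For readers unfamiliar with the metric, the following is a small sketch of how a mean IoU over classes is typically computed from per-point labels. The convention of skipping classes absent from both prediction and ground truth is an assumption, and the labels below are made up, not taken from the paper's dataset.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average per-class intersection-over-union for point-wise labels.

    Classes that appear in neither y_true nor y_pred are skipped
    (one common convention, assumed here).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in labels or predictions")
    return sum(ious) / len(ious)


# Six hypothetical points with three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
miou = mean_iou(y_true, y_pred, num_classes=3)
print(f"mean IoU: {miou:.2%}")
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.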
Assessing the techno-economic benefits of LEMs for different grid topologies and prosumer shares
The shift towards decentralized and renewable energy sources has introduced significant challenges to traditional power systems, necessitating innovative market designs. Local energy markets present a viable solution for integrating distributed energy resources such as photovoltaic systems, electric vehicles, and heat pumps within various grid topologies. This study investigates the techno-economic benefits of local energy markets compared to conventional market designs, focusing on their impact on average energy prices and operational peak power, using a self-developed agent-based energy system simulation tool. Through comprehensive simulations across countryside, rural, suburban, and urban grid topologies with varying penetration levels of the distributed energy resources, totaling 400 simulation setups, we demonstrate that local energy markets can enhance economic efficiency and grid stability, with 99% of the scenarios boasting lower average energy prices and 80% lower operational peak power levels. Our findings suggest that local energy markets can play a role in the future energy system, especially in areas with high shares of PV and HP, provided that additional infrastructure, management costs, and bureaucratic complexity are kept to a minimum.
comment: 39 pages, 9 figures, 4 tables
A Critical Review of Proton Exchange Membrane Fuel Cells Matter Transports and Voltage Polarisation for Modelling
Technologies based on the use of hydrogen are promising for future energy requirements in a more sustainable world. Consequently, modelling fuel cells is crucial, for instance, to optimize their control to achieve excellent performance, to test new materials and configurations on a limited budget, or to consider their degradation for improved lifespan. To develop such models, a comprehensive study is required, encompassing both well-established and the latest governing laws on matter transport and voltage polarisation for Proton Exchange Membrane Fuel Cells (PEMFCs). Recent articles often rely on outdated or inappropriate equations, lacking clear explanations regarding their background. Indeed, inconsistent understanding of theoretical and experimental choices or model requirements hinders comprehension and contributes to the misuse of these equations. Additionally, specific research is needed to construct more accurate models. This study aims to offer a comprehensive understanding of the current state-of-the-art in PEMFC modeling. It clarifies the corresponding governing equations, their usage conditions, and assumptions, thus serving as a foundation for future developments. The presented laws and equations are applicable in most multi-dimensional, dynamic, and two-phase PEMFC models.
comment: Journal of The Electrochemical Society, 2024
Coordinated Dispatch of Energy Storage Systems in the Active Distribution Network: A Complementary Reinforcement Learning and Optimization Approach
The complexity and nonlinearity of the active distribution network (ADN), coupled with fast-changing renewable energy (RE), necessitate an advanced real-time and safe dispatch approach. This paper proposes a complementary reinforcement learning (RL) and optimization approach, namely SA2CO, to address the coordinated dispatch of the energy storage systems (ESSs) in the ADN. The proposed approach leverages RL's capability to make fast decisions and address model inaccuracies, while optimization methods ensure the ADN security. Furthermore, a hybrid data-driven and expert-experience auxiliary neural network is formulated as a rapid security assessment component in the SA2CO algorithm, enabling dynamic switching between RL and optimization methodologies. Simulation results demonstrate the proposed method's effectiveness and scalability in achieving real-time, safe, and economical dispatch of multiple ESSs in the ADN, surpassing the performance of the state-of-the-art RL and optimization methods.
Optimal Covariance Steering of Linear Stochastic Systems with Hybrid Transitions
This work addresses the problem of optimally steering the state covariance of a linear stochastic system from an initial value to a target value, subject to hybrid transitions. The nonlinear and discontinuous jump dynamics complicate the control design for hybrid systems. Under uncertainties, stochastic jump timing and state variations further intensify this challenge. This work aims to regulate the hybrid system's state trajectory to stay close to a nominal deterministic one, despite uncertainties and noise. We address this problem by directly controlling state covariances around a mean trajectory, and this problem is termed the Hybrid Covariance Steering (H-CS) problem. The jump dynamics are approximated to the first order by leveraging the Saltation Matrix. When the jump dynamics are nonsingular, we derive an analytical closed-form solution to the H-CS problem. For general jump dynamics with possible singularity and changes in the state dimensions, we reformulate the problem into a convex optimization over path distributions by leveraging Schrödinger's Bridge duality to the smooth covariance control problem. The covariance propagation at hybrid events is enforced as equality constraints to handle singularity issues. The proposed convex framework scales linearly with the number of jump events, ensuring efficient, optimal solutions. This work thus provides a computationally efficient solution to the general H-CS problem. Numerical experiments are conducted to validate the proposed method.
comment: 14 pages
Inverter Output Impedance Estimation in Power Networks: A Variable Direction Forgetting Recursive-Least-Square Algorithm Based Approach
As inverter-based loads and energy sources become increasingly prevalent, accurate line impedance estimation between inverters and the grid is essential for optimizing performance and enhancing control strategies. This paper presents a non-invasive estimation algorithm that avoids signal injection, based on the Variable Direction Forgetting Recursive Least Squares (VDF-RLS) method. The method uses measurement data that is local to the inverter. It proposes a specific method for determining rotational frequency for direct-quadrature (dq) coordinate frame in which data is collected, which ensures a simpler and more accurate estimation. This method is enabled by a secondary Phase Locked Loop (PLL) which appropriately attenuates the effects of variations in grid-voltage measurements. By isolating the variation-sensitive q-axis and relying solely on the less sensitive d-axis, the method further minimizes the impact of variations. The estimation method achieves rapid adaptation while ensuring stability in the absence of persistent excitation by selectively discarding outdated data during updates. Results demonstrate significant improvement (as large as 7 times) in estimation of line parameters, when compared to existing approaches such as constant forgetting RLS.
comment: 8 pages, 6 figures, 1 table, submitted for 2025 American Control Conference (ACC)
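For context, the sketch below shows the constant-forgetting recursive least squares update that the abstract names as a comparison baseline. It is not the paper's VDF-RLS algorithm, and the regressors, parameter values, and function name are hypothetical.

```python
import math


def rls_step(theta, P, phi, y, lam=0.98):
    """One recursive-least-squares update with constant forgetting factor lam.

    theta: parameter estimate, P: covariance-like matrix (list of rows),
    phi: regressor vector, y: scalar measurement modeled as phi^T theta.
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # P is symmetric, so phi^T P equals (P phi)^T.
    P = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P


# Noise-free synthetic data with two unknown parameters (made-up values).
true_theta = [0.5, 0.02]
theta = [0.0, 0.0]
P = [[1000.0, 0.0], [0.0, 1000.0]]
for k in range(200):
    phi = [math.sin(0.1 * k), math.cos(0.3 * k)]
    y = sum(p * t for p, t in zip(phi, true_theta))
    theta, P = rls_step(theta, P, phi, y)
print(theta)  # close to true_theta
```

With a constant forgetting factor, P can grow without bound when the regressors stop being informative, which is the situation directional forgetting schemes are designed to handle.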
Finite-volume method and observability analysis for core-shell enhanced single particle model for lithium iron phosphate batteries
The increasing adoption of Lithium Iron Phosphate (LFP) batteries in Electric Vehicles is driven by their affordability, abundant material supply, and safety advantages. However, challenges arise in controlling/estimating unmeasurable LFP states such as state of charge (SOC), due to its flat open circuit voltage, hysteresis, and path dependence dynamics during intercalation and de-intercalation processes. The Core Shell Average Enhanced Single Particle Model (CSa-ESPM) effectively captures the electrochemical dynamics and phase transition behavior of LFP batteries by means of Partial Differential-Algebraic Equations (PDAEs). These governing PDAEs, including a moving boundary Ordinary Differential Equation (ODE), require a fine-grained spatial grid for accurate and stable solutions when employing the Finite Difference Method (FDM). This, in turn, leads to a computationally expensive system intractable for the design of real-time battery management system algorithms. In this study, we demonstrate that the Finite Volume Method (FVM) effectively discretizes the CSa-ESPM and provides accurate solutions with fewer than 4 control volumes while ensuring mass conservation across multiple operational cycles. The resulting control-oriented reduced order FVM-based CSa-ESPM is experimentally validated using various C-rate load profiles and its observability is assessed through nonlinear observability analysis. Our results reveal that different current inputs and discrete equation numbers influence model observability, with non-observable regions identified where solid-phase concentration gradients are negligible.
comment: 6 pages, 4 figures
Resilience-Oriented DG Siting and Sizing Considering Energy Equity Constraint
Extreme weather events can cause widespread power outages and huge economic losses. Low-income customers are more vulnerable to power outages because they live in areas with poorly equipped distribution systems. However, existing approaches to improve grid resilience focus on the overall condition of the system and ignore the outage experiences of low-income customers, which leads to significant energy inequities in resilience. Therefore, this paper explores a new resilience-oriented planning method for distributed generator (DG) siting and sizing, by embedding an additional energy equity constraint (EEC). First, the expected load shedding index (ELSI) is defined as the ratio of the load shedding to the original load, which quantifies the resilience-oriented energy equity. Then, the DG siting and sizing problem is formulated as a two-stage stochastic programming with the EEC. The first stage determines the optimal sites and sizes of DG units under investment constraints and EECs, while the second stage optimizes expected costs of unserved load. A subsidiary variable is introduced to ensure the model's solvability. Finally, numerical studies are performed on the IEEE 33-bus and 123-bus systems to verify the effectiveness of the proposed DG planning model in achieving energy equity. Three observations are presented as future guidelines for resilience-oriented DG planning.
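A small sketch of the expected load shedding index as described above, the ratio of load shedding to the original load. Taking the expectation as a probability-weighted sum over outage scenarios is an assumption, and the function name and numbers are hypothetical.

```python
def expected_load_shedding_index(shed_by_scenario, probabilities, original_load):
    """Expected load shedding divided by the original load.

    The probability-weighted sum over scenarios is an assumed form of the
    expectation; the abstract only gives the ratio definition.
    """
    if len(shed_by_scenario) != len(probabilities):
        raise ValueError("one probability per scenario is required")
    if original_load <= 0:
        raise ValueError("original load must be positive")
    expected_shed = sum(p * s for p, s in zip(probabilities, shed_by_scenario))
    return expected_shed / original_load


# Three hypothetical outage scenarios for one customer group (kW).
elsi = expected_load_shedding_index(
    shed_by_scenario=[0.0, 20.0, 50.0],
    probabilities=[0.7, 0.2, 0.1],
    original_load=100.0,
)
print(f"ELSI: {elsi:.2f}")
```

Computing the index separately for low-income and other customer groups gives the quantities an equity constraint could compare.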
A Physics-Based Context-Aware Approach for Anomaly Detection in Teleoperated Driving Operations Under False Data Injection Attacks
Teleoperated driving (ToD) systems are a special type of cyber-physical system (CPS) where the operator remotely controls the steering, acceleration, and braking actions of the vehicle. Malicious actors may inject false data into communication channels to manipulate the teleoperator's driving commands to cause harm. Hence, protection of this communication is necessary for safe operation of the target vehicle. However, according to the National Institute of Standards and Technology (NIST) cybersecurity framework, protection is not enough, and detecting an attack is necessary. Moreover, UN R155 mandates that vehicle fleets detect and log security incidents. Thus, the cyber-physical threats of ToD are modeled using the attack-centric approach in this paper. Then, an attack model with false data injection (FDI) on the steering control command is created from real vehicle data. The risk of this attack model is assessed for a last-mile delivery (LMD) application. Finally, a physics-based context-aware anomaly detection system (PCADS) is proposed to detect such false injection attacks, and preliminary experimental results are presented to validate the model.
comment: 27 pages, 14 figures, Submitted to IET Intelligent Transport Systems
Islanding Detection for Active Distribution Networks Using WaveNet+UNet Classifier
This paper proposes an AI-based scheme for islanding detection in active distribution networks. By reviewing existing studies, it is clear that there are several gaps in the field to ensure reliable islanding detection, including (i) model complexity and stability concerns, (ii) limited accuracy under noisy conditions, and (iii) limited applicability to systems with different types of resources. Accordingly, this paper proposes a WaveNet classifier reinforced by a denoising U-Net model to address these shortcomings. The proposed scheme has a simple structure due to the use of 1D convolutional layers and incorporates residual connections that significantly enhance the model's generalization. Additionally, the proposed scheme is robust against noisy conditions by incorporating a denoising U-Net model. Furthermore, the model is sufficiently fast using a sliding window time series of 10 milliseconds for detection. Utilizing positive/negative/zero sequence components of voltages, superimposed waveforms, and the rate of change of frequency provides the necessary features to precisely detect the islanding condition. In order to assess the effectiveness of the suggested scheme, over 3k islanding/non-islanding cases were tested, considering different load active/reactive power values, load switching transients, capacitor bank switching, fault conditions in the main grid, different load quality factors, signal-to-noise ratio levels, and both conventional and inverter-based sources.
3D Guidance Law for Flexible Target Enclosing with Inherent Safety
In this paper, we address the problem of enclosing an arbitrarily moving target in three dimensions by a single pursuer while ensuring the pursuer's safety by preventing collisions with the target. The proposed guidance strategy steers the pursuer to a safe region of space surrounding and excluding the target, allowing it to maintain a certain distance from the latter while offering greater flexibility in positioning and converging to any orbit within this safe zone. We leverage the concept of the Lyapunov Barrier Function as a powerful tool to constrain the distance between the pursuer and the target within asymmetric bounds, thereby ensuring the pursuer's safety within the predefined region. Further, we demonstrate the effectiveness of the proposed guidance law in managing arbitrarily maneuvering targets and other uncertainties (such as vehicle/autopilot dynamics and external disturbances) by enabling the pursuer to consistently achieve stable global enclosing behaviors by switching between stable enclosing trajectories within the safe region whenever necessary, even in response to aggressive target maneuvers. To attest to the merits of our work, we conduct experimental tests with various plant models, including a high-fidelity quadrotor model within Software-in-the-loop (SITL) simulations, encompassing various challenging target maneuver scenarios and requiring only relative information for successful execution.
comment: Supplementary video at https://youtu.be/UU704o_966s
Chattering Phenomena in Time-Optimal Control for High-Order Chain-of-Integrator Systems with Full State Constraints (Extended Version)
Time-optimal control for high-order chain-of-integrator systems with full state constraints remains an open and challenging problem within the discipline of optimal control. The behavior of optimal control in high-order problems lacks precise characterization, and even the existence of the chattering phenomenon, i.e., the control switching infinitely many times over a finite period, remains unknown and overlooked. This paper establishes a theoretical framework for chattering phenomena in the considered problem, providing novel findings on the uniqueness of state constraints inducing chattering, the upper bound of switching times in an unconstrained arc during chattering, and the convergence of states and costates to the chattering limit point. For the first time, this paper proves the existence of the chattering phenomenon in the considered problem. The chattering optimal control for 4th-order problems with velocity constraints is precisely solved, providing an approach to plan time-optimal snap-limited trajectories. Other cases of order $n\leq4$ are proved not to allow chattering. The conclusions rectify a longstanding misconception in the industry concerning the time-optimality of S-shaped trajectories with minimal switching times.
Improved Small-Signal L2 Gain Analysis for Nonlinear Systems
The L2-gain characterizes a dynamical system's input-output properties, but can be difficult to determine for nonlinear systems. Previous work designed a nonconvex optimization problem to simultaneously search for a continuous piecewise affine (CPA) storage function and an upper bound on the small-signal L2-gain of a dynamical system over a triangulated region about the origin. This work improves upon those results by establishing a tighter upper bound on a system's gain using a convex optimization problem. By reformulating the relationship between the Hamilton-Jacobi inequality and L2-gain as a linear matrix inequality and then developing novel LMI error bounds for a triangulation, tighter gain bounds are derived and computed more efficiently. Additionally, a combined quadratic and CPA storage function is considered to expand the nonlinear systems this optimization problem is applicable to. Numerical results demonstrate the tighter upper bound on a dynamical system's gain.
Concurrent Design Optimization of Powertrain Component Modules in a Family of Electric Vehicles
We present a modeling and optimization framework to design powertrains for a family of electric vehicles, focusing on the concurrent sizing of their motors and batteries. Whilst tailoring these component modules to each individual vehicle type can minimize energy consumption, it can result in high production costs due to the variety of component modules to be realized for the family of vehicles, driving the Total Costs of Ownership (TCO) high. Against this backdrop, we explore modularity and standardization strategies whereby we jointly design unique motor and battery modules to be installed in all the vehicles in the family, using a different number of these modules when needed. Such an approach results in higher production volumes of the same component module, entailing significantly lower manufacturing costs due to Economy-of-Scale (EoS) effects, and hence a potentially lower TCO for the family of vehicles. To solve the resulting one-size-fits-all problem, we instantiate a nested framework consisting of an inner convex optimization routine which jointly optimizes the modules' sizes and the powertrain operation of the entire family, for given driving cycles and modules' multiplicities. Likewise, we devise an outer loop comparing each configuration to identify the minimum-TCO solution with global optimality guarantees. Finally, we showcase our framework on a case study for the Tesla vehicle family in a benchmark design problem, considering the Model S, Model 3, Model X, and Model Y. Our results show that, compared to an individually tailored design, the application of our concurrent design optimization framework achieves a significant reduction of the production costs for a minimal increase in operational costs, ultimately lowering the family TCO in the benchmark design problem by 3.5%.
comment: 17 pages, 17 figures, 7 tables
Online Linear Quadratic Tracking with Regret Guarantees
Online learning algorithms for dynamical systems provide finite time guarantees for control in the presence of sequentially revealed cost functions. We pose the classical linear quadratic tracking problem in the framework of online optimization where the time-varying reference state is unknown a priori and is revealed after the applied control input. We show the equivalence of this problem to the control of linear systems subject to adversarial disturbances and propose a novel online gradient descent based algorithm to achieve efficient tracking in finite time. We provide a dynamic regret upper bound scaling linearly with the path length of the reference trajectory and a numerical example to corroborate the theoretical guarantees.
comment: Published at the IEEE Control Systems Letters
Fault Diagnosis and Prognosis Capabilities for Wind Turbine Hydraulic Pitch Systems
Wind energy is the leading non-hydro renewable technology. Increasing reliability is a key factor in reducing the downtime of high-power wind turbines installed in remote off-shore places, where maintenance is costly and less reactive. Defects in the pitch system are responsible for up to 20% of a wind turbine's downtime. Thus, monitoring such defects is essential for avoiding it. This paper presents a generic assessment of the diagnosis capabilities in hydraulic pitch systems, which are used in high-power wind turbines. A mathematical model of the non-linear system dynamics is presented along with a description of the most frequent faults that occur. Structural analysis is used to assess which defects can be detected in the pitch system. The structural properties are furthermore explored to investigate the possibility of reducing the number of sensors without compromising the fault diagnosis capabilities. Robustness to model uncertainty is finally addressed and generic principles for estimating the detectable magnitude of wear and tear are presented.
Deep DeePC: Data-enabled predictive control with low or no online optimization using deep learning
Data-enabled predictive control (DeePC) is a data-driven control algorithm that utilizes data matrices to form a non-parametric representation of the underlying system, predicting future behaviors and generating optimal control actions. DeePC typically requires solving an online optimization problem, the complexity of which is heavily influenced by the amount of data used, potentially leading to expensive online computation. In this paper, we leverage deep learning to propose a highly computationally efficient DeePC approach for general nonlinear processes, referred to as Deep DeePC. Specifically, a deep neural network is employed to learn the DeePC vector operator, which is an essential component of the non-parametric representation of DeePC. This neural network is trained offline using historical open-loop input and output data of the nonlinear process. With the trained neural network, the Deep DeePC framework is formed for online control implementation. At each sampling instant, this neural network directly outputs the DeePC operator, eliminating the need for the online optimization required by conventional DeePC. The optimal control action is obtained based on the DeePC operator updated by the trained neural network. To address constrained scenarios, a constraint handling scheme is further proposed and integrated with the Deep DeePC to handle hard constraints during online implementation. The efficacy and superiority of the proposed Deep DeePC approach are demonstrated using two benchmark process examples.
comment: 34 pages, 7 figures
Multi-Objective Learning Model Predictive Control
Multi-Objective Learning Model Predictive Control is a novel data-driven control scheme which improves a linear system's closed-loop performance with respect to several convex control objectives over iterations of a repeated task. At each task iteration, collected system data is used to construct terminal components of a Model Predictive Controller. The formulation presented in this paper ensures that closed-loop control performance improves between successive iterations with respect to each objective. We provide proofs of recursive feasibility and performance improvement, and show that the converged policy is Pareto optimal. Simulation results demonstrate the applicability of the proposed approach.
Learning a Stable, Safe, Distributed Feedback Controller for a Heterogeneous Platoon of Autonomous Vehicles
Platooning of autonomous vehicles has the potential to increase safety and fuel efficiency on highways. The goal of platooning is to have each vehicle drive at a specified speed (set by the leader) while maintaining a safe distance from its neighbors. Many prior works have analyzed various controllers for platooning, most commonly linear feedback and distributed model predictive controllers. In this work, we introduce an algorithm for learning a stable, safe, distributed controller for a heterogeneous platoon. Our algorithm relies on recent developments in learning neural network stability certificates. We train a controller for autonomous platooning in simulation and evaluate its performance on hardware with a platoon of four F1Tenth vehicles. We then perform further analysis in simulation with a platoon of 100 vehicles. Experimental results demonstrate the practicality of the algorithm and the learned controller by comparing the performance of the neural network controller to linear feedback and distributed model predictive controllers.
comment: Accepted to the International Symposium of Robotics Research (ISRR) 2024
Experiences with Sub-Arctic Sensor Network Deployment
This paper discusses the experiences gained from designing, deploying and maintaining low-power wireless sensor networks in three geothermally active remote locations in Iceland. The network was deployed to assist researchers in collecting soil temperature data which would help them investigate the impact of global warming on (sub)Arctic climate and subsequent carbon release. Functional networks with more than 50 sensor nodes from three sites with no direct access to power and the Internet have been providing researchers insight into the warming impacts since 2021. The network employs low-power primary cell-powered wireless sensor nodes equipped with DASH7 communication protocol and solar-powered DASH7-cellular gateways, providing real-time data and remote access to sensors and devices deployed in the field. We present a detailed discussion of different network components, their architecture, and the network's overall performance and reliability.
comment: 8 Figures, 6 pages
Systems and Control (EESS)
Adaptive Subsampling and Learned Model Improve Spatiotemporal Resolution of Tactile Skin
High-speed tactile arrays are essential for real-time robotic control in unstructured environments, but high pixel counts limit readout rates of most large tactile arrays to below 100Hz. We introduce ACTS - adaptive compressive tactile subsampling - a method that efficiently samples tactile matrices and reconstructs interactions using sparse recovery and a learned tactile dictionary. Tested on a 1024-pixel sensor array (32x32), ACTS increased frame rates by 18X compared to raster scanning, with minimal error. For the first time in large-area tactile skin, we demonstrate rapid object classification within 20ms of contact, high-speed projectile detection, ricochet angle estimation, and deformation tracking through enhanced spatiotemporal resolution. Our method can be implemented in firmware, upgrading existing low-cost, flexible, and robust tactile arrays into high-resolution systems for large-area spatiotemporal touch sensing.
comment: 40 pages, 8 main figures, 12 supplemental figures, Videos can be accessed at https://tinyurl.com/TactileSubsampling
Assessing the Optimistic Bias in the Natural Inflow Forecasts: A Call for Model Monitoring in Brazil
Hydroelectricity accounted for roughly 66% of the total generation in Brazil in 2023 and addressed most of the intermittency of wind and solar generation. Thus, one of the most important steps in the operation planning of this country is the forecast of the natural inflow energy (NIE) time series, an approximation of the energetic value of the water inflows. To manage water resources over time, the Brazilian system operator performs long-term forecasts for the NIE to assess the water values through long-term hydrothermal planning models, which are then used to define the short-term merit order in day-ahead scheduling. Therefore, monitoring optimistic bias in NIE forecasts is crucial to prevent an optimistic view of future system conditions and subsequent riskier storage policies. In this article, we investigate and showcase strong evidence of an optimistic bias in the official NIE forecasts, with predicted values consistently exceeding the observations over the past 12 years in the two main subsystems (Southeast and Northeast). Rolling window out-of-sample tests conducted with real data demonstrate that the official forecast model exhibits a statistically significant bias of 6%, 13%, 18%, and 23% for 1, 6, 12, and 24 steps ahead in the Southeast subsystem, and 19%, 57%, 80%, and 108% in the Northeast.
Linear-Threshold Network Models for Describing and Analyzing Brain Dynamics
Over the past two decades, an increasing array of control-theoretic methods have been used to study the brain as a complex dynamical system and better understand its structure-function relationship. This article provides an overview of one such family of methods, based on the linear-threshold rate (LTR) dynamics, which arises when modeling the spiking activity of neuronal populations and their impact on each other. LTR dynamics exhibit a wide range of behaviors based on network topologies and inputs, including mono- and multi-stability, limit cycles, and chaos, allowing them to be used to model many complex brain processes involving fast and slow inhibition, multiple time and spatial scales, different types of neural behavior, and higher-order interactions. Here we investigate how the versatility of LTR dynamics paired with concepts and tools from systems and control can provide a computational theory for explaining the dynamical mechanisms enabling different brain processes. Specifically, we illustrate stability and stabilization properties of LTR dynamics and how they are related to goal-driven selective attention, multistability and its relationship with declarative memory, and bifurcations and oscillations and their role in modeling seizure dynamics in epilepsy. We conclude with a discussion on additional properties of LTR dynamics and an outlook on other brain processes for which they might play a similar role.
comment: 62 pages, 16 Figures
Real Eventual Exponential Positivity of Complex-valued Laplacians: Applications to Consensus in Multi-agent Systems
In this paper, we explore the property of eventual exponential positivity (EEP) in complex matrices. We show that this property holds for the real part of the matrix exponential for a certain class of complex matrices. Next, we present the relation between the spectral properties of the Laplacian matrix of an unsigned digraph with complex edge-weights and the property of real EEP. Finally, we show that the Laplacian flow system of a network is stable when the negated Laplacian admits real EEP. Numerical examples are presented to demonstrate the results.
Design of Unitless Normalized Measure of Nonlinearity for State Estimation
The paper deals with measures of nonlinearity. In state estimation, they are utilized i) to select a suitable state estimation algorithm by assessing the nonlinearity of a system model, ii) to adapt the estimation algorithm structure or parameters, or iii) to indicate the possible effect of strong nonlinearity that leads to estimate credibility loss. This paper summarizes the state of the art of nonlinearity measures, focusing on the mean-square-error-based measure of nonlinearity. Its weak point related to unit selection is illustrated, and based on this, requirements for a new measure of nonlinearity are formulated. A new nonlinearity measure that is both unitless and normalized is designed. Its properties are demonstrated using numerical tracking experiments.
comment: Submitted to FUSION 2024 conference
Methodologies for offshore wind power plants stability analysis
The development of larger Offshore Wind Power Plants (OWPPs) is moving towards multi-vendor setups, ultimately aiming to establish Energy hubs. These structures are characterized by installations from different vendors sharing the same connection or closely interconnected points. Control interactions among Wind Turbine (WT) converters and power systems have been detected, and this critical phenomenon can significantly impact the dynamic stability of the system. Various stability analysis methods have been proposed to analyze the interactions between OWPPs at the Point-of-Connection (PoC) and the power system. However, stability studies rarely consider the complex offshore transmission system behind the PoC. Generally, the overall OWPP is blamed for the instability. However, since it is a complex system, it is important to understand which part of the OWPP behind the PoC is causing the problem or is likely to become unstable under certain conditions. Therefore, this paper provides a detailed overview of the advantages and limitations of the current system screening indexes used to design the OWPP, and the stability analysis methods. Each method is discussed, and the appropriate methods, depending on OWPP structure, are evaluated and discussed. The analysis indicates that a combination of time domain and frequency domain methods is necessary for enhancing the definition of stability boundaries.
comment: 15 pages, 9 figures, 4 tables, journal article
Performance Analysis of a Photovoltaic System with Thermoelectric Generator and Phase Change Material; An Experimental Approach
This study explores the integration of thermoelectric generators (TEGs) and phase change materials (PCMs) to enhance the efficiency of photovoltaic (PV) panels in high-temperature conditions. An AP-PM-20 Polycrystalline PV panel, SP-1848-27145 Bismuth Telluride TEG, and paraffin wax PCM in an aluminum container were used. Four configurations were tested: standalone PV, PV-PCM, PV-TEG-PCM, and PV-PCM-TEG, under identical conditions from 10:30 AM to 6:00 PM at 25-minute intervals. Data on PV and TEG voltage, current, and solar irradiance were collected and analyzed. The results show significant performance improvements: the PV-PCM configuration boosted power output by 68.04%, while PV-PCM-TEG and PV-TEG-PCM configurations improved efficiency by 43.06% and 37.51%, respectively. Efficiency gains relative to the standalone PV system were 33.33% for PV-PCM, 25.76% for PV-PCM-TEG, and 21.21% for PV-TEG-PCM, demonstrating the effectiveness of PCMs and TEGs in enhancing PV performance.
comment: This work was presented at the African International Conference on Clean Energy and Energy Storage, 2024
Byzantine-Resilient Output Optimization of Multiagent via Self-Triggered Hybrid Detection Approach
How to achieve precise distributed optimization despite unknown attacks, especially Byzantine attacks, is one of the critical challenges for multiagent systems. This paper addresses distributed resilient optimization for linear heterogeneous multi-agent systems faced with adversarial threats. We establish a framework aimed at realizing resilient optimization for continuous-time systems by incorporating a novel self-triggered hybrid detection approach. The proposed hybrid detection approach is able to identify attacks on neighbors using both error thresholds and triggering intervals, thereby optimizing the balance between effective attack detection and the reduction of excessive communication triggers. Using an edge-based adaptive self-triggered approach, each agent can receive its neighbors' information and determine whether this information is valid. If any neighbor proves invalid, each normal agent will isolate that neighbor by disconnecting communication along that specific edge. Importantly, our adaptive algorithm guarantees the accuracy of the optimization solution even when an agent is isolated by its neighbors.
Cooperative Visual Convex Area Coverage using a Tessellation-free Strategy
The objective in this article is to develop a control strategy for coverage purposes of a convex region by a fleet of Mobile Aerial Agents (MAAs). Each MAA is equipped with a downward facing camera that senses a convex portion of the area while its altitude flight is constrained. Rather than relying on typical Voronoi-like tessellations of the area to be covered, a scheme focusing on the assignment to each MAA of certain parts of the mosaic of the current covered area is proposed. A gradient ascent algorithm is then employed to increase in a monotonic manner the covered area by the MAA-fleet. Simulation studies are offered to illustrate the effectiveness of the proposed scheme.
comment: In proceedings of the 56th Conference on Decision and Control (CDC), 2017. 6 pages, 9 figures, code available at https://git.sr.ht/~sotirisp/uav-coverage. arXiv admin note: substantial text overlap with arXiv:1612.02067
Dynamic Input Mapping Inversion for Algebraic Loop-Free Control in Hydraulic Actuators
The application of nonlinear control schemes to electro-hydraulic actuators often requires several alterations in the design of the controllers during their implementation. This is to overcome the challenges that frequently arise from the inherent complexity of such control algorithms owing to model nonlinearities. Moreover, advanced control solutions for this type of system often introduce input algebraic loops and chatter, which considerably degrade the tracking performance. This study presents a nonlinear control architecture for hydraulic actuators that comprises low-complexity modules, based on well-established designs that facilitate robust high performance in tracking without introducing the aforementioned limitations. Specifically, the proposed solution consists of two variants of a position controller for the hydraulic cylinder and a dynamic input-mapping inversion module to avoid algebraic loops in the control input. The stability of the closed-loop system is analysed using arguments from Lyapunov theory for cascaded non-autonomous nonlinear systems. The effectiveness of the proposed solution is evaluated on a high-fidelity simulator of a wind turbine pitch system. Appropriate quantitative metrics are finally defined to evaluate the closed-loop system performance in comparison to state-of-the-art nonlinear design.
Railway LiDAR semantic segmentation based on intelligent semi-automated data annotation
Automated vehicles rely on an accurate and robust perception of the environment. Similarly to automated cars, highly automated trains require an environmental perception. Although there is a lot of research based on either camera or LiDAR sensors in the automotive domain, very few contributions for this task exist yet for automated trains. Additionally, no public dataset or described approach for a 3D LiDAR semantic segmentation in the railway environment exists yet. Thus, we propose an approach for a point-wise 3D semantic segmentation based on the 2DPass network architecture using scans and images jointly. In addition, we present a semi-automated intelligent data annotation approach, which we use to efficiently and accurately label the required dataset recorded on a railway track in Germany. To improve performance despite a still small number of labeled scans, we apply an active learning approach to intelligently select scans for the training dataset. Our contributions are threefold: We annotate rail data including camera and LiDAR data from the railway environment, transfer label the raw LiDAR point clouds using an image segmentation network, and train a state-of-the-art 3D LiDAR semantic segmentation network efficiently leveraging active learning. The trained network achieves good segmentation results with a mean IoU of 71.48% over 9 classes.
comment: This article has been accepted for publication in the IEEE VTC Fall 2024
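For readers unfamiliar with the mean IoU metric quoted in the abstract above, the following is a minimal sketch of how it is commonly computed from a confusion matrix. It is not the authors' evaluation code; the function name and the two-class matrix are hypothetical, and classes absent from both labels and predictions are skipped by assumption.

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[t][p] counts points with true class t predicted as class p.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # missed points of class c
        fp = sum(confusion[r][c] for r in range(n)) - tp     # points wrongly labeled c
        denom = tp + fp + fn
        if denom == 0:
            continue  # assumption: skip classes that never occur
        ious.append(tp / denom)
    if not ious:
        raise ValueError("confusion matrix contains no samples")
    return sum(ious) / len(ious)


# Hypothetical two-class example: IoU is 8/11 for class 0 and 9/12 for class 1.
example_miou = mean_iou([[8, 2], [1, 9]])
```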
Assessing the techno-economic benefits of LEMs for different grid topologies and prosumer shares
The shift towards decentralized and renewable energy sources has introduced significant challenges to traditional power systems, necessitating innovative market designs. Local energy markets present a viable solution for integrating distributed energy resources such as photovoltaic systems, electric vehicles, and heat pumps within various grid topologies. This study investigates the techno-economic benefits of local energy markets compared to conventional market designs, focusing on their impact on average energy prices and operational peak power, using a self-developed agent-based energy system simulation tool. Through comprehensive simulations across the countryside, rural, suburban, and urban grid topologies with varying penetration levels of the distributed energy resources, totaling 400 simulation setups, we demonstrate that local energy markets can enhance economic efficiency and grid stability with 99 % of the scenarios boasting lower average energy prices and 80 % lower operational peak power levels. Our findings suggest that local energy markets can play a role in the future energy system, especially in areas with high shares of PV and HP, provided that additional infrastructure, management costs, and bureaucratic complexity are kept to a minimum.
comment: 39 pages, 9 figures, 4 tables
A Critical Review of Proton Exchange Membrane Fuel Cells Matter Transports and Voltage Polarisation for Modelling
Technologies based on the use of hydrogen are promising for future energy requirements in a more sustainable world. Consequently, modelling fuel cells is crucial, for instance, to optimize their control to achieve excellent performance, to test new materials and configurations on a limited budget, or to consider their degradation for improved lifespan. To develop such models, a comprehensive study is required, encompassing both well-established and the latest governing laws on matter transport and voltage polarisation for Proton Exchange Membrane Fuel Cells (PEMFCs). Recent articles often rely on outdated or inappropriate equations, lacking clear explanations regarding their background. Indeed, inconsistent understanding of theoretical and experimental choices or model requirements hinders comprehension and contributes to the misuse of these equations. Additionally, specific research is needed to construct more accurate models. This study aims to offer a comprehensive understanding of the current state-of-the-art in PEMFC modeling. It clarifies the corresponding governing equations, their usage conditions, and assumptions, thus serving as a foundation for future developments. The presented laws and equations are applicable in most multi-dimensional, dynamic, and two-phase PEMFC models.
comment: Journal of The Electrochemical Society, 2024
Coordinated Dispatch of Energy Storage Systems in the Active Distribution Network: A Complementary Reinforcement Learning and Optimization Approach
The complexity and nonlinearity of the active distribution network (ADN), coupled with fast-changing renewable energy (RE), necessitate an advanced real-time and safe dispatch approach. This paper proposes a complementary reinforcement learning (RL) and optimization approach, namely SA2CO, to address the coordinated dispatch of the energy storage systems (ESSs) in the ADN. The proposed approach leverages RL's capability to make fast decisions and address model inaccuracies, while optimization methods ensure the ADN security. Furthermore, a hybrid data-driven and expert-experience auxiliary neural network is formulated as a rapid security assessment component in the SA2CO algorithm, enabling dynamic switching between RL and optimization methodologies. Simulation results demonstrate the proposed method's effectiveness and scalability in achieving real-time, safe, and economical dispatch of multiple ESSs in the ADN, surpassing the performance of the state-of-the-art RL and optimization methods.
Optimal Covariance Steering of Linear Stochastic Systems with Hybrid Transitions
This work addresses the problem of optimally steering the state covariance of a linear stochastic system from an initial to a target, subject to hybrid transitions. The nonlinear and discontinuous jump dynamics complicate the control design for hybrid systems. Under uncertainties, stochastic jump timing and state variations further intensify this challenge. This work aims to regulate the hybrid system's state trajectory to stay close to a nominal deterministic one, despite uncertainties and noises. We address this problem by directly controlling state covariances around a mean trajectory, and this problem is termed the Hybrid Covariance Steering (H-CS) problem. The jump dynamics are approximated to the first order by leveraging the Saltation Matrix. When the jump dynamics are nonsingular, we derive an analytical closed-form solution to the H-CS problem. For general jump dynamics with possible singularity and changes in the state dimensions, we reformulate the problem into a convex optimization over path distributions by leveraging Schrodinger's Bridge duality to the smooth covariance control problem. The covariance propagation at hybrid events is enforced as equality constraints to handle singularity issues. The proposed convex framework scales linearly with the number of jump events, ensuring efficient, optimal solutions. This work thus provides a computationally efficient solution to the general H-CS problem. Numerical experiments are conducted to validate the proposed method.
comment: 14 pages
Inverter Output Impedance Estimation in Power Networks: A Variable Direction Forgetting Recursive-Least-Square Algorithm Based Approach
As inverter-based loads and energy sources become increasingly prevalent, accurate line impedance estimation between inverters and the grid is essential for optimizing performance and enhancing control strategies. This paper presents a non-invasive estimation algorithm that avoids signal injection, based on the Variable Direction Forgetting Recursive Least Squares (VDF-RLS) method. The method uses measurement data that is local to the inverter. It proposes a specific method for determining rotational frequency for direct-quadrature (dq) coordinate frame in which data is collected, which ensures a simpler and more accurate estimation. This method is enabled by a secondary Phase Locked Loop (PLL) which appropriately attenuates the effects of variations in grid-voltage measurements. By isolating the variation-sensitive q-axis and relying solely on the less sensitive d-axis, the method further minimizes the impact of variations. The estimation method achieves rapid adaptation while ensuring stability in the absence of persistent excitation by selectively discarding outdated data during updates. Results demonstrate significant improvement (as large as 7 times) in estimation of line parameters, when compared to existing approaches such as constant forgetting RLS.
comment: 8 pages, 6 figures, 1 table, submitted for 2025 American Control Conference (ACC)
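For orientation, the constant-forgetting RLS baseline that the abstract above compares against can be sketched as follows. This is a minimal illustration of standard recursive least squares with a scalar forgetting factor, not the paper's VDF-RLS method; the function name `rls_forgetting`, the synthetic regressors, and the resistance/inductance values are all hypothetical.

```python
import numpy as np

def rls_forgetting(Phi, y, lam=0.98, delta=1e3):
    """Standard RLS with a constant forgetting factor lam (baseline only)."""
    n = Phi.shape[1]
    theta = np.zeros(n)          # parameter estimate
    P = delta * np.eye(n)        # large initial covariance = weak prior
    for phi, yk in zip(Phi, y):
        Pphi = P @ phi
        gain = Pphi / (lam + phi @ Pphi)
        theta = theta + gain * (yk - phi @ theta)   # correct with prediction error
        P = (P - np.outer(gain, Pphi)) / lam        # discount old data uniformly
    return theta

# Synthetic, well-excited data (assumed values, for illustration only).
rng = np.random.default_rng(0)
true_theta = np.array([0.5, 2.0e-3])   # hypothetical line R (ohm) and L (H)
Phi = rng.normal(size=(500, 2))
y = Phi @ true_theta + 1e-4 * rng.normal(size=500)
theta_hat = rls_forgetting(Phi, y)
```

Because this baseline discounts all directions of the covariance equally, it relies on persistently exciting data; the abstract indicates that the variable-direction variant discards outdated data selectively to remain stable when that excitation is absent.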
Finite-volume method and observability analysis for core-shell enhanced single particle model for lithium iron phosphate batteries
The increasing adoption of Lithium Iron Phosphate (LFP) batteries in Electric Vehicles is driven by their affordability, abundant material supply, and safety advantages. However, challenges arise in controlling/estimating unmeasurable LFP states such as state of charge (SOC), due to its flat open circuit voltage, hysteresis, and path dependence dynamics during intercalation and de-intercalation processes. The Core Shell Average Enhanced Single Particle Model (CSa-ESPM) effectively captures the electrochemical dynamics and phase transition behavior of LFP batteries by means of Partial Differential-Algebraic Equations (PDAEs). These governing PDAEs, including a moving boundary Ordinary Differential Equation (ODE), require a fine-grained spatial grid for accurate and stable solutions when employing the Finite Difference Method (FDM). This, in turn, leads to a computationally expensive system intractable for the design of real-time battery management system algorithms. In this study, we demonstrate that the Finite Volume Method (FVM) effectively discretizes the CSa-ESPM and provides accurate solutions with fewer than 4 control volumes while ensuring mass conservation across multiple operational cycles. The resulting control-oriented reduced order FVM-based CSa-ESPM is experimentally validated using various C-rate load profiles and its observability is assessed through nonlinear observability analysis. Our results reveal that different current inputs and discrete equation numbers influence model observability, with non-observable regions identified where solid-phase concentration gradients are negligible.
comment: 6 pages, 4 figures
Resilience-Oriented DG Siting and Sizing Considering Energy Equity Constraint
Extreme weather events can cause widespread power outages and huge economic losses. Low-income customers are more vulnerable to power outages because they live in areas with poorly equipped distribution systems. However, existing approaches to improve grid resilience focus on the overall condition of the system and ignore the outage experiences of low-income customers, which leads to significant energy inequities in resilience. Therefore, this paper explores a new resilience-oriented planning method for distributed generator (DG) siting and sizing, by embedding an additional energy equity constraint (EEC). First, the expected load shedding index (ELSI) is defined as the ratio of the load shedding to the original load, which quantifies the resilience-oriented energy equity. Then, the DG siting and sizing problem is formulated as a two-stage stochastic programming with the EEC. The first stage determines the optimal sites and sizes of DG units under investment constraints and EECs, while the second stage optimizes expected costs of unserved load. A subsidiary variable is introduced to ensure the model's solvability. Finally, numerical studies are performed on the IEEE 33-bus and 123-bus systems to verify the effectiveness of the proposed DG planning model in achieving energy equity. Three observations are presented as future guidelines for resilience-oriented DG planning.
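The ELSI definition in the abstract above (load shedding divided by original load) can be illustrated with a short sketch. The scenario-probability weighting, the function name `expected_load_shedding_index`, and all numbers are assumptions made for illustration; the paper's exact formulation may differ.

```python
def expected_load_shedding_index(probabilities, shed_kw, original_load_kw):
    """Expected load shedding over scenarios, as a fraction of the original load.

    probabilities    -- scenario probabilities (assumed to sum to 1)
    shed_kw          -- load shed in each scenario, in kW
    original_load_kw -- original load of the customer group, in kW
    """
    if original_load_kw <= 0:
        raise ValueError("original load must be positive")
    if len(probabilities) != len(shed_kw):
        raise ValueError("one shedding value is needed per scenario")
    expected_shed = sum(p * s for p, s in zip(probabilities, shed_kw))
    return expected_shed / original_load_kw

# Hypothetical two-scenario example for one customer group.
elsi_group = expected_load_shedding_index([0.5, 0.5], [10.0, 30.0], 100.0)
```

Comparing this index between low-income and other customer groups is one plausible way an equity constraint could bound the disparity, though the abstract does not spell out the constraint's form.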
A Physics-Based Context-Aware Approach for Anomaly Detection in Teleoperated Driving Operations Under False Data Injection Attacks
Teleoperated driving (ToD) systems are a special type of cyber-physical system (CPS) where the operator remotely controls the steering, acceleration, and braking actions of the vehicle. Malicious actors may inject false data into communication channels to manipulate the teleoperator's driving commands to cause harm. Hence, protection of this communication is necessary for a safe operation of the target vehicle. However, according to the National Institute of Standards and Technology (NIST) cybersecurity framework, protection is not enough, and detecting an attack is necessary. Moreover, UN R155 mandates that vehicle fleets detect and log security incidents. Thus, the cyber-physical threats of ToD are modeled using the attack-centric approach in this paper. Then, an attack model with false data injection (FDI) on the steering control command is created from real vehicle data. A risk of this attack model is assessed for a last-mile delivery (LMD) application. Finally, a physics-based context-aware anomaly detection system (PCADS) is proposed to detect such false injection attacks, and preliminary experimental results are presented to validate the model.
comment: 27 pages, 14 figures, Submitted to IET Intelligent Transport Systems
Islanding Detection for Active Distribution Networks Using WaveNet+UNet Classifier
This paper proposes an AI-based scheme for islanding detection in active distribution networks. By reviewing existing studies, it is clear that there are several gaps in the field to ensure reliable islanding detection, including (i) model complexity and stability concerns, (ii) limited accuracy under noisy conditions, and (iii) limited applicability to systems with different types of resources. Accordingly, this paper proposes a WaveNet classifier reinforced by a denoising U-Net model to address these shortcomings. The proposed scheme has a simple structure due to the use of 1D convolutional layers and incorporates residual connections that significantly enhance the model's generalization. Additionally, the proposed scheme is robust against noisy conditions by incorporating a denoising U-Net model. Furthermore, the model is sufficiently fast using a sliding window time series of 10 milliseconds for detection. Utilizing positive/negative/zero sequence components of voltages, superimposed waveforms, and the rate of change of frequency provides the necessary features to precisely detect the islanding condition. In order to assess the effectiveness of the suggested scheme, over 3k islanding/non-islanding cases were tested, considering different load active/reactive power values, load switching transients, capacitor bank switching, fault conditions in the main grid, different load quality factors, signal-to-noise ratio levels, and both conventional and inverter-based sources.
3D Guidance Law for Flexible Target Enclosing with Inherent Safety
In this paper, we address the problem of enclosing an arbitrarily moving target in three dimensions by a single pursuer while ensuring the pursuer's safety by preventing collisions with the target. The proposed guidance strategy steers the pursuer to a safe region of space surrounding and excluding the target, allowing it to maintain a certain distance from the latter while offering greater flexibility in positioning and converging to any orbit within this safe zone. We leverage the concept of the Lyapunov Barrier Function as a powerful tool to constrain the distance between the pursuer and the target within asymmetric bounds, thereby ensuring the pursuer's safety within the predefined region. Further, we demonstrate the effectiveness of the proposed guidance law in managing arbitrarily maneuvering targets and other uncertainties (such as vehicle/autopilot dynamics and external disturbances) by enabling the pursuer to consistently achieve stable global enclosing behaviors by switching between stable enclosing trajectories within the safe region whenever necessary, even in response to aggressive target maneuvers. To attest to the merits of our work, we conduct experimental tests with various plant models, including a high-fidelity quadrotor model within Software-in-the-loop (SITL) simulations, encompassing various challenging target maneuver scenarios and requiring only relative information for successful execution.
comment: Supplementary video at https://youtu.be/UU704o_966s
Chattering Phenomena in Time-Optimal Control for High-Order Chain-of-Integrator Systems with Full State Constraints (Extended Version)
Time-optimal control for high-order chain-of-integrator systems with full state constraints remains an open and challenging problem within the discipline of optimal control. The behavior of optimal control in high-order problems lacks precise characterization, and even the existence of the chattering phenomenon, i.e., the control switching infinitely many times over a finite period, remains unknown and overlooked. This paper establishes a theoretical framework for chattering phenomena in the considered problem, providing novel findings on the uniqueness of state constraints inducing chattering, the upper bound of switching times in an unconstrained arc during chattering, and the convergence of states and costates to the chattering limit point. For the first time, this paper proves the existence of the chattering phenomenon in the considered problem. The chattering optimal control for 4th-order problems with velocity constraints is precisely solved, providing an approach to plan time-optimal snap-limited trajectories. Other cases of order $n\leq4$ are proved not to allow chattering. The conclusions rectify a longstanding misconception in the industry concerning the time-optimality of S-shaped trajectories with minimal switching times.
Improved Small-Signal L2 Gain Analysis for Nonlinear Systems
The L2-gain characterizes a dynamical system's input-output properties, but can be difficult to determine for nonlinear systems. Previous work designed a nonconvex optimization problem to simultaneously search for a continuous piecewise affine (CPA) storage function and an upper bound on the small-signal L2-gain of a dynamical system over a triangulated region about the origin. This work improves upon those results by establishing a tighter upper-bound on a system's gain using a convex optimization problem. By reformulating the relationship between the Hamilton-Jacobi inequality and L2-gain as a linear matrix inequality and then developing novel LMI error bounds for a triangulation, tighter gain bounds are derived and computed more efficiently. Additionally, a combined quadratic and CPA storage function is considered to expand the nonlinear systems this optimization problem is applicable to. Numerical results demonstrate the tighter upper bound on a dynamical system's gain.
Concurrent Design Optimization of Powertrain Component Modules in a Family of Electric Vehicles
We present a modeling and optimization framework to design powertrains for a family of electric vehicles, focusing on the concurrent sizing of their motors and batteries. Whilst tailoring these component modules to each individual vehicle type can minimize energy consumption, it can result in high production costs due to the variety of component modules to be realized for the family of vehicles, driving the Total Costs of Ownership (TCO) high. Against this backdrop, we explore modularity and standardization strategies whereby we jointly design unique motor and battery modules to be installed in all the vehicles in the family, using a different number of these modules when needed. Such an approach results in higher production volumes of the same component module, entailing significantly lower manufacturing costs due to Economy-of-Scale (EoS) effects, and hence a potentially lower TCO for the family of vehicles. To solve the resulting one-size-fits-all problem, we instantiate a nested framework consisting of an inner convex optimization routine which jointly optimizes the modules' sizes and the powertrain operation of the entire family, for given driving cycles and modules' multiplicities. Likewise, we devise an outer loop comparing each configuration to identify the minimum-TCO solution with global optimality guarantees. Finally, we showcase our framework on a case study for the Tesla vehicle family in a benchmark design problem, considering the Model S, Model 3, Model X, and Model Y. Our results show that, compared to an individually tailored design, the application of our concurrent design optimization framework achieves a significant reduction of the production costs for a minimal increase in operational costs, ultimately lowering the family TCO in the benchmark design problem by 3.5%.
comment: 17 pages, 17 figures, 7 tables
Online Linear Quadratic Tracking with Regret Guarantees
Online learning algorithms for dynamical systems provide finite time guarantees for control in the presence of sequentially revealed cost functions. We pose the classical linear quadratic tracking problem in the framework of online optimization where the time-varying reference state is unknown a priori and is revealed after the applied control input. We show the equivalence of this problem to the control of linear systems subject to adversarial disturbances and propose a novel online gradient descent based algorithm to achieve efficient tracking in finite time. We provide a dynamic regret upper bound scaling linearly with the path length of the reference trajectory and a numerical example to corroborate the theoretical guarantees.
comment: Published at the IEEE Control Systems Letters
Fault Diagnosis and Prognosis Capabilities for Wind Turbine Hydraulic Pitch Systems
Wind energy is the leading non-hydro renewable technology. Increasing reliability is a key factor in reducing the downtime of high-power wind turbines installed in remote off-shore places, where maintenance is costly and less reactive. Defects in the pitch system are responsible for up to 20% of a wind turbine's downtime. Thus, monitoring such defects is essential for avoiding it. This paper presents a generic assessment of the diagnosis capabilities in hydraulic pitch systems, which are used in high-power wind turbines. A mathematical model of the non-linear system dynamics is presented along with a description of the most frequent faults that occur. Structural analysis is used to assess which defects can be detected in the pitch system. The structural properties are furthermore explored to investigate the possibility of reducing the number of sensors without compromising the fault diagnosis capabilities. Robustness to model uncertainty is finally addressed and generic principles for estimating the detectable magnitude of wear and tear are presented.
Deep DeePC: Data-enabled predictive control with low or no online optimization using deep learning
Data-enabled predictive control (DeePC) is a data-driven control algorithm that utilizes data matrices to form a non-parametric representation of the underlying system, predicting future behaviors and generating optimal control actions. DeePC typically requires solving an online optimization problem, the complexity of which is heavily influenced by the amount of data used, potentially leading to expensive online computation. In this paper, we leverage deep learning to propose a highly computationally efficient DeePC approach for general nonlinear processes, referred to as Deep DeePC. Specifically, a deep neural network is employed to learn the DeePC vector operator, which is an essential component of the non-parametric representation of DeePC. This neural network is trained offline using historical open-loop input and output data of the nonlinear process. With the trained neural network, the Deep DeePC framework is formed for online control implementation. At each sampling instant, this neural network directly outputs the DeePC operator, eliminating the need for online optimization as in conventional DeePC. The optimal control action is obtained based on the DeePC operator updated by the trained neural network. To address constrained scenarios, a constraint handling scheme is further proposed and integrated with the Deep DeePC to handle hard constraints during online implementation. The efficacy and superiority of the proposed Deep DeePC approach are demonstrated using two benchmark process examples.
comment: 34 pages, 7 figures
Multi-Objective Learning Model Predictive Control
Multi-Objective Learning Model Predictive Control is a novel data-driven control scheme which improves a linear system's closed-loop performance with respect to several convex control objectives over iterations of a repeated task. At each task iteration, collected system data is used to construct terminal components of a Model Predictive Controller. The formulation presented in this paper ensures that closed-loop control performance improves between successive iterations with respect to each objective. We provide proofs of recursive feasibility and performance improvement, and show that the converged policy is Pareto optimal. Simulation results demonstrate the applicability of the proposed approach.
Learning a Stable, Safe, Distributed Feedback Controller for a Heterogeneous Platoon of Autonomous Vehicles
Platooning of autonomous vehicles has the potential to increase safety and fuel efficiency on highways. The goal of platooning is to have each vehicle drive at a specified speed (set by the leader) while maintaining a safe distance from its neighbors. Many prior works have analyzed various controllers for platooning, most commonly linear feedback and distributed model predictive controllers. In this work, we introduce an algorithm for learning a stable, safe, distributed controller for a heterogeneous platoon. Our algorithm relies on recent developments in learning neural network stability certificates. We train a controller for autonomous platooning in simulation and evaluate its performance on hardware with a platoon of four F1Tenth vehicles. We then perform further analysis in simulation with a platoon of 100 vehicles. Experimental results demonstrate the practicality of the algorithm and the learned controller by comparing the performance of the neural network controller to linear feedback and distributed model predictive controllers.
comment: Accepted to the International Symposium of Robotics Research (ISRR) 2024
Experiences with Sub-Arctic Sensor Network Deployment
This paper discusses the experiences gained from designing, deploying and maintaining low-power wireless sensor networks in three geothermally active remote locations in Iceland. The network was deployed to assist researchers in collecting soil temperature data which would help them investigate the impact of global warming on (sub)Arctic climate and subsequent carbon release. Functional networks with more than 50 sensor nodes from three sites with no direct access to power and the Internet have been providing researchers insight into the warming impacts since 2021. The network employs low-power primary cell-powered wireless sensor nodes equipped with DASH7 communication protocol and solar-powered DASH7-cellular gateways, providing real-time data and remote access to sensors and devices deployed in the field. We present a detailed discussion of different network components, their architecture, and the network's overall performance and reliability.
comment: 8 Figures, 6 pages
Robotics
In-Context Learning Enables Robot Action Prediction in LLMs
Recently, Large Language Models (LLMs) have achieved remarkable success using in-context learning (ICL) in the language domain. However, leveraging the ICL capabilities within LLMs to directly predict robot actions remains largely unexplored. In this paper, we introduce RoboPrompt, a framework that enables off-the-shelf text-only LLMs to directly predict robot actions through ICL without training. Our approach first heuristically identifies keyframes that capture important moments from an episode. Next, we extract end-effector actions from these keyframes as well as the estimated initial object poses, and both are converted into textual descriptions. Finally, we construct a structured template to form ICL demonstrations from these textual descriptions and a task instruction. This enables an LLM to directly predict robot actions at test time. Through extensive experiments and analysis, RoboPrompt shows stronger performance over zero-shot and ICL baselines in simulated and real-world settings.
Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions
Humanoid robots, with their human-like embodiment, have the potential to integrate seamlessly into human environments. Critical to their coexistence and cooperation with humans is the ability to understand natural language communications and exhibit human-like behaviors. This work focuses on generating diverse whole-body motions for humanoid robots from language descriptions. We leverage human motion priors from extensive human motion datasets to initialize humanoid motions and employ the commonsense reasoning capabilities of Vision Language Models (VLMs) to edit and refine these motions. Our approach demonstrates the capability to produce natural, expressive, and text-aligned humanoid motions, validated through both simulated and real-world experiments. More videos can be found at https://ut-austin-rpl.github.io/Harmon/.
comment: Accepted for oral presentation at 8th Annual Conference on Robot Learning. Project website: https://ut-austin-rpl.github.io/Harmon/
Physics-Informed Learning for the Friction Modeling of High-Ratio Harmonic Drives
This paper presents a scalable method for friction identification in robots equipped with electric motors and high-ratio harmonic drives, utilizing Physics-Informed Neural Networks (PINN). This approach eliminates the need for dedicated setups and joint torque sensors by leveraging the robot's intrinsic model and state data. We present a comprehensive pipeline that includes data acquisition, preprocessing, ground truth generation, and model identification. The effectiveness of the PINN-based friction identification is validated through extensive testing on two different joints of the humanoid robot ergoCub, comparing its performance against traditional static friction models like the Coulomb-viscous and Stribeck-Coulomb-viscous models. Integrating the identified PINN-based friction models into a two-layer torque control architecture enhances real-time friction compensation. The results demonstrate significant improvements in control performance and reductions in energy losses, highlighting the scalability and robustness of the proposed method, also for application across a large number of joints as in the case of humanoid robots.
Non-Conservative Obstacle Avoidance for Multi-Body Systems Leveraging Convex Hulls and Predicted Closest Points
This paper introduces a novel approach that integrates future closest point predictions into the distance constraints of a collision avoidance controller, leveraging convex hulls with closest point distance calculations. By addressing abrupt shifts in closest points, this method effectively reduces collision risks and enhances controller performance. Applied to an Image Guided Therapy robot and validated through simulations and user experiments, the framework demonstrates improved distance prediction accuracy, smoother trajectories, and safer navigation near obstacles.
Hybrid Decision Making for Scalable Multi-Agent Navigation: Integrating Semantic Maps, Discrete Coordination, and Model Predictive Control
This paper presents a framework for multi-agent navigation in structured but dynamic environments, integrating three key components: a shared semantic map encoding metric and semantic environmental knowledge, a claim policy for coordinating access to areas within the environment, and a Model Predictive Controller for generating motion trajectories that respect environmental and coordination constraints. The main advantages of this approach include: (i) enforcing area occupancy constraints derived from specific task requirements; (ii) enhancing computational scalability by eliminating the need for collision avoidance constraints between robotic agents; and (iii) the ability to anticipate and avoid deadlocks between agents. The paper includes both simulations and physical experiments demonstrating the framework's effectiveness in various representative scenarios.
Faster Algorithms for Growing Collision-Free Convex Polytopes in Robot Configuration Space
We propose two novel algorithms for constructing convex collision-free polytopes in robot configuration space. Finding these polytopes enables the application of stronger motion-planning frameworks such as trajectory optimization with Graphs of Convex Sets [1] and is currently a major roadblock in the adoption of these approaches. In this paper, we build upon IRIS-NP (Iterative Regional Inflation by Semidefinite & Nonlinear Programming) [2] to significantly improve tunability, runtimes, and scaling to complex environments. IRIS-NP uses nonlinear programming paired with uniform random initialization to find configurations on the boundary of the free configuration space. Our key insight is that finding nearby configuration-space obstacles using sampling is inexpensive and greatly accelerates region generation. We propose two algorithms using such samples to either employ nonlinear programming more efficiently (IRIS-NP2) or circumvent it altogether using a massively-parallel zero-order optimization strategy (IRIS-ZO). We also propose a termination condition that controls the probability of exceeding a user-specified permissible fraction-in-collision, eliminating a significant source of tuning difficulty in IRIS-NP. We compare performance across eight robot environments, showing that IRIS-ZO achieves an order-of-magnitude speed advantage over IRIS-NP. IRIS-NP2, also significantly faster than IRIS-NP, builds larger polytopes using fewer hyperplanes, enabling faster downstream computation. Website: https://sites.google.com/view/fastiris
comment: 16 pages, 6 figures, accepted for publication in the proceedings of the International Symposium for Robotics Research 2024
Robust RL with LLM-Driven Data Synthesis and Policy Adaptation for Autonomous Driving
The integration of Large Language Models (LLMs) into autonomous driving systems demonstrates strong common sense and reasoning abilities, effectively addressing the pitfalls of purely data-driven methods. Current LLM-based agents require lengthy inference times and face challenges in interacting with real-time autonomous driving environments. A key open question is whether we can effectively leverage the knowledge from LLMs to train an efficient and robust Reinforcement Learning (RL) agent. This paper introduces RAPID, a novel Robust Adaptive Policy Infusion and Distillation framework, which trains specialized mix-of-policy RL agents using data synthesized by an LLM-based driving agent and online adaptation. RAPID features three key designs: 1) utilization of offline data collected from an LLM agent to distil expert knowledge into RL policies for faster real-time inference; 2) introduction of robust distillation in RL to inherit both performance and robustness from LLM-based teacher; and 3) employment of a mix-of-policy approach for joint decision decoding with a policy adapter. Through fine-tuning via online environment interaction, RAPID reduces the forgetting of LLM knowledge while maintaining adaptability to different tasks. Extensive experiments demonstrate RAPID's capability to effectively integrate LLM knowledge into scaled-down RL policies in an efficient, adaptable, and robust way. Code and checkpoints will be made publicly available upon acceptance.
Leveraging Augmented Reality for Improved Situational Awareness During UAV-Driven Search and Rescue Missions
In the high-stakes domain of search-and-rescue missions, the deployment of Unmanned Aerial Vehicles (UAVs) has become increasingly pivotal. These missions require seamless, real-time communication among diverse roles within response teams, particularly between Remote Operators (ROs) and On-Site Operators (OSOs). Traditionally, ROs and OSOs have relied on radio communication to exchange critical information, such as the geolocation of victims, hazardous areas, and points of interest. However, radio communication lacks information visualization, suffers from noise, and requires mental effort to interpret information, leading to miscommunications and misunderstandings. To address these challenges, this paper presents VizCom-AR, an Augmented Reality system designed to facilitate visual communication between ROs and OSOs and improve their situational awareness during UAV-driven search-and-rescue missions. Our experiments, focus group sessions with police officers, and field study showed that VizCom-AR enhances the spatial awareness of both ROs and OSOs, facilitates geolocation information exchange, and effectively complements existing communication tools in UAV-driven emergency response missions. Overall, VizCom-AR offers a fundamental framework for designing Augmented Reality systems for large-scale UAV-driven rescue missions.
comment: 8 pages
Characterizing Behavioral Differences and Adaptations of Automated Vehicles and Human Drivers at Unsignalized Intersections: Insights from Waymo and Lyft Open Datasets
The integration of autonomous vehicles (AVs) into transportation systems presents an unprecedented opportunity to enhance road safety and efficiency. However, understanding the interactions between AVs and human-driven vehicles (HVs) at intersections remains an open research question. This study aims to bridge this gap by examining behavioral differences and adaptations of AVs and HVs at unsignalized intersections by utilizing two comprehensive AV datasets from Waymo and Lyft. Using a systematic methodology, the research identifies and analyzes merging and crossing conflicts by calculating key safety and efficiency metrics, including time to collision (TTC), post-encroachment time (PET), maximum required deceleration (MRD), time advantage (TA), and speed and acceleration profiles. The findings reveal a paradox in mixed traffic flow: while AVs maintain larger safety margins, their conservative behavior can lead to unexpected situations for human drivers, potentially causing unsafe conditions. From a performance point of view, human drivers exhibit more consistent behavior when interacting with AVs versus other HVs, suggesting AVs may contribute to harmonizing traffic flow patterns. Moreover, notable differences were observed between Waymo and Lyft vehicles, which highlights the importance of considering manufacturer-specific AV behaviors in traffic modeling and management strategies for the safe integration of AVs. The processed dataset utilized in this study is openly published to foster the research on AV-HV interactions.
comment: This work has been submitted to Transportation Research Record for potential publication
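The two most common surrogate safety metrics named in the abstract above, time to collision (TTC) and post-encroachment time (PET), can be illustrated with a minimal sketch. The function names, argument names, and sample values are assumptions made for illustration; they use the standard textbook definitions and are not taken from the study's processing code.

```python
def time_to_collision(gap_m, follower_speed_mps, leader_speed_mps):
    """TTC: current gap divided by the closing speed.

    Returns infinity when the follower is not closing in on the leader.
    """
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0.0:
        return float("inf")
    return gap_m / closing_speed


def post_encroachment_time(first_exit_s, second_entry_s):
    """PET: time between the first road user leaving the conflict area
    and the second road user entering it."""
    return second_entry_s - first_exit_s


# Hypothetical values: 20 m gap, follower at 15 m/s, leader at 10 m/s.
print(time_to_collision(20.0, 15.0, 10.0))
# Hypothetical timestamps (seconds) at a crossing conflict point.
print(post_encroachment_time(12.4, 14.0))
```

Smaller values of either metric indicate a tighter safety margin, which is the sense in which the abstract speaks of AVs maintaining larger margins.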
Stable Object Placement Planning From Contact Point Robustness
We introduce a planner designed to guide robot manipulators in stably placing objects within intricate scenes. Our proposed method reverses the traditional approach to object placement: our planner selects contact points first and then determines a placement pose that solicits the selected points. This is instead of sampling poses, identifying contact points, and evaluating pose quality. Our algorithm facilitates stability-aware object placement planning, imposing no restrictions on object shape, convexity, or mass density homogeneity, while avoiding combinatorial computational complexity. Our proposed stability heuristic enables our planner to find a solution about 20 times faster when compared to the same algorithm not making use of the heuristic and eight times faster than a state-of-the-art method that uses the traditional sample-and-evaluate approach. Our proposed planner is also more successful in finding stable placements than the five other benchmarked algorithms. Derived from first principles and validated in ten real robot experiments, our planner offers a general and scalable method to tackle the problem of object placement planning with rigid objects.
comment: Submitted to IEEE Transactions on Robotics. Contains 14 pages, 11 figures, and 3 tables
Imagine2Servo: Intelligent Visual Servoing with Diffusion-Driven Goal Generation for Robotic Tasks
Visual servoing, the method of controlling robot motion through feedback from visual sensors, has seen significant advancements with the integration of optical flow-based methods. However, its application remains limited by inherent challenges, such as the necessity for a target image at test time, the requirement of substantial overlap between initial and target images, and the reliance on feedback from a single camera. This paper introduces Imagine2Servo, an innovative approach leveraging diffusion-based image editing techniques to enhance visual servoing algorithms by generating intermediate goal images. This methodology allows for the extension of visual servoing applications beyond traditional constraints, enabling tasks like long-range navigation and manipulation without predefined goal images. We propose a pipeline that synthesizes subgoal images grounded in the task at hand, facilitating servoing in scenarios with minimal initial and target image overlap and integrating multi-camera feedback for comprehensive task execution. Our contributions demonstrate a novel application of image generation to robotic control, significantly broadening the capabilities of visual servoing systems. Real-world experiments validate the effectiveness and versatility of the Imagine2Servo framework in accomplishing a variety of tasks, marking a notable advancement in the field of visual servoing.
AdaCropFollow: Self-Supervised Online Adaptation for Visual Under-Canopy Navigation
Under-canopy agricultural robots can enable various applications like precise monitoring, spraying, weeding, and plant manipulation tasks throughout the growing season. Autonomous navigation under the canopy is challenging due to the degradation in accuracy of RTK-GPS and the large variability in the visual appearance of the scene over time. In prior work, we developed a supervised learning-based perception system with semantic keypoint representation and deployed this in various field conditions. A large number of failures of this system can be attributed to the inability of the perception model to adapt to the domain shift encountered during deployment. In this paper, we propose a self-supervised online adaptation method for adapting the semantic keypoint representation using a visual foundational model, geometric prior, and pseudo labeling. Our preliminary experiments show that with minimal data and fine-tuning of parameters, the keypoint prediction model trained with labels on the source domain can be adapted in a self-supervised manner to various challenging target domains onboard the robot computer using our method. This can enable fully autonomous row-following capability in under-canopy robots across fields and crops without requiring human intervention.
Human-Inspired Long-Term Indoor Localization in Human-Oriented Environment IROS
Lifelong localization is crucial for enabling the autonomy of service robots. In this paper, we present an overview of our past research on long-term localization and mapping, exploiting geometric priors such as floor plans and integrating textual and semantic information. Our approach was validated on challenging sequences spanning over many months, and we released open source implementations.
comment: IROS Workshop paper
A Data-driven Contact Estimation Method for Wheeled-Biped Robots
Contact estimation is a key ability for limbed robots, where making and breaking contacts has a direct impact on state estimation and balance control. Existing approaches typically rely on gait-cycle priors or designated contact sensors. We design a contact estimator that is suitable for the emerging wheeled-biped robot types that do not have these features. To this end, we propose a Bayes filter in which update steps are learned from real-robot torque measurements while prediction steps rely on inertial measurements. We evaluate this approach in extensive real-robot and simulation experiments. Our method achieves better performance while being considerably more sample efficient than a comparable deep-learning baseline.
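The predict/update structure of a Bayes filter over a binary contact state can be sketched as follows. This is a generic two-state filter under stated assumptions: the fixed transition probability `p_stay` and the hand-supplied likelihoods are placeholders, whereas the paper learns its update step from torque data and drives prediction from inertial measurements.

```python
def predict(p_contact, p_stay=0.95):
    """Prediction step with a symmetric two-state transition model.

    p_stay is an assumed probability that the contact state is unchanged.
    """
    return p_stay * p_contact + (1.0 - p_stay) * (1.0 - p_contact)


def update(p_contact, lik_contact, lik_no_contact):
    """Update step: Bayes' rule with the measurement likelihood
    under each hypothesis (contact / no contact)."""
    numerator = lik_contact * p_contact
    denominator = numerator + lik_no_contact * (1.0 - p_contact)
    if denominator <= 0.0:
        return p_contact  # measurement carries no information
    return numerator / denominator


# One filter cycle starting from an uninformed prior.
belief = predict(0.5)
belief = update(belief, lik_contact=0.9, lik_no_contact=0.1)
print(belief)
```

In a learned variant, the two likelihood arguments would come from a model evaluated on the current measurement rather than from constants.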
PAPL-SLAM: Principal Axis-Anchored Monocular Point-Line SLAM
In point-line SLAM systems, the utilization of line structural information and the optimization of lines are two significant problems. The former is usually addressed through structural regularities, while the latter typically involves using minimal parameter representations of lines in optimization. However, separating these two steps leads to the loss of constraint information to each other. We anchor lines with similar directions to a principal axis and optimize them with $n+2$ parameters for $n$ lines, solving both problems together. Our method considers scene structural information, which can be easily extended to different world hypotheses while significantly reducing the number of line parameters to be optimized, enabling rapid and accurate mapping and tracking. To further enhance the system's robustness and avoid mismatch, we have modeled the line-axis probabilistic data association and provided the algorithm for axis creation, updating, and optimization. Additionally, considering that most real-world scenes conform to the Atlanta World hypothesis, we provide a structural line detection strategy based on vertical priors and vanishing points. Experimental results and ablation studies on various indoor and outdoor datasets demonstrate the effectiveness of our system.
comment: 8 pages, 4 figures
A Robot Kinematics Model Estimation Using Inertial Sensors for On-Site Building Robotics
In order to make robots more useful in a variety of environments, they need to be highly portable so that they can be transported to wherever they are needed, and highly storable so that they can be stored when not in use. We propose "on-site robotics", which uses parts procured at the location where the robot will be active, as a new solution to the problem of portability and storability. In this paper, as a proof of concept for on-site robotics, we describe a method for estimating the kinematic model of a robot by attaching inertial measurement unit (IMU) sensor modules to rigid links, estimating the relative orientation between modules from angular velocity, and estimating the relative position from the measurement of centrifugal force. At the end of this paper, as an evaluation of this method, we present an experiment in which a robot made up of wooden sticks reaches a target position. In this experiment, even if the combination of the links is changed, the robot is able to reach the target position again immediately after estimation, showing that it can operate even after being reassembled. Our implementation is available at https://github.com/hiroya1224/urdf_estimation_with_imus .
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
3D Gaussian Splatting in Robotics: A Survey
Dense 3D representations of the environment have been a long-term goal in the robotics field. While the earlier Neural Radiance Fields (NeRF) representation has been prevalent for its implicit, coordinate-based model, the recent emergence of 3D Gaussian Splatting (3DGS) has demonstrated remarkable potential in its explicit radiance field representation. By leveraging 3D Gaussian primitives for explicit scene representation and enabling differentiable rendering, 3DGS has shown significant advantages over other radiance fields in real-time rendering and photo-realistic performance, which is beneficial for robotic applications. In this survey, we provide a comprehensive understanding of 3DGS in the field of robotics. We divide our discussion of the related works into two main categories: the application of 3DGS and the advancements in 3DGS techniques. In the application section, we explore how 3DGS has been utilized in various robotics tasks from scene understanding and interaction perspectives. The section on advances in 3DGS focuses on improvements to 3DGS's own properties, namely its adaptability and efficiency, aiming to enhance its performance in robotics. We then summarize the most commonly used datasets and evaluation metrics in robotics. Finally, we identify the challenges and limitations of current 3DGS methods and discuss the future development of 3DGS in robotics.
Dual Action Policy for Robust Sim-to-Real Reinforcement Learning
This paper presents Dual Action Policy (DAP), a novel approach to address the dynamics mismatch inherent in the sim-to-real gap of reinforcement learning. DAP uses a single policy to predict two sets of actions: one for maximizing task rewards in simulation and another specifically for domain adaptation via reward adjustments. This decoupling makes it easier to maximize the overall reward in the source domain during training. Additionally, DAP incorporates uncertainty-based exploration during training to enhance agent robustness. Experimental results demonstrate DAP's effectiveness in bridging the sim-to-real gap, outperforming baselines on challenging tasks in simulation, and further improvement is achieved by incorporating uncertainty estimation.
Off-dynamics Conditional Diffusion Planners
Offline Reinforcement Learning (RL) offers an attractive alternative to interactive data acquisition by leveraging pre-existing datasets. However, its effectiveness hinges on the quantity and quality of the data samples. This work explores the use of more readily available, albeit off-dynamics datasets, to address the challenge of data scarcity in Offline RL. We propose a novel approach using conditional Diffusion Probabilistic Models (DPMs) to learn the joint distribution of the large-scale off-dynamics dataset and the limited target dataset. To enable the model to capture the underlying dynamics structure, we introduce two contexts for the conditional model: (1) a continuous dynamics score allows for partial overlap between trajectories from both datasets, providing the model with richer information; (2) an inverse-dynamics context guides the model to generate trajectories that adhere to the target environment's dynamic constraints. Empirical results demonstrate that our method significantly outperforms several strong baselines. Ablation studies further reveal the critical role of each dynamics context. Additionally, our model demonstrates that by modifying the context, we can interpolate between source and target dynamics, making it more robust to subtle shifts in the environment.
Fast Online Learning of CLiFF-maps in Changing Environments
Maps of dynamics are effective representations of motion patterns learned from prior observations, with recent research demonstrating their ability to enhance performance in various downstream tasks such as human-aware robot navigation, long-term human motion prediction, and robot localization. Current advancements have primarily concentrated on methods for learning maps of human flow in environments where the flow is static, i.e., not assumed to change over time. In this paper we propose a method to update the CLiFF-map, one type of map of dynamics, for achieving efficient life-long robot operation. As new observations are collected, our goal is to update a CLiFF-map to effectively and accurately integrate new observations, while retaining relevant historic motion patterns. The proposed online update method maintains a probabilistic representation in each observed location, updating parameters by continuously tracking sufficient statistics. In experiments using both synthetic and real-world datasets, we show that our method is able to maintain accurate representations of human motion dynamics, contributing to high performance flow-compliant planning downstream tasks, while being orders of magnitude faster than the comparable baselines.
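The idea of updating a per-location motion model by tracking sufficient statistics can be sketched with a deliberately simplified example. The class below keeps one unimodal summary (mean speed and circular mean heading) per cell; the class name, fields, and single-mode model are assumptions for illustration and are much simpler than the probabilistic representation the paper maintains.

```python
import math


class CellStats:
    """Running sufficient statistics for one map cell (hypothetical layout)."""

    def __init__(self):
        self.n = 0
        self.sum_speed = 0.0
        self.sum_cos = 0.0  # heading is circular, so accumulate cos and sin
        self.sum_sin = 0.0

    def add(self, speed, heading_rad):
        """Fold one new velocity observation into the statistics in O(1)."""
        self.n += 1
        self.sum_speed += speed
        self.sum_cos += math.cos(heading_rad)
        self.sum_sin += math.sin(heading_rad)

    def mean_speed(self):
        return self.sum_speed / self.n

    def mean_heading(self):
        """Circular mean of the observed headings, in radians."""
        return math.atan2(self.sum_sin, self.sum_cos)


cell = CellStats()
cell.add(1.0, 0.1)
cell.add(2.0, -0.1)
print(cell.mean_speed(), cell.mean_heading())
```

Because each observation only touches a handful of accumulators, the per-update cost stays constant regardless of how much history the cell has seen, which is the property that makes this style of update attractive for long-term operation.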
Improving the Generalization of Unseen Crowd Behaviors for Reinforcement Learning based Local Motion Planners
Deploying a safe mobile robot policy in scenarios with human pedestrians is challenging due to their unpredictable movements. Current Reinforcement Learning-based motion planners rely on a single policy to simulate pedestrian movements and could suffer from the over-fitting issue. Alternatively, framing the collision avoidance problem as a multi-agent framework, where agents generate dynamic movements while learning to reach their goals, can lead to conflicts with human pedestrians due to their homogeneity. To tackle this problem, we introduce an efficient method that enhances agent diversity within a single policy by maximizing an information-theoretic objective. This diversity enriches each agent's experiences, improving its adaptability to unseen crowd behaviors. In assessing an agent's robustness against unseen crowds, we propose diverse scenarios inspired by pedestrian crowd behaviors. Our behavior-conditioned policies outperform existing works in these challenging scenes, reducing potential collisions without additional time or travel.
Learning Differentiable Tensegrity Dynamics using Graph Neural Networks
Tensegrity robots are composed of rigid struts and flexible cables. They constitute an emerging class of hybrid rigid-soft robotic systems and are promising systems for a wide array of applications, ranging from locomotion to assembly. They are difficult to control and model accurately, however, due to their compliance and high number of degrees of freedom. To address this issue, prior work has introduced a differentiable physics engine designed for tensegrity robots based on first principles. In contrast, this work proposes the use of graph neural networks to model contact dynamics over a graph representation of tensegrity robots, which leverages their natural graph-like cable connectivity between end caps of rigid rods. This learned simulator can accurately model 3-bar and 6-bar tensegrity robot dynamics in simulation-to-simulation experiments where MuJoCo is used as the ground truth. It can also achieve higher accuracy than the previous differentiable engine for a real 3-bar tensegrity robot, for which the robot state is only partially observable. When compared against direct applications of recent mesh-based graph neural network simulators, the proposed approach is computationally more efficient, both for training and inference, while achieving higher accuracy. Code and data are available at https://github.com/nchen9191/tensegrity_gnn_simulator_public
Vehicle Localization in GPS-Denied Scenarios Using Arc-Length-Based Map Matching
Automated driving systems face challenges in GPS-denied situations. To address this issue, kinematic dead reckoning is implemented using measurements from the steering angle, steering rate, yaw rate, and wheel speed sensors onboard the vehicle. However, dead reckoning methods suffer from drift. This paper provides an arc-length-based map matching method that uses a digital 2D map of the scenario in order to correct drift in the dead reckoning estimate. The kinematic model's prediction is used to introduce a temporal notion to the spatial information available in the map data. Results show reliable improvement in drift for all GPS-denied scenarios tested in this study. This innovative approach ensures that automated vehicles can maintain continuous and reliable navigation, significantly enhancing their safety and operational reliability in environments where GPS signals are compromised or unavailable.
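A minimal sketch of the two ingredients mentioned above: a kinematic dead-reckoning step, and a lookup of the map point at a given travelled arc length. The function names, the polyline map format, and the idea of snapping directly to the centerline point are assumptions for illustration; the abstract does not specify the paper's matching procedure.

```python
import math


def dead_reckon(x, y, yaw, speed, yaw_rate, dt):
    """One Euler step of a planar kinematic model (drifts over time)."""
    x += speed * math.cos(yaw) * dt
    y += speed * math.sin(yaw) * dt
    yaw += yaw_rate * dt
    return x, y, yaw


def point_at_arc_length(polyline, s):
    """Return the point on a 2D polyline at travelled distance s,
    clamped to the first and last vertices."""
    if s <= 0.0:
        return polyline[0]
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg > 0.0 and s <= seg:
            t = s / seg
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        s -= seg
    return polyline[-1]


# Hypothetical map: an L-shaped road centerline in metres.
road = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
travelled = 0.0
for _ in range(15):
    travelled += 1.0 * 1.0  # wheel speed (m/s) times dt (s)
print(point_at_arc_length(road, travelled))
```

Integrating wheel speed gives a distance that is largely insensitive to heading drift, so indexing the map by that distance is one plausible way to tie the temporal prediction to the spatial map data.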
Trajectory Manifold Optimization for Fast and Adaptive Kinodynamic Motion Planning
Fast kinodynamic motion planning is crucial for systems to effectively adapt to dynamically changing environments. Despite some efforts, existing approaches still struggle with rapid planning in high-dimensional, complex problems. Not surprisingly, the primary challenge arises from the high-dimensionality of the search space, specifically the trajectory space. We address this issue with a two-step method: initially, we identify a lower-dimensional trajectory manifold offline, comprising diverse trajectories specifically relevant to the task at hand while meeting kinodynamic constraints. Subsequently, we search for solutions within this manifold online, significantly enhancing the planning speed. To encode and generate a manifold of continuous-time, differentiable trajectories, we propose a novel neural network model, Differentiable Motion Manifold Primitives (DMMP), along with a practical training strategy. Experiments with a 7-DoF robot arm tasked with dynamic throwing to arbitrary target positions demonstrate that our method surpasses existing approaches in planning speed, task success, and constraint satisfaction.
comment: 12 pages, 11 figures
The State of Robot Motion Generation
This paper reviews the large spectrum of methods for generating robot motion proposed over the 50 years of robotics research culminating in recent developments. It crosses the boundaries of methodologies, typically not surveyed together, from those that operate over explicit models to those that learn implicit ones. The paper discusses the current state-of-the-art as well as properties of varying methodologies, highlighting opportunities for integration.
comment: To be presented at the International Symposium of Robotics Research (ISRR), 2024
Towards Autonomous Indoor Parking: A Globally Consistent Semantic SLAM System and A Semantic Localization Subsystem
We propose a globally consistent semantic SLAM system (GCSLAM) and a semantic-fusion localization subsystem (SF-Loc), which achieves accurate semantic mapping and robust localization in complex parking lots. Visual cameras (front-view and surround-view), IMU, and wheel encoder form the input sensor configuration of our system. The first part of our work is GCSLAM. GCSLAM introduces a novel factor graph for the optimization of poses and semantic map, which incorporates innovative error terms based on multi-sensor data and BEV (bird's-eye view) semantic information. Additionally, GCSLAM integrates a Global Slot Management module that stores and manages parking slot observations. SF-Loc is the second part of our work, which leverages the semantic map built by GCSLAM to conduct map-based localization. SF-Loc integrates registration results and odometry poses with a novel factor graph. Our system demonstrates superior performance over existing SLAM on two real-world datasets, showing excellent capabilities in robust global localization and precise semantic mapping.
Design Space Exploration of Embedded SoC Architectures for Real-Time Optimal Control
Empowering resource-limited robots to execute computationally intensive tasks like model/learning-based algorithms is challenging. Due to the complexity of the workload characteristic, the bottlenecks in different systems can depend on application requirements, preventing a single hardware architecture from being adequate across all robotics applications. This project provides a comprehensive design space exploration to determine optimal hardware computation platforms and architectures suitable for robotic algorithms. We profile and optimize representative architectural designs across general-purpose cores and specialized accelerators. Specifically, we compare CPUs, vector machines, and domain-specialized accelerators with kernel-level benchmarks and end-to-end representative robotic workloads. Our exploration provides a quantitative performance, area, and utilization comparison and analyzes the trade-offs between these representative distinct architectural designs. We demonstrate that the variation of hardware architecture choices depends on workload characteristics and application requirements. Finally, we explore how architectural modifications and software ecosystem optimization can alleviate bottlenecks and enhance utilization.
Sample-Efficient Reinforcement Learning with Temporal Logic Objectives: Leveraging the Task Specification to Guide Exploration
This paper addresses the problem of learning optimal control policies for systems with uncertain dynamics and high-level control objectives specified as Linear Temporal Logic (LTL) formulas. Uncertainty is considered in the workspace structure and the outcomes of control decisions giving rise to an unknown Markov Decision Process (MDP). Existing reinforcement learning (RL) algorithms for LTL tasks typically rely on exploring a product MDP state-space uniformly (using e.g., an $\epsilon$-greedy policy) compromising sample-efficiency. This issue becomes more pronounced as the rewards get sparser and the MDP size or the task complexity increase. In this paper, we propose an accelerated RL algorithm that can learn control policies significantly faster than competitive approaches. Its sample-efficiency relies on a novel task-driven exploration strategy that biases exploration towards directions that may contribute to task satisfaction. We provide theoretical analysis and extensive comparative experiments demonstrating the sample-efficiency of the proposed method. The benefit of our method becomes more evident as the task complexity or the MDP size increases.
comment: arXiv admin note: text overlap with arXiv:2205.04424
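The contrast between uniform $\epsilon$-greedy exploration and exploration biased toward task progress can be sketched as below. The `biased_action` argument and the `bias` probability are hypothetical placeholders: how the paper derives the task-driven direction from the LTL specification is not stated in the abstract.

```python
import random


def choose_action(q_values, biased_action, epsilon=0.1, bias=0.8, rng=random):
    """Epsilon-greedy selection whose exploration branch prefers a
    task-suggested action instead of sampling uniformly.

    q_values: dict mapping action -> estimated value.
    biased_action: action assumed to make progress toward the task.
    """
    actions = list(q_values)
    if rng.random() >= epsilon:
        return max(actions, key=q_values.get)  # exploit the current estimate
    if rng.random() < bias:
        return biased_action  # explore in the task-suggested direction
    return rng.choice(actions)  # fall back to uniform exploration


q = {"left": 0.2, "right": 0.7, "up": 0.1}
print(choose_action(q, biased_action="up"))
```

Setting `bias=0.0` recovers the standard uniform $\epsilon$-greedy policy, which makes the two strategies easy to compare in the same training loop.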
GyroCopter: Differential Bearing Measuring Trajectory Planner for Tracking and Localizing Radio Frequency Sources
Autonomous aerial vehicles can provide efficient and effective solutions for radio frequency (RF) source tracking and localization problems, with applications ranging from wildlife conservation to search and rescue operations. Existing lightweight, low-cost, bearing-measurement-based methods with a single antenna-receiver sensor configuration necessitate in situ rotations, leading to substantial measurement acquisition times that restrict the searchable area and the number of measurements. We propose a GyroCopter for the task. Our approach plans the trajectory of a multi-rotor unmanned aerial vehicle (UAV) whilst utilizing UAV flight dynamics to execute a constant gyration motion to derive "pseudo-bearing" measurements to track RF sources. The gyration-based pseudo-bearing approach: i) significantly reduces the limitations associated with in situ rotation bearing; while ii) capitalizing on the simplicity, affordability, and lightweight nature of signal strength measurement acquisition hardware to estimate bearings. This method distinguishes itself from other pseudo-bearing approaches by eliminating the need for additional hardware, thereby maintaining simplicity, low weight, and cost-effectiveness. To validate our approach, we derived the optimal rotation speed and conducted extensive simulations and field missions with our GyroCopter to track and localize multiple RF sources. The results confirm the effectiveness of our method, highlighting its potential as a practical and rapid solution for RF source localization tasks.
comment: For a demonstration video, see https://youtu.be/OkmmQjD74Us
Anisotropic Stiffness and Programmable Actuation for Soft Robots Enabled by an Inflated Rotational Joint
Soft robots are known for their ability to perform tasks with great adaptability, enabled by their distributed, non-uniform stiffness and actuation. Bending is the most fundamental motion for soft robot design, but creating a robust, easy-to-fabricate soft bending joint with tunable properties remains an active research problem. In this work, we demonstrate an inflatable actuation module for soft robots with a defined bending plane enabled by forced partial wrinkling. This lowers the structural stiffness in the bending direction, with the final stiffness easily designed by the ratio of wrinkled and unwrinkled regions. We present models and experimental characterization showing the stiffness properties of the actuation module, as well as its ability to maintain the kinematic constraint over a large range of loading conditions. We demonstrate the potential for complex actuation in a soft continuum robot and for decoupling actuation force and efficiency from load capacity. The module provides a novel method for embedding intelligent actuation into soft pneumatic robots.
Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models
End-to-end learning directly maps sensory inputs to actions, creating highly integrated and efficient policies for complex robotics tasks. However, such models are tricky to efficiently train and often struggle to generalize beyond their training scenarios, limiting adaptability to new environments, tasks, and concepts. In this work, we investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies under unseen text instructions and visual distribution shifts. To this end, we design datasets with various levels of data representation richness, refine feature extraction protocols by leveraging multi-modal foundation model encoders, and assess the suitability of different policy network heads. Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors, generating spatially aware embeddings that integrate semantic and visual information. These rich features form the basis for training highly robust downstream policies capable of generalizing across platforms, environments, and text-specified tasks. We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning on a small simulated dataset successfully generalize to real-world scenes, handling diverse novel goals and command formulations.
Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation
This paper presents a method for generating large-scale datasets to improve class-agnostic video segmentation across robots with different form factors. Specifically, we consider the question of whether video segmentation models trained on generic segmentation data could be more effective for particular robot platforms if robot embodiment is factored into the data generation process. To answer this question, a pipeline is formulated for using 3D reconstructions (e.g. from HM3DSem) to generate segmented videos that are configurable based on a robot's embodiment (e.g. sensor type, sensor placement, and illumination source). A resulting massive RGB-D video panoptic segmentation dataset (MVPd) is introduced for extensive benchmarking with foundation and video segmentation models, as well as to support embodiment-focused research in video segmentation. Our experimental findings demonstrate that using MVPd for finetuning can lead to performance improvements when transferring foundation models to certain robot embodiments, such as specific camera placements. These experiments also show that using 3D modalities (depth images and camera pose) can lead to improvements in video segmentation accuracy and consistency. The project webpage is available at https://topipari.com/projects/MVPd
comment: Accepted in IEEE Robotics and Automation Letters October 2024
Risk Assessment for Autonomous Landing in Urban Environments using Semantic Segmentation
In this paper, we address the vision-based autonomous landing problem in complex urban environments using deep neural networks for semantic segmentation and risk assessment. We propose employing the SegFormer, a state-of-the-art visual transformer network, for the semantic segmentation of complex, unstructured urban environments. This approach yields valuable information that can be utilized in smart autonomous landing missions, particularly in emergency landing scenarios resulting from system failures or human errors. The assessment is done in real time during flight, when images from an RGB camera on the Unmanned Aerial Vehicle (UAV) are segmented with the SegFormer into the most common classes found in urban environments. These classes are then mapped to a level of risk, considering potential material damage, damage to the drone itself, and danger to people. The proposed strategy is validated through several case studies, demonstrating the huge potential of semantic segmentation-based strategies for determining the safest landing areas for autonomous emergency landing, which we believe will help unleash the full potential of UAVs in civil applications within urban areas.
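The mapping from segmented classes to risk levels can be illustrated with a small sketch. The class names, the integer risk values, and the rule of picking the single lowest-risk cell are all hypothetical; the abstract does not list the classes or the risk scale actually used.

```python
# Hypothetical class-to-risk table: higher means more dangerous to land on.
RISK = {"grass": 0, "pavement": 1, "roof": 2, "road": 3, "car": 4, "person": 5}


def risk_map(label_grid):
    """Convert a grid of class labels into a grid of risk levels."""
    return [[RISK[label] for label in row] for row in label_grid]


def safest_cell(label_grid):
    """Return (row, col) of the lowest-risk cell; ties resolve to the
    smallest row, then the smallest column."""
    risks = risk_map(label_grid)
    best = min(
        (risks[r][c], r, c)
        for r in range(len(risks))
        for c in range(len(risks[r]))
    )
    return best[1], best[2]


segmentation = [
    ["road", "person"],
    ["grass", "car"],
]
print(risk_map(segmentation))
print(safest_cell(segmentation))
```

A practical selector would score whole regions large enough for the vehicle rather than single cells, but the table lookup above is the core of turning a segmentation into a risk assessment.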
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation
Real-time aerial image segmentation plays an important role in the environmental perception of Uncrewed Aerial Vehicles (UAVs). We introduce BlabberSeg, an optimized Vision-Language Model built on CLIPSeg for on-board, real-time processing of aerial images by UAVs. BlabberSeg improves the efficiency of CLIPSeg by reusing prompt and model features, reducing computational overhead while achieving real-time open-vocabulary aerial segmentation. We validated BlabberSeg in a safe landing scenario using the Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI) framework, which uses visual servoing and open-vocabulary segmentation. BlabberSeg reduces computational costs significantly, with a speed increase of 927.41% (16.78 Hz) on an NVIDIA Jetson Orin AGX (64GB) compared with the original CLIPSeg (1.81 Hz), achieving real-time aerial segmentation with negligible loss in accuracy (2.1% as the ratio of the correctly segmented area with respect to CLIPSeg). BlabberSeg's source code is open and available online.
AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation
The ability to reflect on and correct failures is crucial for robotic systems to interact stably with real-life objects. Observing the generalization and reasoning capabilities of Multimodal Large Language Models (MLLMs), previous approaches have aimed to utilize these models to enhance robotic systems accordingly. However, these methods typically focus on high-level planning corrections using an additional MLLM and make limited use of failed samples to correct low-level contact pose errors, which are particularly prone to occur during articulated object manipulation. To address this gap, we propose an Autonomous Interactive Correction (AIC) MLLM, which makes use of previous low-level interaction experiences to correct SE(3) pose predictions for articulated objects. Specifically, AIC MLLM is initially fine-tuned to acquire both pose prediction and feedback prompt comprehension abilities. We design two types of prompt instructions for interactions with objects: 1) visual masks to highlight unmovable parts for position correction, and 2) textual descriptions to indicate potential directions for rotation correction. During inference, a Feedback Information Extraction module is introduced to recognize the failure cause, allowing AIC MLLM to adaptively correct the pose prediction using the corresponding prompts. To further enhance manipulation stability, we devise a Test Time Adaptation strategy that enables AIC MLLM to better adapt to the current scene configuration. Finally, extensive experiments are conducted in both simulated and real-world environments to evaluate the proposed method. The results demonstrate that our AIC MLLM can efficiently correct failure samples by leveraging interaction experience prompts. Our project website is https://sites.google.com/view/aic-mllm.
Visual Manipulation with Legs
Animals use limbs for both locomotion and manipulation. We aim to equip quadruped robots with similar versatility. This work introduces a system that enables quadruped robots to interact with objects using their legs, inspired by non-prehensile manipulation. The system has two main components: a visual manipulation policy module and a loco-manipulator module. The visual manipulation policy, trained with reinforcement learning (RL) using point cloud observations and object-centric actions, decides how the leg should interact with the object. The loco-manipulator controller manages leg movements and body pose adjustments, based on impedance control and Model Predictive Control (MPC). Besides manipulating objects with a single leg, the system can select from the left or right leg based on critic maps and move objects to distant goals through base adjustment. Experiments evaluate the system on object pose alignment tasks in both simulation and the real world, demonstrating more versatile object manipulation skills with legs than previous work. Videos can be found at https://legged-manipulation.github.io/
comment: More details can be found on our project page: https://legged-manipulation.github.io/
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation NeurIPS 2024
Language-guided robotic manipulation is a challenging task that requires an embodied agent to follow abstract user instructions to accomplish various complex manipulation tasks. Previous work trivially fits the data without revealing the relation between instructions and low-level executable actions; the resulting models are prone to memorizing superficial patterns in the data instead of acquiring transferable knowledge, and thus are fragile to dynamic environment changes. To address this issue, we propose a PrImitive-driVen waypOinT-aware world model for Robotic manipulation (PIVOT-R) that focuses solely on the prediction of task-relevant waypoints. Specifically, PIVOT-R consists of a Waypoint-aware World Model (WAWM) and a lightweight action prediction module. The former performs primitive action parsing and primitive-driven waypoint prediction, while the latter focuses on decoding low-level actions. Additionally, we design an asynchronous hierarchical executor (AHE), which can use different execution frequencies for different modules of the model, thereby helping the model reduce computational redundancy and improve execution efficiency. Our PIVOT-R outperforms state-of-the-art (SoTA) open-source models on the SeaWave benchmark, achieving an average relative improvement of 19.45% across four levels of instruction tasks. Moreover, compared to the synchronously executed PIVOT-R, the execution efficiency of PIVOT-R with AHE is increased 28-fold, with only a 2.9% drop in performance. These results provide compelling evidence that our PIVOT-R can significantly improve both the performance and efficiency of robotic manipulation.
comment: Accepted to NeurIPS 2024
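The idea of running different modules at different execution frequencies can be sketched with a simple tick-based loop. This is only an illustration of the scheduling idea: the module names, the periods, and the function `run_executor` are hypothetical and not taken from the paper.

```python
def run_executor(num_ticks, periods):
    """Refresh each module every periods[name] control ticks.

    Returns how many times each module ran and the tick of its last refresh.
    Between refreshes, a downstream module would reuse the cached output.
    """
    calls = {name: 0 for name in periods}
    last_refresh = {}
    for tick in range(num_ticks):
        for name, period in periods.items():
            if tick % period == 0:
                calls[name] += 1
                last_refresh[name] = tick  # stand-in for storing the module output
    return calls, last_refresh


# Hypothetical periods: slow high-level modules, fast low-level decoder.
periods = {"primitive_parser": 30, "waypoint_predictor": 10, "action_decoder": 1}
calls, last_refresh = run_executor(60, periods)
print(calls)
# {'primitive_parser': 2, 'waypoint_predictor': 6, 'action_decoder': 60}
```

In this toy setting, the two expensive modules run 8 times in total instead of 120 times under a fully synchronous loop, which is the kind of redundancy reduction the abstract refers to.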
One-Shot Imitation under Mismatched Execution
Human demonstrations as prompts are a powerful way to program robots to do long-horizon manipulation tasks. However, translating these demonstrations into robot-executable actions presents significant challenges due to execution mismatches in movement styles and physical capabilities. Existing methods either depend on human-robot paired data, which is infeasible to scale, or rely heavily on frame-level visual similarities that often break down in practice. To address these challenges, we propose RHyME, a novel framework that automatically aligns human and robot task executions using optimal transport costs. Given long-horizon robot demonstrations, RHyME synthesizes semantically equivalent human videos by retrieving and composing short-horizon human clips. This approach facilitates effective policy training without the need for paired data. RHyME successfully imitates a range of cross-embodiment demonstrators, both in simulation and with a real human hand, achieving over 50% increase in task success compared to previous methods. We release our datasets and graphics at https://portal.cs.cornell.edu/rhyme/.
LLM2Swarm: Robot Swarms that Responsively Reason, Plan, and Collaborate through LLMs NeurIPS 2024
Robot swarms are composed of many simple robots that communicate and collaborate to fulfill complex tasks. Robot controllers usually need to be specified by experts on a case-by-case basis via programming code. This process is time-consuming, prone to errors, and unable to take into account all situations that may be encountered during deployment. On the other hand, recent Large Language Models (LLMs) have demonstrated reasoning and planning capabilities, introduced new ways to interact with and program machines, and incorporated both domain-specific and commonsense knowledge. Hence, we propose to address the aforementioned challenges by integrating LLMs with robot swarms and show the potential in proofs of concept (showcases). For this integration, we explore two approaches. The first approach is 'indirect integration,' where LLMs are used to synthesize and validate the robot controllers. This approach may reduce development time and human error before deployment. Moreover, during deployment, it could be used for on-the-fly creation of new robot behaviors. The second approach is 'direct integration,' where each robot locally executes a separate LLM instance during deployment for robot-robot collaboration and human-swarm interaction. These local LLM instances enable each robot to reason, plan, and collaborate using natural language, as demonstrated in our showcases where the robots are able to detect a variety of anomalies, without prior information about the nature of these anomalies. To enable further research on our mainly conceptual contribution, we release the software and videos for our LLM2Swarm system: https://github.com/Pold87/LLM2Swarm.
comment: Accepted at NeurIPS 2024 Workshop on Open-World Agents
Learning Smooth Humanoid Locomotion through Lipschitz-Constrained Policies
Reinforcement learning combined with sim-to-real transfer offers a general framework for developing locomotion controllers for legged robots. To facilitate successful deployment in the real world, smoothing techniques, such as low-pass filters and smoothness rewards, are often employed to develop policies with smooth behaviors. However, because these techniques are non-differentiable and usually require tedious tuning of a large set of hyperparameters, they tend to require extensive manual tuning for each robotic platform. To address this challenge and establish a general technique for enforcing smooth behaviors, we propose a simple and effective method that imposes a Lipschitz constraint on a learned policy, which we refer to as Lipschitz-Constrained Policies (LCP). We show that the Lipschitz constraint can be implemented in the form of a gradient penalty, which provides a differentiable objective that can be easily incorporated with automatic differentiation frameworks. We demonstrate that LCP effectively replaces the need for smoothing rewards or low-pass filters and can be easily integrated into training frameworks for many distinct humanoid robots. We extensively evaluate LCP in both simulation and real-world humanoid robots, producing smooth and robust locomotion controllers. All simulation and deployment code, along with complete checkpoints, is available on our project page: https://lipschitz-constrained-policy.github.io.
comment: 8 pages
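The gradient-penalty form of a Lipschitz constraint mentioned in the abstract above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the one-output `policy`, the squared-norm penalty, and the `penalty_weight` value are hypothetical, and a central finite difference stands in for the automatic differentiation a real training framework would use.

```python
import math


def policy(weights, bias, obs):
    """Toy one-output policy: tanh of a linear map of the observation."""
    z = sum(w * x for w, x in zip(weights, obs)) + bias
    return math.tanh(z)


def input_gradient(weights, bias, obs, eps=1e-6):
    """Central finite-difference gradient of the action w.r.t. the observation."""
    grad = []
    for i in range(len(obs)):
        up = list(obs)
        down = list(obs)
        up[i] += eps
        down[i] -= eps
        diff = policy(weights, bias, up) - policy(weights, bias, down)
        grad.append(diff / (2 * eps))
    return grad


def gradient_penalty(weights, bias, batch):
    """Mean squared norm of the input gradient over a batch of observations."""
    total = 0.0
    for obs in batch:
        total += sum(g * g for g in input_gradient(weights, bias, obs))
    return total / len(batch)


def regularized_loss(task_loss, weights, bias, batch, penalty_weight=0.01):
    """Task loss plus a weighted penalty on how fast actions change with inputs."""
    return task_loss + penalty_weight * gradient_penalty(weights, bias, batch)


batch = [[0.0, 0.0]]
sharp = gradient_penalty([3.0, 4.0], 0.0, batch)   # close to 25.0
smooth = gradient_penalty([0.3, 0.4], 0.0, batch)  # close to 0.25
print(sharp > smooth)  # True
```

A policy whose output changes quickly with its input incurs a larger penalty, so minimizing the regularized loss pushes training toward smoother behavior without a separate filter or smoothness reward.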
Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning ICANN24
The state of an object reflects its current status or condition and is important for a robot's task planning and manipulation. However, detecting an object's state and generating a state-sensitive plan for robots is challenging. Recently, pre-trained Large Language Models (LLMs) and Vision-Language Models (VLMs) have shown impressive capabilities in generating plans. However, to the best of our knowledge, there is hardly any investigation on whether LLMs or VLMs can also generate object state-sensitive plans. To study this, we introduce an Object State-Sensitive Agent (OSSA), a task-planning agent empowered by pre-trained neural networks. We propose two methods for OSSA: (i) a modular model consisting of a pre-trained vision processing module (dense captioning model, DCM) and a natural language processing model (LLM), and (ii) a monolithic model consisting only of a VLM. To quantitatively evaluate the performances of the two methods, we use tabletop scenarios where the task is to clear the table. We contribute a multimodal benchmark dataset that takes object states into consideration. Our results show that both methods can be used for object state-sensitive tasks, but the monolithic approach outperforms the modular approach. The code for OSSA is available at https://github.com/Xiao-wen-Sun/OSSA
comment: ICANN24, Switzerland
NAR-*ICP: Neural Execution of Classical ICP-based Pointcloud Registration Algorithms
This study explores the intersection of neural networks and classical robotics algorithms through the Neural Algorithmic Reasoning (NAR) framework, which allows neural networks to be trained to reason like classical robotics algorithms by learning to execute them. Algorithms are integral to robotics and safety-critical applications due to their predictable and consistent performance through logical and mathematical principles. In contrast, while neural networks are highly adaptable, handling complex, high-dimensional data and generalising across tasks, they often lack interpretability and transparency in their internal computations. We propose a Graph Neural Network (GNN)-based learning framework, NAR-*ICP, which learns the intermediate algorithmic steps of classical ICP-based pointcloud registration algorithms, and extend the CLRS Algorithmic Reasoning Benchmark with classical robotics perception algorithms. We evaluate our approach across diverse datasets, from real-world to synthetic, demonstrating its flexibility in handling complex and noisy inputs, along with its potential to be used as part of a larger learning system. Our results indicate that our method achieves superior performance across all benchmarks and datasets, consistently surpassing even the algorithms it has been trained on, further demonstrating its ability to generalise beyond the capabilities of traditional algorithms.
comment: 17 pages, 9 figures
CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications
Autonomous robot operation in unstructured environments is often underpinned by spatial understanding through vision. Systems composed of multiple concurrently operating robots additionally require access to frequent, accurate and reliable pose estimates. In this work, we propose CoViS-Net, a decentralized visual spatial foundation model that learns spatial priors from data, enabling pose estimation as well as spatial comprehension. Our model is fully decentralized, platform-agnostic, executable in real-time using onboard compute, and does not require existing networking infrastructure. CoViS-Net provides relative pose estimates and a local bird's-eye-view (BEV) representation, even without camera overlap between robots (in contrast to classical methods). We demonstrate its use in a multi-robot formation control task across various real-world settings. We provide code, models and supplementary material online. https://proroklab.github.io/CoViS-Net/
Vector Field-Guided Learning Predictive Control for Motion Planning of Mobile Robots with Uncertain Dynamics
In obstacle-dense scenarios, providing safe guidance for mobile robots is critical to improve the safe maneuvering capability. However, the guidance provided by standard guiding vector fields (GVFs) may limit the motion capability due to the improper curvature of the integral curve when traversing obstacles. On the other hand, robotic system dynamics are often time-varying, uncertain, and even unknown during the motion planning process. Therefore, many existing kinodynamic motion planning methods could not achieve satisfactory reliability in guaranteeing safety. To address these challenges, we propose a two-level Vector Field-guided Learning Predictive Control (VF-LPC) approach that improves safe maneuverability. The first level, the guiding level, generates safe desired trajectories using the designed kinodynamic GVF, enabling safe motion in obstacle-dense environments. The second level, the Integrated Motion Planning and Control (IMPC) level, first uses a deep Koopman operator to learn a nominal dynamics model offline and then updates the model uncertainties online using sparse Gaussian processes (GPs). The learned dynamics and a game-based safe barrier function are then incorporated into the LPC framework to generate near-optimal planning solutions. Extensive simulations and real-world experiments were conducted on quadrotor unmanned aerial vehicles and unmanned ground vehicles, demonstrating that VF-LPC enables robots to maneuver safely.
An efficient strategy for path planning with a tethered marsupial robotics system
A tethered marsupial robotics system comprises three components: an Unmanned Ground Vehicle (UGV), an Unmanned Aerial Vehicle (UAV), and a tether connecting both robots. Marsupial systems are highly beneficial in industry as they extend the UAV's battery life during flight. This paper introduces a novel strategy for a specific path planning problem in marsupial systems, where each of the three components must avoid collisions with ground and aerial obstacles modeled as 3D cuboids. Given an initial configuration in which the UAV is positioned atop the UGV, the goal is to reach an aerial target with the UAV. We assume that the UGV first moves to a position from which the UAV can take off and fly through a vertical plane to reach an aerial target. We propose an approach that discretizes the space to approximate an optimal solution, minimizing the sum of the lengths of the ground and air paths. First, we assume a taut tether and use a novel algorithm that leverages the convexity of the tether and the geometry of obstacles to efficiently determine the locus of feasible take-off points for the UAV. We then apply this result to scenarios that involve loose tethers. The simulation test results show that our approach can solve complex situations in seconds, outperforming a baseline planning algorithm based on RRT* (Rapidly exploring Random Trees).
comment: 25 pages, 9 figures, 3 tables
Know your limits! Optimize the robot's behavior through self-awareness
As humanoid robots transition from labs to real-world environments, it is essential to democratize robot control for non-expert users. Recent human-robot imitation algorithms focus on following a reference human motion with high precision, but they are sensitive to the quality of the reference motion and require the human operator to simplify their movements to match the robot's capabilities. Instead, we consider that the robot should understand and adapt the reference motion to its own abilities, facilitating the operator's task. For that, we introduce a deep-learning model that anticipates the robot's performance when imitating a given reference. Then, our system can generate multiple references given a high-level task command, assign a score to each of them, and select the best reference to achieve the desired robot behavior. Our Self-AWare model (SAW) ranks potential robot behaviors based on various criteria, such as fall likelihood, adherence to the reference motion, and smoothness. We integrate advanced motion generation, robot control, and SAW in one unique system, ensuring optimal robot behavior for any task command. For instance, SAW can anticipate falls with 99.29% accuracy. For more information, see our project page: https://evm7.github.io/Self-AWare
comment: Accepted to Humanoids 2024 and HFR 2024. Project Page: https://evm7.github.io/Self-AWare
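The generate, score, and select loop described in the abstract above can be sketched in a few lines. The field names, the linear scoring rule, and the weights below are assumptions for illustration only; the abstract names the criteria (fall likelihood, adherence to the reference, smoothness) but not how SAW combines them.

```python
def score(prediction, weights):
    """Higher is better: negative weighted sum of the predicted costs."""
    return -(
        weights["fall"] * prediction["fall_probability"]
        + weights["tracking"] * prediction["tracking_error"]
        + weights["smoothness"] * prediction["jerk"]
    )


def select_best(candidates, weights):
    """Pick the candidate reference motion with the highest score."""
    return max(candidates, key=lambda c: score(c["prediction"], weights))


# Hypothetical weights and predicted outcomes for three candidate references.
weights = {"fall": 10.0, "tracking": 1.0, "smoothness": 0.5}
candidates = [
    {"name": "walk_fast",
     "prediction": {"fall_probability": 0.30, "tracking_error": 0.05, "jerk": 0.4}},
    {"name": "walk_slow",
     "prediction": {"fall_probability": 0.01, "tracking_error": 0.20, "jerk": 0.1}},
    {"name": "shuffle",
     "prediction": {"fall_probability": 0.02, "tracking_error": 0.60, "jerk": 0.2}},
]
best = select_best(candidates, weights)
print(best["name"])  # walk_slow
```

In the described system the predictions would come from the learned self-awareness model rather than being hand-written as they are here.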
Instruction-Guided Visual Masking NeurIPS 2024
Instruction following is crucial in contemporary LLMs. However, when extended to multimodal settings, it often suffers from misalignment between a specific textual instruction and the targeted local region of an image. To achieve more accurate and nuanced multimodal instruction following, we introduce Instruction-guided Visual Masking (IVM), a new versatile visual grounding model that is compatible with diverse multimodal models, such as LMMs and robot models. By constructing visual masks for instruction-irrelevant regions, IVM-enhanced multimodal models can effectively focus on task-relevant image regions to better align with complex instructions. Specifically, we design a visual masking data generation pipeline and create an IVM-Mix-1M dataset with 1 million image-instruction pairs. We further introduce a new learning technique, Discriminator Weighted Supervised Learning (DWSL), for preferential IVM training that prioritizes high-quality data samples. Experimental results on generic multimodal tasks such as VQA and embodied robotic control demonstrate the versatility of IVM, which, as a plug-and-play tool, significantly boosts the performance of diverse multimodal models, yielding new state-of-the-art results across challenging multimodal benchmarks. Code, model and data are available at https://github.com/2toinf/IVM.
comment: NeurIPS 2024
InterACT: Inter-dependency Aware Action Chunking with Hierarchical Attention Transformers for Bimanual Manipulation
Bimanual manipulation presents unique challenges compared to unimanual tasks due to the complexity of coordinating two robotic arms. In this paper, we introduce InterACT: Inter-dependency aware Action Chunking with Hierarchical Attention Transformers, a novel imitation learning framework designed specifically for bimanual manipulation. InterACT leverages hierarchical attention mechanisms to effectively capture inter-dependencies between dual-arm joint states and visual inputs. The framework comprises a Hierarchical Attention Encoder, which processes multi-modal inputs through segment-wise and cross-segment attention mechanisms, and a Multi-arm Decoder that generates each arm's action predictions in parallel, while sharing information between the arms through synchronization blocks by providing the other arm's intermediate output as context. Our experiments, conducted on various simulated and real-world bimanual manipulation tasks, demonstrate that InterACT outperforms existing methods. Detailed ablation studies further validate the significance of key components, including the impact of CLS tokens, cross-segment encoders, and synchronization blocks on task performance. We provide supplementary materials and videos on our project page.
comment: Accepted at Conference on Robot Learning (CoRL) 2024
Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation NeurIPS 2024
Despite significant progress in robotics and embodied AI in recent years, deploying robots for long-horizon tasks remains a great challenge. Most prior work adheres to an open-loop philosophy and lacks real-time feedback, leading to error accumulation and poor robustness. A handful of approaches have endeavored to establish feedback mechanisms leveraging pixel-level differences or pre-trained visual representations, yet their efficacy and adaptability have been found to be constrained. Inspired by classic closed-loop control systems, we propose CLOVER, a closed-loop visuomotor control framework that incorporates feedback mechanisms to improve adaptive robotic control. CLOVER consists of a text-conditioned video diffusion model for generating visual plans as reference inputs, a measurable embedding space for accurate error quantification, and a feedback-driven controller that refines actions from feedback and initiates replans as needed. Our framework exhibits notable advancement in real-world robotic tasks and achieves state-of-the-art results on the CALVIN benchmark, improving by 8% over previous open-loop counterparts. Code and checkpoints are maintained at https://github.com/OpenDriveLab/CLOVER.
comment: Accepted at NeurIPS 2024. Code and models: https://github.com/OpenDriveLab/CLOVER
"Set It Up!": Functional Object Arrangement with Compositional Generative Models
This paper studies the challenge of developing robots capable of understanding under-specified instructions for creating functional object arrangements, such as "set up a dining table for two"; previous arrangement approaches have focused on much more explicit instructions, such as "put object A on the table." We introduce a framework, SetItUp, for learning to interpret under-specified instructions. SetItUp takes a small number of training examples and a human-crafted program sketch to uncover arrangement rules for specific scene types. By leveraging an intermediate graph-like representation of abstract spatial relationships among objects, SetItUp decomposes the arrangement problem into two subproblems: i) learning the arrangement patterns from limited data and ii) grounding these abstract relationships into object poses. SetItUp leverages large language models (LLMs) to propose the abstract spatial relationships among objects in novel scenes as the constraints to be satisfied; then, it composes a library of diffusion models associated with these abstract relationships to find object poses that satisfy the constraints. We validate our framework on a dataset comprising study desks, dining tables, and coffee tables, with the results showing superior performance in generating physically plausible, functional, and aesthetically pleasing object arrangements compared to existing models.
comment: 10 pages main paper, 21 pages appendix, RSS 2024
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair
Surface cracks in infrastructure can lead to significant deterioration and costly maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise and thus difficult to scale to large areas. While advancements in robotic perception and manipulation have progressed autonomous crack repair, existing methods still face three key challenges: (i) accurate localization of cracks within the robot's coordinate frame, (ii) adaptability to varying crack depths and widths, and (iii) validation of the repair process under realistic conditions. This paper presents an adaptive, autonomous system for surface crack detection and repair using robotics with advanced sensing technologies to enhance precision and safety for humans. The system uses an RGB-D camera for crack detection, a laser scanner for precise measurement, and an extruder and pump for material deposition. To address one of the key challenges, the laser scanner is used to enhance the crack coordinates for accurate localization. Furthermore, our approach demonstrates that an adaptive crack-filling method is more efficient and effective than a fixed-speed approach, with experimental results confirming both precision and consistency. In addition, to ensure real-world applicability and testing repeatability, we introduce a novel validation procedure using 3D-printed crack specimens that accurately simulate real-world conditions. This research contributes to the evolving field of human-robot interaction in construction by demonstrating how adaptive robotic systems can reduce the need for manual labor, improve safety, and enhance the efficiency of maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 22 pages, 14 figures, submitted to Advanced Engineering Informatics
ECoDe: A Sample-Efficient Method for Co-Design of Robotic Agents
Co-designing autonomous robotic agents involves simultaneously optimizing the controller and physical design of the agent. Its inherent bi-level optimization formulation necessitates an outer loop design optimization driven by an inner loop control optimization. This can be challenging when the design space is large and each design evaluation involves a data-intensive reinforcement learning process for control optimization. To improve the sample efficiency of co-design, we propose a multi-fidelity-based exploration strategy in which we tie the controllers learned across the design spaces through a universal policy learner for warm-starting subsequent controller learning problems. Experiments performed on a wide range of agent design problems demonstrate the superiority of our method compared to baselines. Additionally, analysis of the optimized designs shows interesting design alterations, including design simplifications and non-intuitive alterations.
comment: 17 pages, 10 figures
Learning to Control and Coordinate Mixed Traffic Through Robot Vehicles at Complex and Unsignalized Intersections
Intersections are essential road infrastructures for traffic in modern metropolises. However, they can also be the bottleneck of traffic flows as a result of traffic incidents or the absence of traffic coordination mechanisms such as traffic lights. Recently, various control and coordination mechanisms that are beyond traditional control methods have been proposed to improve the efficiency of intersection traffic. Amongst these methods, the control of foreseeable mixed traffic that consists of human-driven vehicles (HVs) and robot vehicles (RVs) has emerged. In this project, we propose a decentralized multi-agent reinforcement learning approach for the control and coordination of mixed traffic at real-world, complex intersections--a topic that has not been previously explored. Comprehensive experiments are conducted to show the effectiveness of our approach. In particular, we show that using 5% RVs, we can prevent congestion formation inside a complex intersection under the actual traffic demand of 700 vehicles per hour. In contrast, without RVs, congestion starts to develop when the traffic demand reaches as low as 200 vehicles per hour. When RVs make up more than 60% of traffic, our method starts to achieve performance comparable to or even better than that of traffic signals in terms of the average waiting time of all vehicles at the intersection. Our method is also robust against both blackout events and sudden RV percentage drops, and enjoys excellent generalizability, which is illustrated by its successful deployment in two unseen intersections.
comment: This paper introduces the first method to control and coordinate mixed traffic (i.e., human-driven vehicles and robot vehicles) at unsignalized intersections with both complicated topology and real-world traffic demands. The International Journal of Robotics Research. 2024;0(0)
RPCBF: Constructing Safety Filters Robust to Model Error and Disturbances via Policy Control Barrier Functions ICRA 2025
Control Barrier Functions (CBFs) have proven to be an effective tool for performing safe control synthesis for nonlinear systems. However, guaranteeing safety in the presence of disturbances and input constraints for high relative degree systems is a difficult problem. In this work, we propose the Robust Policy CBF (RPCBF), a practical method of constructing CBF approximations that is easy to implement and robust to disturbances via the estimation of a value function. We demonstrate the effectiveness of our method in simulation on a variety of high relative degree input-constrained systems. Finally, we demonstrate the benefits of RPCBF in compensating for model errors on a hardware quadcopter platform by treating the model errors as disturbances. The project page can be found at https://oswinso.xyz/rpcbf.
comment: Submitted to ICRA 2025. The project page can be found at https://oswinso.xyz/rpcbf
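For readers unfamiliar with how a CBF acts as a safety filter, the following is a minimal sketch of the textbook construction on a one-dimensional single integrator (state `x`, dynamics `dx/dt = u`). It is not the RPCBF method: here the barrier `h(x) = x_max - x` is written by hand, whereas the abstract describes approximating the CBF by estimating a value function. All names and constants are assumptions.

```python
def cbf_filter(x, u_nominal, x_max=1.0, alpha=2.0):
    """Minimally modify u_nominal so that h(x) = x_max - x stays nonnegative.

    For dx/dt = u the CBF condition dh/dt >= -alpha * h becomes
    u <= alpha * (x_max - x), so the closest safe input is a simple clamp.
    """
    h = x_max - x
    return min(u_nominal, alpha * h)


def simulate(x0, u_nominal, steps=200, dt=0.01):
    """Forward-Euler rollout with the filter applied at every step."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * cbf_filter(x, u_nominal)
        trajectory.append(x)
    return trajectory


# The nominal command pushes hard toward the boundary at x = 1.0.
trajectory = simulate(x0=0.0, u_nominal=5.0)
print(max(trajectory) <= 1.0)  # True
```

The filter leaves the nominal input untouched while it is safe and only clamps it near the boundary. With input constraints, disturbances, or high relative degree, a hand-written barrier like this is generally not enough, which is the difficulty the abstract points to.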
Comprehensive Robotic Cholecystectomy Dataset (CRCD): Integrating Kinematics, Pedal Signals, and Endoscopic Videos
In recent years, the potential applications of machine learning to Minimally Invasive Surgery (MIS) have spurred interest in data sets that can be used to develop data-driven tools. This paper introduces a novel dataset recorded during ex vivo pseudo-cholecystectomy procedures on pig livers, utilizing the da Vinci Research Kit (dVRK). Unlike current datasets, ours bridges a critical gap by offering not only full kinematic data but also capturing all pedal inputs used during the procedure and providing a time-stamped record of the endoscope's movements. Contributed by seven surgeons, this data set introduces a new dimension to surgical robotics research, allowing the creation of advanced models for automating console functionalities. Our work addresses the existing limitation of incomplete recordings and imprecise kinematic data, common in other datasets. By introducing two models, dedicated to predicting clutch usage and camera activation, we highlight the dataset's potential for advancing automation in surgical robotics. The comparison of methodologies and time windows provides insights into the models' boundaries and limitations.
comment: 6 pages, 8 figures, 5 tables. Accepted for presentation at the 2024 International Symposium on Medical Robotics
D$^3$Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Rearrangement
Scene representation is a crucial design choice in robotic manipulation systems. An ideal representation is expected to be 3D, dynamic, and semantic to meet the demands of diverse manipulation tasks. However, previous works often lack all three properties simultaneously. In this work, we introduce D$^3$Fields -- dynamic 3D descriptor fields. These fields are implicit 3D representations that take in 3D points and output semantic features and instance masks. They can also capture the dynamics of the underlying 3D environments. Specifically, we project arbitrary 3D points in the workspace onto multi-view 2D visual observations and interpolate features derived from visual foundational models. The resulting fused descriptor fields allow for flexible goal specifications using 2D images with varied contexts, styles, and instances. To evaluate the effectiveness of these descriptor fields, we apply our representation to rearrangement tasks in a zero-shot manner. Through extensive evaluation in real worlds and simulations, we demonstrate that D$^3$Fields are effective for zero-shot generalizable rearrangement tasks. We also compare D$^3$Fields with state-of-the-art implicit 3D representations and show significant improvements in effectiveness and efficiency.
comment: Accepted to Conference on Robot Learning (CoRL 2024) as Oral Presentation. The first three authors contributed equally. Project Page: https://robopil.github.io/d3fields/
Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks
Simulators are powerful tools for autonomous robot learning as they offer scalable data generation, flexible design, and optimization of trajectories. However, transferring behavior learned from simulation data into the real world proves to be difficult, and the gap is usually mitigated with compute-heavy domain randomization methods or further model fine-tuning. We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks. To this end, we first build a simulator by integrating Gaussian Splatting with quadrotor flight dynamics, and then train robust navigation policies using Liquid neural networks. In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, crafty programming of expert demonstration training data, and the task understanding capabilities of Liquid networks. Through a series of quantitative flight tests, we demonstrate the robust transfer of navigation skills learned in a single simulation scene directly to the real world. We further show the ability to maintain performance beyond the training environment under drastic distribution and physical environment changes. Our learned Liquid policies, trained on single target manoeuvres curated from a photorealistic simulated indoor flight only, generalize to multi-step hikes onboard a real hardware platform outdoors.
Multiagent Systems
HEnRY: A Multi-Agent System Framework for Multi-Domain Contexts
This project, named HEnRY, aims to introduce a Multi-Agent System (MAS) into Intesa Sanpaolo. The name HEnRY summarizes the project's core principles: the Hierarchical organization of agents in a layered structure for efficient resource management; Efficient optimization of resources and operations to enhance overall performance; Reactive ability of agents to quickly respond to environmental stimuli; and Yielding adaptability and flexibility of agents to handle unexpected situations. The discussion covers two distinct research paths: the first focuses on the system architecture, and the second on the collaboration between agents. This work is not limited to the specific structure of the Intesa Sanpaolo context; instead, it leverages existing research in MAS to introduce a new solution. Since Intesa Sanpaolo is organized according to a model that aligns with international corporate governance best practices, this approach could also be relevant to similar scenarios.
Exploring Model Kinship for Merging Large Language Models
Model merging has become one of the key technologies for enhancing the capabilities and efficiency of Large Language Models (LLMs). However, our understanding of the expected performance gains and principles when merging any two models remains limited. In this work, we introduce model kinship, the degree of similarity or relatedness between LLMs, analogous to biological evolution. With comprehensive empirical analysis, we find that there is a certain relationship between model kinship and the performance gains after model merging, which can help guide our selection of candidate models. Inspired by this, we propose a new model merging strategy: Top-k Greedy Merging with Model Kinship, which can yield better performance on benchmark datasets. Specifically, we find that using model kinship as a criterion helps us perform model merging continually, alleviating the degradation (local optima) in model evolution and serving as a guide to escape these traps. Code is available at https://github.com/zjunlp/ModelKinship.
comment: Ongoing work
Nash equilibria in scalar discrete-time linear quadratic games
An open problem in linear quadratic (LQ) games has been characterizing the Nash equilibria. This problem has renewed relevance given the surge of work on understanding the convergence of learning algorithms in dynamic games. This paper investigates scalar discrete-time infinite-horizon LQ games with two agents. Even in this arguably simple setting, there are no results for finding $\textit{all}$ Nash equilibria. By analyzing the best response map, we formulate a polynomial system of equations characterizing the linear feedback Nash equilibria. This enables us to bring in tools from algebraic geometry, particularly the Gröbner basis, to study the roots of this polynomial system. Consequently, we can not only compute all Nash equilibria numerically, but we can also characterize their number with explicit conditions. For instance, we prove that the LQ games under consideration admit at most three Nash equilibria. We further provide sufficient conditions for the existence of at most two Nash equilibria and sufficient conditions for the uniqueness of the Nash equilibrium. Our numerical experiments demonstrate the tightness of our bounds and showcase the increased complexity in settings with more than two agents.
Counterfactual Effect Decomposition in Multi-Agent Sequential Decision Making
We address the challenge of explaining counterfactual outcomes in multi-agent Markov decision processes. In particular, we aim to explain the total counterfactual effect of an agent's action on the outcome of a realized scenario through its influence on the environment dynamics and the agents' behavior. To achieve this, we introduce a novel causal explanation formula that decomposes the counterfactual effect by attributing to each agent and state variable a score reflecting their respective contributions to the effect. First, we show that the total counterfactual effect of an agent's action can be decomposed into two components: one measuring the effect that propagates through all subsequent agents' actions and another related to the effect that propagates through the state transitions. Building on recent advancements in causal contribution analysis, we further decompose these two effects as follows. For the former, we consider agent-specific effects -- a causal concept that quantifies the counterfactual effect of an agent's action that propagates through a subset of agents. Based on this notion, we use Shapley value to attribute the effect to individual agents. For the latter, we consider the concept of structure-preserving interventions and attribute the effect to state variables based on their "intrinsic" contributions. Through extensive experimentation, we demonstrate the interpretability of our decomposition approach in a Gridworld environment with LLM-assisted agents and a sepsis management simulator.
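As a rough illustration of the Shapley-value attribution step mentioned in the abstract above, the following sketch computes exact Shapley values by enumerating player orderings. The function name `shapley_values` and the toy value function are assumptions made for illustration only; they are not the paper's agent-specific effects or its implementation.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by averaging marginal contributions over all orderings.

    `value` maps a frozenset of players to a real number (hypothetical toy input).
    Cost grows factorially with the number of players, so this is only for tiny games.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_effect(coalition):
    """Hypothetical effect: agents a and b matter only jointly, c adds 0.5 alone."""
    joint = 1.0 if {"a", "b"} <= coalition else 0.0
    solo = 0.5 if "c" in coalition else 0.0
    return joint + solo


scores = shapley_values(["a", "b", "c"], toy_effect)
```

In this toy game the scores sum to the value of the full coalition, which is the efficiency property that makes Shapley values a natural way to split a total effect among agents.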
Aegis:An Advanced LLM-Based Multi-Agent for Intelligent Functional Safety Engineering
Functional safety is a critical aspect of automotive engineering, encompassing all phases of a vehicle's lifecycle, including design, development, production, operation, and decommissioning. This domain involves highly knowledge-intensive tasks. This paper introduces Aegis: An Advanced LLM-Based Multi-Agent for Intelligent Functional Safety Engineering. Aegis is specifically designed to support complex functional safety tasks within the automotive sector. It is tailored to perform Hazard Analysis and Risk Assessment (HARA), document Functional Safety Requirements (FSR), and plan test cases for Automatic Emergency Braking (AEB) systems. The most advanced version, Aegis-Max, leverages Retrieval-Augmented Generation (RAG) and reflective mechanisms to enhance its capability in managing complex, knowledge-intensive tasks. Additionally, targeted prompt refinement by professional functional safety practitioners can significantly optimize Aegis's performance in the functional safety domain. This paper demonstrates the potential of Aegis to improve the efficiency and effectiveness of functional safety processes in automotive engineering.
Enhancing LLM Trading Performance with Fact-Subjectivity Aware Reasoning
While many studies show that more advanced LLMs perform better on tasks such as math and trading, we notice that in cryptocurrency trading, stronger LLMs often perform worse than weaker ones. To study how this counter-intuitive phenomenon occurs, we examine the LLM reasoning processes behind trading decisions. We find that separating the reasoning process into factual and subjective components can lead to higher profits. Building on this insight, we introduce a multi-agent framework, FS-ReasoningAgent, which enables LLMs to recognize and learn from both factual and subjective reasoning. Extensive experiments demonstrate that this framework enhances LLM trading performance in cryptocurrency markets. Additionally, an ablation study reveals that relying on subjective news tends to generate higher returns in bull markets, whereas focusing on factual information yields better results in bear markets. Our code and data are available at https://anonymous.4open.science/r/FS-ReasoningAgent-B55F/.
Corridor Generating Algorithm for Multi-Agent Pathfinding
In this paper, we solve the classical Multi-agent Pathfinding (MAPF) problem. Existing approaches struggle to solve dense MAPF instances. We propose a Corridor Generating Algorithm for MAPF, namely CGA-MAPF. In CGA-MAPF, the agents build corridors, sets of connected vertices, from their current locations towards their goals and evacuate other agents out of the corridors to avoid collisions and deadlocks. The proposed algorithm has a reachability property, i.e., every agent is guaranteed to reach its goal location at some point. In the experimental section, we demonstrate that CGA-MAPF outperforms baseline algorithms in terms of success rate across diverse MAPF benchmark grids, achieving state-of-the-art performance.
Time-Varyingness in Auction Breaks Revenue Equivalence
Auction is one of the most representative buying-selling systems. A celebrated study shows that the seller's expected revenue is equal in equilibrium, regardless of the type of auction, typically first-price and second-price auctions. Here, however, we hypothesize that when some auction environments vary with time, this revenue equivalence may not be maintained. In second-price auctions, the equilibrium strategy is robustly feasible. Conversely, in first-price auctions, the buyers must continue to adapt their strategies according to the environment of the auction. Surprisingly, we prove that revenue equivalence can be broken in both directions. First-price auctions bring larger or smaller revenue than second-price auctions, case by case, depending on how the value of an item varies. Our experiments also demonstrate revenue inequivalence in various scenarios, where the value varies periodically or randomly. This study uncovers a phenomenon, the breaking of revenue equivalence by the time-varyingness in auctions, that likely occurs in real-world auctions, revealing its underlying mechanism.
comment: 11 pages, 3 figures (main); 7 pages, 1 figure (appendix)
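For context on the static baseline that the abstract above departs from, here is a minimal Monte Carlo sketch of classical revenue equivalence. It assumes independent private values drawn uniformly from [0, 1] and the textbook symmetric equilibrium bid (n-1)/n times the value in the first-price auction; the function name and parameters are illustrative, and the sketch does not model the time-varying setting studied in the paper.

```python
import random


def simulate_revenues(n_bidders=3, n_rounds=100_000, seed=0):
    """Average seller revenue in first-price and second-price auctions (static case).

    Assumes i.i.d. Uniform[0, 1] values, truthful bidding in the second-price
    auction, and the equilibrium bid (n - 1) / n * value in the first-price auction.
    """
    rng = random.Random(seed)
    first_total = 0.0
    second_total = 0.0
    for _ in range(n_rounds):
        values = sorted(rng.random() for _ in range(n_bidders))
        # First-price: the highest-value bidder wins and pays their own shaded bid.
        first_total += (n_bidders - 1) / n_bidders * values[-1]
        # Second-price: the highest-value bidder wins and pays the second-highest value.
        second_total += values[-2]
    return first_total / n_rounds, second_total / n_rounds


first_price, second_price = simulate_revenues()
```

Under these assumptions both averages approach (n-1)/(n+1), which is 0.5 for three bidders. The abstract's claim is that this equality can fail once the item's value changes over time.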
Voter Participation Control in Online Polls
News outlets, surveyors, and other organizations often conduct polls on social networks to gain insights into public opinion. Such a poll is typically started by someone on a social network who sends it to her friends. If a person participates in the poll, the poll information gets published on her wall, which in turn enables her friends to participate, and the process continues. Eventually, a subset of the population participates in the poll, and the pollster learns the outcome of that poll. We initiate the study of a new but natural type of election control in such online elections. We study how difficult or easy it is for a malicious influencer to sway the outcome of such polls in favor of or against a candidate (constructive vs. destructive control) by nudging or bribing people into seemingly harmless actions like non-participation. These questions are important from the standpoint of studying the power of resistance of online voting against malicious behavior. The destructive version is also important to quantify the robustness of the winner of an online vote. We show that both problems are computationally intractable even if the election is over only two candidates and the influencer has an infinite amount of money to spend (that is, every voter can be persuaded to not participate). We strengthen this result by proving that the computational task remains substantially challenging even if the underlying network is a tree. Finally, we show that there is a polynomial-time algorithm for the constructive version of the problem when we have O(1) candidates, and the treewidth of the underlying graph is O(1); the algorithm for the destructive version does not even need to assume O(1) number of candidates. Hence, we observe that the destructive version is computationally easier than the constructive version.
Using Protected Attributes to Consider Fairness in Multi-Agent Systems
Fairness in Multi-Agent Systems (MAS) has been extensively studied, particularly in reward distribution among agents in scenarios such as goods allocation, resource division, lotteries, and bargaining systems. Fairness in MAS depends on various factors, including the system's governing rules, the behaviour of the agents, and their characteristics. Yet, fairness in human society often involves evaluating disparities between disadvantaged and privileged groups, guided by principles of Equality, Diversity, and Inclusion (EDI). Taking inspiration from the work on algorithmic fairness, which addresses bias in machine learning-based decision-making, we define protected attributes for MAS as characteristics that should not disadvantage an agent in terms of its expected rewards. We adapt fairness metrics from the algorithmic fairness literature -- namely, demographic parity, counterfactual fairness, and conditional statistical parity -- to the multi-agent setting, where self-interested agents interact within an environment. These metrics allow us to evaluate the fairness of MAS, with the ultimate aim of designing MAS that do not disadvantage agents based on protected attributes.
Large Language Model-driven Multi-Agent Simulation for News Diffusion Under Different Network Structures
The proliferation of fake news in the digital age has raised critical concerns, particularly regarding its impact on societal trust and democratic processes. Diverging from conventional agent-based simulation approaches, this work introduces an innovative approach by employing a large language model (LLM)-driven multi-agent simulation to replicate complex interactions within information ecosystems. We investigate key factors that facilitate news propagation, such as agent personalities and network structures, while also evaluating strategies to combat misinformation. Through simulations across varying network structures, we demonstrate the potential of LLM-based agents in modeling the dynamics of misinformation spread, validating the influence of agent traits on the diffusion process. Our findings emphasize the advantages of LLM-based simulations over traditional techniques, as they uncover underlying causes of information spread -- such as agents promoting discussions -- beyond the predefined rules typically employed in existing agent-based models. Additionally, we evaluate three countermeasure strategies, discovering that brute-force blocking influential agents in the network or announcing news accuracy can effectively mitigate misinformation. However, their effectiveness is influenced by the network structure, highlighting the importance of considering network structure in the development of future misinformation countermeasures.
Security Threats in Agentic AI System
This research paper explores the privacy and security threats posed to an Agentic AI system with direct access to database systems. Such access introduces significant risks, including unauthorized retrieval of sensitive information, potential exploitation of system vulnerabilities, and misuse of personal or confidential data. The complexity of AI systems combined with their ability to process and analyze large volumes of data increases the chances of data leaks or breaches, which could occur unintentionally or through adversarial manipulation. Furthermore, as AI agents evolve with greater autonomy, their capacity to bypass or exploit security measures becomes a growing concern, heightening the need to address these critical vulnerabilities in agentic systems.
comment: 8 pages, 3 figures
I Want to Break Free! Persuasion and Anti-Social Behavior of LLMs in Multi-Agent Settings with Social Hierarchy
As Large Language Model (LLM)-based agents become increasingly autonomous and interact more freely with each other, studying interactions between them becomes crucial to anticipate emergent phenomena and potential risks. Drawing inspiration from the widely popular Stanford Prison Experiment, we contribute to this line of research by studying interaction patterns of LLM agents in a context characterized by strict social hierarchy. We do so by specifically studying two types of phenomena: persuasion and anti-social behavior in simulated scenarios involving a guard and a prisoner agent who seeks to achieve a specific goal (i.e., obtaining additional yard time or escaping from prison). Leveraging 200 experimental scenarios for a total of 2,000 machine-machine conversations across five different popular LLMs, we provide a set of noteworthy findings. First, we document how some models consistently fail in carrying out a conversation in our multi-agent setup where power dynamics are at play. Second, for the models that were able to engage in successful interactions, we empirically show how the goal that an agent is set to achieve impacts primarily its persuasiveness, while having a negligible effect with respect to the agent's anti-social behavior. Third, we highlight how agents' personas, and particularly the guard's personality, drive both the likelihood of successful persuasion from the prisoner and the emergence of anti-social behaviors. Fourth, we show that even without explicitly prompting for specific personalities, anti-social behavior emerges by simply assigning agents' roles. These results bear implications for the development of interactive LLM agents as well as the debate on their societal impact.
Q-ITAGS: Quality-Optimized Spatio-Temporal Heterogeneous Task Allocation with a Time Budget
Complex multi-objective missions require the coordination of heterogeneous robots at multiple interconnected levels, such as coalition formation, scheduling, and motion planning. The associated challenges are exacerbated when solutions to these interconnected problems need to simultaneously maximize task performance and respect practical constraints on time and resources. In this work, we formulate a new class of spatiotemporal heterogeneous task allocation problems that formalize these complexities. We then contribute a novel framework, named Quality-Optimized Incremental Task Allocation Graph Search (Q-ITAGS), to solve such problems. Q-ITAGS offers a flexible interleaved framework that i) explicitly models and optimizes the effect of the collective capabilities on task performance via learnable trait-quality maps, and ii) respects both resource and spatiotemporal constraints including a user-specified time budget (i.e., maximum makespan). In addition to algorithmic contributions, we derive theoretical suboptimality bounds in terms of task performance that vary as a function of a single hyperparameter. Detailed experiments involving a simulated emergency response task and a real-world video game dataset reveal that i) Q-ITAGS results in superior team performance compared to a state-of-the-art method, while also respecting complex spatiotemporal and resource constraints, ii) Q-ITAGS efficiently learns trait-quality maps to enable effective trade-off between task performance and resource constraints, and iii) Q-ITAGS suboptimality bounds consistently hold in practice.
comment: arXiv admin note: text overlap with arXiv:2209.13092
CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications
Autonomous robot operation in unstructured environments is often underpinned by spatial understanding through vision. Systems composed of multiple concurrently operating robots additionally require access to frequent, accurate and reliable pose estimates. In this work, we propose CoViS-Net, a decentralized visual spatial foundation model that learns spatial priors from data, enabling pose estimation as well as spatial comprehension. Our model is fully decentralized, platform-agnostic, executable in real-time using onboard compute, and does not require existing networking infrastructure. CoViS-Net provides relative pose estimates and a local bird's-eye-view (BEV) representation, even without camera overlap between robots (in contrast to classical methods). We demonstrate its use in a multi-robot formation control task across various real-world settings. We provide code, models and supplementary material online. https://proroklab.github.io/CoViS-Net/
Learning to Control and Coordinate Mixed Traffic Through Robot Vehicles at Complex and Unsignalized Intersections
Intersections are essential road infrastructures for traffic in modern metropolises. However, they can also be the bottleneck of traffic flows as a result of traffic incidents or the absence of traffic coordination mechanisms such as traffic lights. Recently, various control and coordination mechanisms that are beyond traditional control methods have been proposed to improve the efficiency of intersection traffic. Amongst these methods, the control of foreseeable mixed traffic that consists of human-driven vehicles (HVs) and robot vehicles (RVs) has emerged. In this project, we propose a decentralized multi-agent reinforcement learning approach for the control and coordination of mixed traffic at real-world, complex intersections -- a topic that has not been previously explored. Comprehensive experiments are conducted to show the effectiveness of our approach. In particular, we show that using 5% RVs, we can prevent congestion formation inside a complex intersection under the actual traffic demand of 700 vehicles per hour. In contrast, without RVs, congestion starts to develop when the traffic demand reaches as low as 200 vehicles per hour. When RVs make up more than 60% of traffic, our method starts to achieve performance comparable to or even better than traffic signals on the average waiting time of all vehicles at the intersection. Our method is also robust against both blackout events and sudden RV percentage drops, and enjoys excellent generalizability, which is illustrated by its successful deployment in two unseen intersections.
comment: This paper introduces the first method to control and coordinate mixed traffic (i.e., human-driven vehicles and robot vehicles) at unsignalized intersections with both complicated topology and real-world traffic demands. The International Journal of Robotics Research. 2024;0(0)
Systems and Control (CS)
Circulating Currents in Windings: Fundamental Property
Circulating currents in windings refer to unwanted electrical currents flowing between the parallel conductors of a winding. These currents arise due to several phenomena such as asymmetries, imperfections in the winding layout, and differences in electric potential between the parallel conductors. This effect is typically visible in windings of transformers, motors, or generators. Under on-load conditions, this is equivalent to having a current unevenly distributed between parallel conductors. Circulating currents have two main drawbacks: increased losses in windings and potential degradation of insulation over time. The former is an intuitive property that is widely acknowledged in the literature. This paper presents a formal proof of this fundamental property, building upon the authors' previous work and embedding it within a rigorous mathematical framework. The mathematical definition of circulating currents is provided, along with a case application in an electric machine.
Zeroth-Order Feedback Optimization in Multi-Agent Systems: Tackling Coupled Constraints
This paper investigates distributed zeroth-order feedback optimization in multi-agent systems with coupled constraints, where each agent operates its local action vector and observes only zeroth-order information to minimize a global cost function subject to constraints in which the local actions are coupled. Specifically, we employ two-point zeroth-order gradient estimation with delayed information to construct stochastic gradients, and leverage the constraint extrapolation technique and the averaging consensus framework to effectively handle the coupled constraints. We also provide convergence rate and oracle complexity results for our algorithm, characterizing its computational efficiency and scalability by rigorous theoretical analysis. Numerical experiments are conducted to validate the algorithm's effectiveness.
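To make the two-point zeroth-order gradient estimation named in the abstract above concrete, here is a minimal single-agent sketch. It uses the standard estimator d * (f(x + delta*u) - f(x - delta*u)) / (2*delta) * u with u drawn uniformly from the unit sphere. The delayed information, coupled constraints, constraint extrapolation, and consensus steps of the paper are deliberately omitted, and all names and step sizes are assumptions chosen for illustration.

```python
import math
import random


def two_point_gradient(f, x, delta, rng):
    """Two-point zeroth-order gradient estimate of f at x using only function values."""
    d = len(x)
    # Draw a direction uniformly from the unit sphere by normalizing a Gaussian vector.
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = d * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u]


def zeroth_order_descent(f, x0, steps=2000, step_size=0.05, delta=1e-3, seed=0):
    """Plain gradient descent driven by the two-point estimator (unconstrained)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = two_point_gradient(f, x, delta, rng)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x


def toy_cost(x):
    """Hypothetical quadratic cost with minimizer (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


x_star = zeroth_order_descent(toy_cost, [0.0, 0.0])
```

Each iteration queries the cost twice and never touches its analytic gradient, which is the property that makes such estimators attractive when agents can only observe cost values.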
A Communication Consistent Approach to Signal Temporal Logic Task Decomposition in Multi-Agent Systems
We consider the problem of decomposing a global task assigned to a multi-agent system, expressed as a formula within a fragment of Signal Temporal Logic (STL), under range-limited communication. Given a global task expressed as a conjunction of local tasks defined over the individual and relative states of agents in the system, we propose representing task dependencies among agents as edges of a suitably defined task graph. At the same time, range-limited communication naturally induces the definition of a communication graph that defines which agents have access to each other's states. Within these settings, inconsistencies arise when a task dependency between a pair of agents is not supported by a corresponding communication link due to the limited communication range. As a result, state feedback control laws previously derived to achieve the tasks' satisfaction cannot be leveraged. We propose a task decomposition mechanism to distribute tasks assigned to pairs of non-communicating agents in the system as conjunctions of tasks defined over the relative states of communicating agents, thus enforcing consistency between task and communication graphs. Assuming the super-level sets of the predicate functions composing the STL tasks are bounded polytopes, our task decomposition mechanism can be cast as a parameter optimization problem and solved via state-of-the-art decentralized convex optimization algorithms. To guarantee the soundness of our approach, we present various conditions under which the tasks defined in the applied STL fragment are unsatisfiable, and we show sufficient conditions such that our decomposition approach yields satisfiable global tasks after decomposition.
Optimal Network Expansion Planning Considering Uncertain Dynamic Thermal Line Rating
This paper examines the integrated generation and transmission expansion planning problem to address the growing challenges associated with increasing power network loads. The proposed approach optimizes the operation and investment costs for new generation units and transmission lines, while also considering the environmental benefits of integrating renewable energy sources (RES) and the impact of electric vehicle (EV) charging on the grid. The inherent uncertainties in demand, EV charging loads, and RES generation are managed using a hybrid stochastic-robust optimization approach. Additionally, the model integrates Dynamic Thermal Line Rating (DTLR) to improve the efficiency and resilience of transmission lines. The framework also tackles the uncertainty related to DTLR, incorporating a heuristic linearization technique to reduce model complexity. The effectiveness of the proposed model and techniques is evaluated through simulations conducted on two case studies: the modified IEEE 6-bus system and the IEEE 24-bus Reliability Test System.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Mean Field-based Dynamic Backoff Optimization for MIMO-enabled Grant-Free NOMA in Massive IoT Networks
In the 6G Internet of Things (IoT) paradigm, unprecedented challenges will arise in providing massive connectivity, ultra-low latency, and energy efficiency for ultra-dense IoT devices. To address these challenges, we explore the non-orthogonal multiple access (NOMA) based grant-free random access (GFRA) schemes in the cellular uplink to support massive IoT devices with high spectrum efficiency and low access latency. In particular, we focus on optimizing the backoff strategy of each device when transmitting time-sensitive data samples to a multiple-input multiple-output (MIMO)-enabled base station subject to energy constraints. To cope with the dynamically varying channel and the severe uplink interference due to the uncoordinated grant-free access, we formulate the optimization problem as a multi-user non-cooperative dynamic stochastic game (MUN-DSG). To avoid the curse of dimensionality as the number of devices grows large, the optimization problem is transformed into a mean field game (MFG), and its Nash equilibrium can be achieved by solving the corresponding Hamilton-Jacobi-Bellman (HJB) and Fokker-Planck-Kolmogorov (FPK) equations. Thus, a Mean Field-based Dynamic Backoff (MFDB) scheme is proposed as the optimal GFRA solution for each device. Extensive simulations have been conducted to compare the proposed MFDB with contemporary random access approaches like access class barring (ACB), slotted-Additive Links On-line Hawaii Area (ALOHA), and minimum backoff (MB) under both static and dynamic channels, and the results show that MFDB achieves the lowest access delay and cumulative cost over multiple transmission frames. Keywords: 6G; Internet of Things; grant-free random access; NOMA; dynamic backoff
comment: 31 pages, 13 figures
Modeling, Prediction and Risk Management of Distribution System Voltages with Non-Gaussian Probability Distributions
High renewable energy penetration into power distribution systems causes a substantial risk of exceeding voltage security limits, which needs to be accurately assessed and properly managed. However, the existing methods usually rely on the joint probability models of power generation and loads provided by probabilistic prediction to quantify the voltage risks, where inaccurate prediction results could lead to over- or underestimated risks. This paper proposes an uncertain voltage component (UVC) prediction method for assessing and managing voltage risks. First, we define the UVC to evaluate voltage variations caused by the uncertainties associated with power generation and loads. Second, we propose a Gaussian mixture model-based probabilistic UVC prediction method to depict the non-Gaussian distribution of voltage variations. Then, we derive the voltage risk indices, including value-at-risk (VaR) and conditional value-at-risk (CVaR), based on the probabilistic UVC prediction model. Third, we investigate the mechanism of UVC-based voltage risk management and establish the voltage risk management problems, which are reformulated into linear programming or mixed-integer linear programming for convenient solutions. The proposed method is tested on power distribution systems with actual photovoltaic power and load data and compared with those considering probabilistic prediction of nodal power injections. Numerical results show that the proposed method is computationally efficient in assessing voltage risks and outperforms existing methods in managing voltage risks. The deviation of voltage risks obtained by the proposed method is only 15% of that by the methods based on probabilistic prediction of nodal power injections.
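The VaR and CVaR indices named in the abstract above can be illustrated with a small sample-based sketch. It draws samples from a hypothetical two-component Gaussian mixture standing in for a non-Gaussian voltage deviation and computes the empirical VaR (a quantile) and CVaR (the mean of the tail at or beyond that quantile). The mixture parameters, function names, and the empirical estimator are assumptions for illustration; the paper derives its indices from its own UVC prediction model.

```python
import random


def sample_mixture(weights, means, stds, n, seed=0):
    """Draw n samples from a one-dimensional Gaussian mixture (toy parameters)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Pick a mixture component, then sample from that Gaussian.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        samples.append(rng.gauss(means[k], stds[k]))
    return samples


def var_cvar(samples, alpha=0.95):
    """Empirical value-at-risk and conditional value-at-risk at level alpha.

    VaR is taken as the order statistic at index floor(alpha * n); CVaR is the
    mean of all samples at or above that order statistic.
    """
    ordered = sorted(samples)
    idx = min(int(alpha * len(ordered)), len(ordered) - 1)
    var = ordered[idx]
    tail = ordered[idx:]
    cvar = sum(tail) / len(tail)
    return var, cvar


# Hypothetical skewed deviation: mostly small, occasionally large.
deviations = sample_mixture([0.8, 0.2], [0.0, 0.03], [0.005, 0.01], n=20_000)
var95, cvar95 = var_cvar(deviations, alpha=0.95)
```

CVaR is never smaller than VaR at the same level because it averages only the outcomes in the tail, which is why it is often preferred when rare but large violations matter.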
A Control Theoretic Study on Omnidirectional MAVs with Minimum Number of Actuators and No Internal Forces at Any Orientation
We propose a new class of multirotor aerial vehicle designs composed of a multi-body structure in which a main body is connected by passive joints to links equipped with propellers. We investigate some instances of this class, some of which are shown to achieve omnidirectionality while having a minimum number of inputs, equal to the number of degrees of freedom (DoFs) of the main body, only uni-directional positive-thrust propellers, and no internal forces generated at steady state. After the dynamics are derived following the Euler-Lagrange approach, an I/O dynamic feedback linearization strategy is used to show the controllability of any desired pose with stable zero dynamics. We finally verify the developed controller with closed-loop simulations.
Design and Analysis of a Metamaterial-Inspired Absorber for data rate in 52% RF-to-DC conversion Efficiency Dual-band SWIPT system
This paper proposes a novel metamaterial-inspired absorber designed to enhance the data rate of a simultaneous wireless information and power transfer (SWIPT) system with 52% RF-to-DC conversion efficiency operating through biological tissue. The proposed absorber includes split-ring resonators (SRRs) and demonstrates significant permeability characteristics, with both the real and imaginary parts being negative and close to -1. It also improves isolation by around 5dB at a WPT distance of 9mm. A 5mm thick phantom is used for biological tissue in this study. Experimental results show that, in a SWIPT system including a rectifier with 52% RF-to-DC conversion efficiency at a WPT distance of 9mm, embedding this absorber between the power and signal ports at the Tx side yields a 5dB improvement in isolation performance. The proposed absorber enables a 7MB/s improvement in data rate and allows signals to be transmitted with 5dBm weaker power than in a SWIPT system without the absorber.
AoI-Aware Resource Allocation for Smart Multi-QoS Provisioning
The Age of Information (AoI) has recently gained recognition as a critical quality-of-service (QoS) metric for quantifying the freshness of status updates, playing a crucial role in supporting massive ultra-reliable and low-latency communications (mURLLC) services. In mURLLC scenarios, due to the inherent system dynamics and varying environmental conditions, optimizing AoI under such multi-QoS constraints considering both delay and reliability often results in non-convex and computationally intractable problems. Motivated by the demonstrated efficacy of deep reinforcement learning (DRL) in addressing large-scale networking challenges, this work aims to apply DRL techniques to derive optimal resource allocation solutions in real time. Despite its potential, the effective integration of FBC in DRL-based AoI optimization remains underexplored, especially in addressing the challenge of simultaneously upper-bounding both delay and error-rate. To address these challenges, we propose a DRL-based framework for AoI-aware optimal resource allocation in mURLLC-driven multi-QoS schemes, leveraging AoI as a core metric within the finite blocklength regime. First, we design a wireless communication architecture and AoI-based modeling framework that incorporates FBC. Second, we proceed by deriving upper-bounded peak AoI and delay violation probabilities using stochastic network calculus (SNC). Subsequently, we formulate an optimization problem aimed at minimizing the peak AoI violation probability through FBC. Third, we develop DRL algorithms to determine optimal resource allocation policies that meet statistical delay and error-rate requirements for mURLLC. Finally, to validate the effectiveness of the developed schemes, we have executed a series of simulations.
GAN Based Top-Down View Synthesis in Reinforcement Learning Environments
Human actions are based on the mental perception of the environment. Even when not all aspects of an environment are visible, humans have an internal mental model that can generalize the partially visible scenes to fully constructed and connected views. This internal mental model uses learned abstract representations of spatial and temporal aspects of the environments encountered in the past. Artificial agents in reinforcement learning environments also benefit from learning a representation of the environment from experience. Such a representation provides the agent with viewpoints that are not directly visible to it, helping it make better policy decisions. It can also be used to predict the future state of the environment. This project explores learning the top-down view of an RL environment from the artificial agent's first-person view observations with a generative adversarial network (GAN). The top-down view is useful because it provides a complete overview of the environment by building a map of the entire environment. It provides information about the objects' dimensions and shapes along with their positions relative to one another. Initially, when only a partial observation of the environment is visible to the agent, only a partial top-down view is generated. As the agent explores the environment through a set of actions, the generated top-down view becomes complete. This generated top-down view can assist the agent in making better policy decisions. The focus of the project is to learn the top-down view of an RL environment. It does not deal with any reinforcement learning task.
A Hierarchical DRL Approach for Resource Optimization in Multi-RIS Multi-Operator Networks
As reconfigurable intelligent surfaces (RIS) emerge as a pivotal technology in the upcoming sixth-generation (6G) networks, their deployment within practical multiple operator (OP) networks presents significant challenges, including the coordination of RIS configurations among OPs, interference management, and privacy maintenance. A promising strategy is to treat RIS as a public resource managed by an RIS provider (RP), which can enhance resource allocation efficiency by allowing dynamic access for multiple OPs. However, the intricate nature of coordinating management and optimizing RIS configurations significantly complicates the implementation process. In this paper, we propose a hierarchical deep reinforcement learning (HDRL) approach that decomposes the complicated RIS resource optimization problem into several subtasks. Specifically, a top-level RP-agent is responsible for RIS allocation, while low-level OP-agents control their assigned RISs and handle beamforming, RIS phase-shifts, and user association. By utilizing the semi-Markov decision process (SMDP) theory, we establish a sophisticated interaction mechanism between the RP and OPs, and introduce an advanced hierarchical proximal policy optimization (HPPO) algorithm. Furthermore, we propose an improved sequential-HPPO (S-HPPO) algorithm to address the curse of dimensionality encountered with a single RP-agent. Experimental results validate the stability of the HPPO algorithm across various environmental parameters, demonstrating its superiority over other benchmarks for joint resource optimization. Finally, we conduct a detailed comparative analysis between the proposed S-HPPO and HPPO algorithms, showcasing that the S-HPPO algorithm achieves faster convergence and improved performance in large-scale RIS allocation scenarios.
AI-Aided Kalman Filters
The Kalman filter (KF) and its variants are among the most celebrated algorithms in signal processing. These methods are used for state estimation of dynamic systems by relying on mathematical representations in the form of simple state-space (SS) models, which may be crude and inaccurate descriptions of the underlying dynamics. Emerging data-centric artificial intelligence (AI) techniques tackle these tasks using deep neural networks (DNNs), which are model-agnostic. Recent developments illustrate the possibility of fusing DNNs with classic Kalman-type filtering, obtaining systems that learn to track in partially known dynamics. This article provides a tutorial-style overview of design approaches for incorporating AI in aiding KF-type algorithms. We review both generic and dedicated DNN architectures suitable for state estimation, and provide a systematic presentation of techniques for fusing AI tools with KFs and for leveraging partial SS modeling and data, categorizing design approaches into task-oriented and SS model-oriented. The usefulness of each approach in preserving the individual strengths of model-based KFs and data-driven DNNs is investigated in a qualitative and quantitative study, whose code is publicly available, illustrating the gains of hybrid model-based/data-driven designs. We also discuss existing challenges and future research directions that arise from fusing AI and Kalman-type algorithms.
comment: Submitted to IEEE Signal Processing Magazine
Using Intermittent Chaotic Clocks to Secure Cryptographic Chips
This letter proposes using intermittent chaotic clocks, generated from chaotic maps, to drive cryptographic chips running the Advanced Encryption Standard as a countermeasure against Correlation Power Analysis attacks. Five different chaotic maps -- namely: the Logistic map, the Bernoulli shift map, the Henon map, the Tent map, and the Ikeda map -- are used in this work to generate chaotic clocks. The performance of these chaotic clocks is evaluated in terms of timing overhead and the resilience of the driven chip against Correlation Power Analysis attacks. All proposed chaotic clocking schemes successfully protect the driven chip against attacks, with the clocks produced by the optimized Ikeda, Henon, and Logistic maps achieving the lowest timing overhead. These optimized maps, due to their intermittent chaotic behavior, exhibit lower timing overhead compared to previous work. Notably, the chaotic clock generated by the optimized Ikeda map approaches the theoretical limit of timing overhead, i.e., half the execution time of a reference periodic clock.
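As a rough illustration of how a chaotic map could drive a clock, the sketch below iterates the Logistic map, one of the five maps named in the abstract above. The mapping from map output to clock period (`clock_periods`, `base_period`, `max_stretch`) is purely hypothetical. The abstract does not describe the letter's actual clock-generation circuit or its optimized, intermittent map parameters.

```python
def logistic_sequence(x0, r=4.0, n=8):
    """Iterate the logistic map x_{k+1} = r * x_k * (1 - x_k) n times."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


def clock_periods(xs, base_period=1.0, max_stretch=1.0):
    """Hypothetical mapping: stretch each clock period by a chaotic amount.

    Each period lies between base_period and base_period * (1 + max_stretch)
    when the map outputs lie in [0, 1].
    """
    return [base_period * (1.0 + max_stretch * x) for x in xs]


states = logistic_sequence(0.3)
periods = clock_periods(states, base_period=10.0)  # e.g. nanoseconds (assumed unit)
print(periods)
```

Because successive periods are irregular, power-trace samples from different encryptions are no longer aligned in time, which is the intuition behind using such clocks against correlation-based attacks.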
Vehicle Localization in GPS-Denied Scenarios Using Arc-Length-Based Map Matching
Automated driving systems face challenges in GPS-denied situations. To address this issue, kinematic dead reckoning is implemented using measurements from the steering angle, steering rate, yaw rate, and wheel speed sensors onboard the vehicle. However, dead reckoning methods suffer from drift. This paper provides an arc-length-based map matching method that uses a digital 2D map of the scenario in order to correct drift in the dead reckoning estimate. The kinematic model's prediction is used to introduce a temporal notion to the spatial information available in the map data. Results show reliable improvement in drift for all GPS-denied scenarios tested in this study. This innovative approach ensures that automated vehicles can maintain continuous and reliable navigation, significantly enhancing their safety and operational reliability in environments where GPS signals are compromised or unavailable.
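The dead-reckoning step and the arc-length quantity mentioned in the abstract above can be sketched as follows. This is a simplified planar model that uses only speed and yaw rate. The paper's kinematic model also uses steering angle and steering rate, and its map-matching procedure is not reproduced here. The function names are assumptions.

```python
import math


def dead_reckon(pose, speed, yaw_rate, dt):
    """Advance an (x, y, heading) pose by one step of planar dead reckoning."""
    x, y, heading = pose
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    heading += yaw_rate * dt
    return (x, y, heading)


def arc_length(poses):
    """Cumulative distance travelled along a list of (x, y, heading) poses."""
    total = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(poses, poses[1:]):
        total += math.hypot(x1 - x0, y1 - y0)
    return total


# Integrate a gentle left turn at constant speed (illustrative values).
poses = [(0.0, 0.0, 0.0)]
for _ in range(50):
    poses.append(dead_reckon(poses[-1], speed=5.0, yaw_rate=0.1, dt=0.1))
print(poses[-1], arc_length(poses))
```

The travelled arc length is a natural index into a map stored as a path, so an estimate of it gives one way to look up where on the mapped path the vehicle should be.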
Towards Large Scale Atomic Manufacturing: Heterodyne Grating Interferometer with Zero Dead-Zone
This paper presents a novel heterodyne grating interferometer designed to meet the precise measurement requirements of next-generation lithography systems and large-scale atomic-level manufacturing. Utilizing a dual-frequency light source, the interferometer enables simultaneous measurement of three degrees of freedom. Key advancements include a compact zero Dead-Zone optical path configuration, which significantly enhances measurement reliability by mitigating the impact of light source fluctuations and air refractive index variations. A comprehensive crosstalk error analysis was conducted, resulting in a robust correction algorithm that reduces errors to below 5%. Performance testing of the prototype, which measures 90mm*90mm*40mm, demonstrated exceptional resolution (0.25 nm in the XY-axis and 0.3 nm in the Z-axis), superior linearity (6.9e-5, 8.1e-5 and 16.2e-5 for the X, Y, and Z axes, respectively), high repeatability (0.8 nm/1000 nm for the three axes) and stability (20 nm for the XY-axis and 60 nm for the Z-axis over 1000 seconds). Comparative analysis with existing measurement sensors highlights the proposed method's significant advantages in integration and multidimensional capability, and the method is expected to be widely used in fields such as integrated circuits, atomic-level manufacturing and aerospace technology.
comment: 8 pages, 11 figures
RTI-NMPC for Control of Autonomous Vehicles Using Implicit Discretization Methods
Recent efforts in the development of autonomous driving technology have induced great advancements in perception, planning and control systems. Model predictive control is one of the most popular advanced control methods, but its application to nonlinear systems still depends on the development of computationally efficient methods. This work presents a nonlinear model predictive control formulation based on real-time iteration using an implicit discretization of the system's dynamics, with the objective of achieving greater prediction accuracy and lower computational cost when dealing with stiff dynamical systems, as is the case for vehicle dynamics. The proposed method is described and later evaluated on a simulation scenario considering modeling errors and external disturbances. The presented results demonstrate the effectiveness of the method when it comes to tracking a given trajectory and its low computational burden, measured in terms of execution time.
comment: This work was submitted, accepted and presented at the 2024 Simpósio Brasileiro de Automação Inteligente - SBAI
Augmented Intelligence in Smart Intersections: Local Digital Twins-Assisted Hybrid Autonomous Driving
Vehicle-road collaboration is a promising approach for enhancing the safety and efficiency of autonomous driving by extending the intelligence of onboard systems to smart roadside infrastructures. The introduction of digital twins (DTs), particularly local DTs (LDTs) at the edge, in smart mobility presents a new embodiment of augmented intelligence, which could enhance information exchange and extract human driving expertise to improve onboard intelligence. This paper presents a novel LDT-assisted hybrid autonomous driving system for improving safety and efficiency in traffic intersections. By leveraging roadside units (RSUs) equipped with sensory and computing capabilities, the proposed system continuously monitors traffic, extracts human driving knowledge, and generates intersection-specific local driving agents through an offline reinforcement learning (RL) framework. When connected and automated vehicles (CAVs) pass through RSU-equipped intersections, RSUs can provide local agents to support safe and efficient driving in local areas. Meanwhile, they provide real-time cooperative perception (CP) to broaden onboard sensory horizons. The proposed LDT-assisted hybrid system is implemented with state-of-the-art products, e.g., CAVs and RSUs, and technologies, e.g., millimeter-wave (mmWave) communications. Hardware-in-the-loop (HiL) simulations and proof-of-concept (PoC) tests validate system performance from two standpoints: (i) The peak latency for CP and local agent downloading are 8.51 ms and 146 ms, respectively, aligning with 3GPP requirements for vehicle-to-everything (V2X) and model transfer use cases. Moreover, (ii) local driving agents can improve safety measures by 10% and reduce travel time by 15% compared with conventional onboard systems. The implemented prototype also demonstrates reliable real-time performance, fulfilling the targets of the proposed system design.
comment: 14 pages, 9 figures
When to Trust Your Data: Enhancing Dyna-Style Model-Based Reinforcement Learning With Data Filter
Reinforcement learning (RL) algorithms can be divided into two classes: model-free algorithms, which are sample-inefficient, and model-based algorithms, which suffer from model bias. Dyna-style algorithms combine these two approaches by using simulated data from an estimated environmental model to accelerate model-free training. However, their efficiency is compromised when the estimated model is inaccurate. Previous works address this issue by using model ensembles or pretraining the estimated model with data collected from the real environment, increasing computational and sample complexity. To tackle this issue, we introduce an out-of-distribution (OOD) data filter that removes simulated data from the estimated model that significantly diverges from data collected in the real environment. We show theoretically that this technique enhances the quality of simulated data. With the help of the OOD data filter, the data simulated from the estimated model better mimics the data collected by interacting with the real model. This improvement is evident in the critic updates compared to using the simulated data without the OOD data filter. Our experiment integrates the data filter into the model-based policy optimization (MBPO) algorithm. The results demonstrate that our method requires fewer interactions with the real environment to achieve a higher level of optimality than MBPO, even without a model ensemble.
Design Space Exploration of Embedded SoC Architectures for Real-Time Optimal Control
Empowering resource-limited robots to execute computationally intensive tasks like model/learning-based algorithms is challenging. Due to the complexity of the workload characteristic, the bottlenecks in different systems can depend on application requirements, preventing a single hardware architecture from being adequate across all robotics applications. This project provides a comprehensive design space exploration to determine optimal hardware computation platforms and architectures suitable for robotic algorithms. We profile and optimize representative architectural designs across general-purpose cores and specialized accelerators. Specifically, we compare CPUs, vector machines, and domain-specialized accelerators with kernel-level benchmarks and end-to-end representative robotic workloads. Our exploration provides a quantitative performance, area, and utilization comparison and analyzes the trade-offs between these representative distinct architectural designs. We demonstrate that the variation of hardware architecture choices depends on workload characteristics and application requirements. Finally, we explore how architectural modifications and software ecosystem optimization can alleviate bottlenecks and enhance utilization.
BOXR: Body and head motion Optimization framework for eXtended Reality
The emergence of standalone XR systems has enhanced user mobility, accommodating both subtle, frequent head motions and substantial, less frequent body motions. However, the pervasively used M2D latency metric, which measures the delay between the most recent motion and its corresponding display update, only accounts for head motions. This oversight can leave users prone to motion sickness if significant body motion is involved. Although existing methods optimize M2D latency through asynchronous task scheduling and reprojection methods, they introduce challenges like resource contention between tasks and outdated pose data. These challenges are further complicated by user motion dynamics and scene changes during runtime. To address these issues, we for the first time introduce the C2D latency metric, which captures the delay caused by body motions, and present BOXR, a framework designed to co-optimize both body and head motion delays within an XR system. BOXR enhances the coordination between M2D and C2D latencies by efficiently scheduling tasks to avoid contentions while maintaining an up-to-date pose in the output frame. Moreover, BOXR incorporates a motion-driven visual inertial odometer to adjust to user motion dynamics and employs scene-dependent foveated rendering to manage changes in the scene effectively. Our evaluations show that BOXR significantly outperforms state-of-the-art solutions in 11 EuRoC MAV datasets across 4 XR applications across 3 hardware platforms. In controlled motion and scene settings, BOXR reduces M2D and C2D latencies by up to 63% and 27%, respectively and increases frame rate by up to 43%. In practical deployments, BOXR achieves substantial reductions in real-world scenarios up to 42% in M2D latency and 31% in C2D latency while maintaining remarkably low miss rates of only 1.6% for M2D requirements and 1.0% for C2D requirements.
comment: Accepted to 45th IEEE Real-Time Systems Symposium (RTSS'24)
GyroCopter: Differential Bearing Measuring Trajectory Planner for Tracking and Localizing Radio Frequency Sources
Autonomous aerial vehicles can provide efficient and effective solutions for radio frequency (RF) source tracking and localizing problems with applications ranging from wildlife conservation to search and rescue operations. Existing lightweight, low-cost, bearing measurements-based methods with a single antenna-receiver sensor system configurations necessitate in situ rotations, leading to substantial measurement acquisition times restricting searchable areas and number of measurements. We propose a GyroCopter for the task. Our approach plans the trajectory of a multi-rotor unmanned aerial vehicle (UAV) whilst utilizing UAV flight dynamics to execute a constant gyration motion to derive "pseudo-bearing" measurements to track RF sources. The gyration-based pseudo-bearing approach: i) significantly reduces the limitations associated with in situ rotation bearing; while ii) capitalizing on the simplicity, affordability, and lightweight nature of signal strength measurement acquisition hardware to estimate bearings. This method distinguishes itself from other pseudo-bearing approaches by eliminating the need for additional hardware to maintain simplicity, lightweightness and cost-effectiveness. To validate our approach, we derived the optimal rotation speed and conducted extensive simulations and field missions with our GyroCopter to track and localize multiple RF sources. The results confirm the effectiveness of our method, highlighting its potential as a practical and rapid solution for RF source localization tasks.
comment: For a demonstration video, see https://youtu.be/OkmmQjD74Us
Cyber C2: Achieving Scrutability and Agency in Cyberspace Operations
Our thesis is that operating in cyberspace is challenging because cyberspace exhibits extreme variety, high malleability, and extreme velocity. These properties make cyberspace largely inscrutable and limit one's agency in cyberspace, where agency is the ability to exert influence to transform the state or behaviour of the environment. With this thesis, we explore the nature of cyberspace and of command and control (C2), and diagnose the challenges for cyber C2, with treatment to follow in future work.
comment: 16 pages. Published in proceedings of the 29th International Command and Control Research Symposium (ICCRTS), London UK, 2024
Two-Timescale Linear Stochastic Approximation: Constant Stepsizes Go a Long Way
Previous studies on two-timescale stochastic approximation (SA) mainly focused on bounding mean-squared errors under diminishing stepsize schemes. In this work, we investigate {\it constant} stepsize schemes through the lens of Markov processes, proving that the iterates of both timescales converge to a unique joint stationary distribution in Wasserstein metric. We derive explicit geometric and non-asymptotic convergence rates, as well as the variance and bias introduced by constant stepsizes in the presence of Markovian noise. Specifically, with two constant stepsizes $\alpha < \beta$, we show that the biases scale linearly with both stepsizes as $\Theta(\alpha)+\Theta(\beta)$ up to higher-order terms, while the variance of the slower iterate (resp., faster iterate) scales only with its own stepsize as $O(\alpha)$ (resp., $O(\beta)$). Unlike previous work, our results require no additional assumptions such as $\beta^2 \ll \alpha$ nor extra dependence on dimensions. These fine-grained characterizations allow tail-averaging and extrapolation techniques to reduce variance and bias, improving the mean-squared error bound to $O(\beta^4 + \frac{1}{t})$ for both iterates.
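A toy instance of the constant-stepsize setting described above, with tail averaging, might look like the sketch below. Everything specific here is an assumption for illustration: the scalar linear system, the target `b`, the stepsize values, and the use of i.i.d. Gaussian noise (the paper treats Markovian noise). The extrapolation step mentioned in the abstract is not shown.

```python
import random


def two_timescale_sa(alpha=0.01, beta=0.1, b=2.0, steps=20000, seed=0):
    """Run a scalar two-timescale linear SA with constant stepsizes alpha < beta.

    Slow iterate:  x <- x + alpha * (b - y + noise)
    Fast iterate:  y <- y + beta  * (x - y + noise)
    The noiseless fixed point is x = y = b.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    xs, ys = [], []
    for _ in range(steps):
        # Both updates use the values from the previous step.
        x_next = x + alpha * (b - y + rng.gauss(0.0, 1.0))
        y_next = y + beta * (x - y + rng.gauss(0.0, 1.0))
        x, y = x_next, y_next
        xs.append(x)
        ys.append(y)
    return xs, ys


def tail_average(values, fraction=0.5):
    """Average the last `fraction` of the iterates to reduce variance."""
    start = int(len(values) * (1.0 - fraction))
    tail = values[start:]
    return sum(tail) / len(tail)


xs, ys = two_timescale_sa()
print("last x:", xs[-1], "tail-averaged x:", tail_average(xs))
print("last y:", ys[-1], "tail-averaged y:", tail_average(ys))
```

With constant stepsizes the raw iterates keep fluctuating around the fixed point, so averaging over the tail of the trajectory is what recovers a low-variance estimate.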
Low-Power Encoding for PAM-3 DRAM Bus
The 3-level pulse amplitude modulation (PAM-3) signaling is expected to be widely used in memory interfaces for its greater voltage margins compared to PAM-4. To maximize the benefit of PAM-3, we propose three low-power data encoding algorithms: PAM3-DBI, PAM3-MF, and PAM3-SORT. With the DRAM memory traces from the gem5 computer architecture simulator running benchmarks, we evaluate the energy efficiency of our three PAM-3 encoding techniques. The experimental results show the proposed algorithms can reduce termination power for high-speed memory links significantly by 41% to 90% for benchmark programs.
comment: To appear in Proceedings of the 20th International Conference on Synthesis, Modeling, Analysis and Simulation Methods, and Applications to Circuit Design (SMACD 2024)
Kapitza-Inspired Stabilization of Non-Foster Circuits via Time Modulations
With his formal analysis in 1951, the physicist Pyotr Kapitza demonstrated that an inverted pendulum with an externally vibrating base can be stable in its upper position, thus overcoming the force of gravity. Kapitza's work is an example that an originally unstable system can become stable after a minor perturbation of its properties or initial conditions is applied. Inspired by his ideas, we show how non-Foster circuits can be stabilized with the application of external \textit{electrical vibration}, i.e., time modulations. Non-Foster circuits are highly appreciated in the engineering community since their bandwidth characteristics are not limited by passive-circuits bounds. Unfortunately, non-Foster circuits are usually unstable and they must be stabilized prior to operation. Here, we focus on the study of non-Foster $L(t)C$ circuits with time-varying inductors and time-invariant negative capacitors. We find an intrinsic connection between Kapitza's inverted pendulum and non-Foster $L(t)C$ resonators. Moreover, we show how positive time-varying modulations of $L(t)>0$ can overcome and stabilize non-Foster negative capacitances $C<0$. These findings open up an alternative manner of stabilizing electric circuits with the use of time modulations, and lay the groundwork for application of, what we coin \textit{Vibrational Electromagnetics}, in more complex media.
comment: 10 pages (7 pages main text, 3 pages supplementary materials), 4 figures
Attitude Estimation via Matrix Fisher Distributions on SO(3) Using Non-Unit Vector Measurements
This note presents a novel Bayesian attitude estimator with the matrix Fisher distribution on the special orthogonal group, which can smoothly accommodate both unit and non-unit vector measurements. The posterior attitude distribution is proven to be a matrix Fisher distribution with the assumption that non-unit vector measurement errors follow the isotropic Gaussian distributions and unit vector measurements follow the von-Mises Fisher distributions. Next, a global unscented transformation is proposed to approximate the full likelihood distribution with a matrix Fisher distribution for more generic cases of vector measurement errors following the non-isotropic Gaussian distributions. Following these, a Bayesian attitude estimator with the matrix Fisher distribution is constructed. Numerical examples are then presented. The proposed estimator exhibits advantageous performance compared with the previous attitude estimator with matrix Fisher distributions and the classic multiplicative extended Kalman filter in the case of non-unit vector measurements.
comment: 10 pages, 4 figures
Robust co-design framework for buildings operated by predictive control
Cost-effective decarbonisation of the built environment is a stepping stone to achieving net-zero carbon emissions, since buildings are responsible for more than a quarter of global energy-related CO$_2$ emissions. Improving energy utilization and decreasing costs naturally requires considering multiple domain-specific performance criteria. The resulting problem is often computationally infeasible. The paper proposes an approach based on decomposition and selection of significant operating conditions to achieve a formulation with reduced computational complexity. We present a robust framework to optimise the physical design, the controller, and the operation of residential buildings in an integrated fashion, considering external weather conditions and time-varying electricity prices. The framework explicitly includes operational constraints and increases the utilization of the energy generated by intermittent resources. A case study illustrates the potential of co-design in enhancing the reliability, flexibility and self-sufficiency of a system operating under different conditions. Specifically, numerical results demonstrate cost reductions of up to $30$% compared to a deterministic formulation. Furthermore, the proposed approach reduces computational time by a factor of at least $10$ compared to the original problem, with a deterioration in performance of only 0.6%.
CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications
Autonomous robot operation in unstructured environments is often underpinned by spatial understanding through vision. Systems composed of multiple concurrently operating robots additionally require access to frequent, accurate and reliable pose estimates. In this work, we propose CoViS-Net, a decentralized visual spatial foundation model that learns spatial priors from data, enabling pose estimation as well as spatial comprehension. Our model is fully decentralized, platform-agnostic, executable in real-time using onboard compute, and does not require existing networking infrastructure. CoViS-Net provides relative pose estimates and a local bird's-eye-view (BEV) representation, even without camera overlap between robots (in contrast to classical methods). We demonstrate its use in a multi-robot formation control task across various real-world settings. We provide code, models and supplementary material online. https://proroklab.github.io/CoViS-Net/
Sensitivity analysis and experimental evaluation of PID-like continuous sliding mode control
Continuous higher order sliding mode (CHOSM) controllers represent an efficient tool for disturbance rejection. For systems with relative degree r, CHOSM approaches provide theoretically exact compensation of the matched Lipschitz perturbation, ensuring finite-time convergence to the (r+1)-th sliding-mode set by using only information on the sliding output and its derivatives up to order (r-1). In this paper, we investigate the disturbance rejection properties of a PID-like CHOSM controller, as the simplest and most intuitive example, which incorporates nonlinear actions on the output error, its derivative, and the integral of its sign. We use the harmonic balance approach and develop an analysis of the propagation of the matched Lipschitz perturbation through the control loop in the frequency domain. The resulting solution takes the form of Bode-like loci, which also depend on the amplitude of the harmonic disturbances. Such amplitude-frequency characteristics allow a degree of comparability with the standard disturbance sensitivity functions of a linear PID-controlled system in the frequency domain. A simple and straightforward design procedure for the robust linear PID controller, targeting the second-order system plants under investigation, is also provided for benchmarking. Additional (parasitic) actuator dynamics, which can lead to self-induced steady oscillations, i.e. chattering, are likewise taken into account. A detailed experimental case study, carried out on an electro-mechanical actuator in a laboratory setting, highlights the pros and cons of both PID and CHOSM controllers and makes them directly comparable for broadband disturbance rejection.
comment: 16 pages, 11 figures
Nonlinear integral extension of PID control with improved convergence of perturbed second-order dynamic systems
A nonlinear extension of the integral part of a standard proportional-integral-derivative (PID) feedback control is proposed for perturbed second-order systems. For matched constant perturbations, global asymptotic stability is shown, while for Lipschitz perturbations an ultimately bounded output error is guaranteed. It is shown that the proposed control is also applicable to second-order systems extended by additional (parasitic) actuator dynamics with low-pass characteristics, thus representing a frequently encountered application case. The proposed nonlinear control is proven to outperform its linear PID counterpart during the settling phase, i.e. at convergence of the residual output error. An experimental case study of a second-order system with additional actuator dynamics and considerable perturbations is presented to confirm and benchmark the control performance.
comment: 12 pages, 9 figures
Battlefield Transfers in Coalitional Blotto Games
In competitive resource allocation environments, agents often choose to form alliances; however, for some agents, doing so may not always be beneficial. Is there a method of forming alliances that always rewards each of its members? We study this question using the framework of the coalitional Blotto game, in which two players compete against a common adversary by allocating their budgeted resources across disjoint sets of valued battlefields. On any given battlefield, the agent that allocates a greater amount of resources wins the corresponding battlefield value. Existing work has shown the surprising result that in certain game instances, if one player donates a portion of their budget to the other player, then both players win larger amounts in their separate competitions against the adversary. However, this transfer-based method of alliance formation is not always mutually beneficial, which motivates the search for alternate strategies. In this vein, we study a new method of alliance formation referred to as a joint transfer, whereby players publicly transfer battlefields and budgets between one another before they engage in their separate competitions against the adversary. We show that in almost all game instances, there exists a mutually beneficial joint transfer that strictly increases the payoff of each player.
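The battlefield rule stated in the abstract above (the agent allocating more resources wins the battlefield's value) can be illustrated with a minimal sketch. The function name `blotto_payoff` and the tie-breaking choice (ties award nothing to the player) are assumptions made for illustration; the sketch evaluates a single pair of fixed allocations and does not model randomized strategies or the joint transfers studied in the paper.

```python
from typing import Sequence


def blotto_payoff(
    player: Sequence[float],
    adversary: Sequence[float],
    values: Sequence[float],
) -> float:
    """Total battlefield value won by `player` for one pair of allocations.

    Assumption: a battlefield is won only with a strictly greater allocation,
    so ties award nothing to the player.
    """
    if not (len(player) == len(adversary) == len(values)):
        raise ValueError("all inputs must cover the same battlefields")
    return sum(v for p, a, v in zip(player, adversary, values) if p > a)


# Three battlefields worth 5, 4 and 2; the player wins only the first one.
payoff = blotto_payoff([3.0, 1.0, 0.0], [1.0, 2.0, 0.0], [5.0, 4.0, 2.0])
```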
Learning Optimal Stable Matches in Decentralized Markets with Unknown Preferences
Matching algorithms have demonstrated great success in several practical applications, but they often require centralized coordination and plentiful information. In many modern online marketplaces, agents must independently seek out and match with one another using little to no information. For these kinds of settings, can we design decentralized, limited-information matching algorithms that preserve the desirable properties of standard centralized techniques? In this work, we constructively answer this question in the affirmative. We model a two-sided matching market as a game consisting of two disjoint sets of agents, referred to as proposers and acceptors, each of whom seeks to match with their most preferable partner on the opposite side of the market. However, each proposer has no knowledge of their own preferences, so they must learn their preferences while forming matches in the market. We present a simple online learning rule that guarantees a strong notion of probabilistic convergence to the welfare-maximizing equilibrium of the game, referred to as the proposer-optimal stable match. To the best of our knowledge, this represents the first completely decoupled, communication-free algorithm that guarantees probabilistic convergence to an optimal stable match, irrespective of the structure of the matching market.
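For orientation, the proposer-optimal stable match named in the abstract above is the outcome that the standard centralized deferred-acceptance (Gale-Shapley) procedure computes when preferences are fully known. The sketch below shows that centralized baseline only; it is not the paper's decentralized learning rule. The function name and the toy market are hypothetical, and every acceptor is assumed to rank every proposer.

```python
from typing import Dict, List


def proposer_optimal_match(
    proposer_prefs: Dict[str, List[str]],
    acceptor_prefs: Dict[str, List[str]],
) -> Dict[str, str]:
    """Centralized deferred acceptance with known preferences.

    Returns a mapping proposer -> acceptor. Assumes every acceptor ranks
    every proposer that may propose to it.
    """
    # rank[a][p] is the position of proposer p in acceptor a's list (lower is better).
    rank = {a: {p: i for i, p in enumerate(prefs)} for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next acceptor to try
    held: Dict[str, str] = {}  # acceptor -> proposer currently held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = held.get(a)
        if current is None:
            held[a] = p
        elif rank[a][p] < rank[a][current]:
            held[a] = p  # a trades up and releases its previous proposer
            free.append(current)
        else:
            free.append(p)  # a rejects p, who will try the next acceptor
    return {p: a for a, p in held.items()}


# Hypothetical two-by-two market: both proposers prefer a1, and a1 prefers p2.
match = proposer_optimal_match(
    {"p1": ["a1", "a2"], "p2": ["a1", "a2"]},
    {"a1": ["p2", "p1"], "a2": ["p1", "p2"]},
)
```

In the setting of the abstract, proposers do not know their own preferences, so a procedure like this cannot be run directly; it only describes the target that the learning rule is said to converge to.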
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair
Surface cracks in infrastructure can lead to significant deterioration and costly maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise and thus difficult to scale to large areas. While advancements in robotic perception and manipulation have progressed autonomous crack repair, existing methods still face three key challenges: (i) accurate localization of cracks within the robot's coordinate frame, (ii) adaptability to varying crack depths and widths, and (iii) validation of the repair process under realistic conditions. This paper presents an adaptive, autonomous system for surface crack detection and repair using robotics with advanced sensing technologies to enhance precision and safety for humans. The system uses an RGB-D camera for crack detection, a laser scanner for precise measurement, and an extruder and pump for material deposition. To address one of the key challenges, the laser scanner is used to enhance the crack coordinates for accurate localization. Furthermore, our approach demonstrates that an adaptive crack-filling method is more efficient and effective than a fixed-speed approach, with experimental results confirming both precision and consistency. In addition, to ensure real-world applicability and testing repeatability, we introduce a novel validation procedure using 3D-printed crack specimens that accurately simulate real-world conditions. This research contributes to the evolving field of human-robot interaction in construction by demonstrating how adaptive robotic systems can reduce the need for manual labor, improve safety, and enhance the efficiency of maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 22 pages, 14 figures, submitted to Advanced Engineering Informatics
Wireless Resource Optimization in Hybrid Semantic/Bit Communication Networks
Recently, semantic communication (SemCom) has shown great potential in significant resource savings and efficient information exchanges, thus naturally introducing a novel and practical cellular network paradigm where two modes of SemCom and conventional bit communication (BitCom) coexist. Nevertheless, the involved wireless resource management becomes rather complicated and challenging, given the unique background knowledge matching and time-consuming semantic coding requirements in SemCom. To this end, this paper jointly investigates user association (UA), mode selection (MS), and bandwidth allocation (BA) problems in a hybrid semantic/bit communication network (HSB-Net). Concretely, we first identify a unified performance metric of message throughput for both SemCom and BitCom links. Next, we specially develop a knowledge matching-aware two-stage tandem packet queuing model and theoretically derive the average packet loss ratio and queuing latency. Combined with practical constraints, we then formulate a joint optimization problem for UA, MS, and BA to maximize the overall message throughput of HSB-Net. Afterward, we propose an optimal resource management strategy by utilizing a Lagrange primal-dual transformation method and a preference list-based heuristic algorithm with polynomial-time complexity. Numerical results not only demonstrate the accuracy of our analytical queuing model, but also validate the performance superiority of our proposed strategy compared with different benchmarks.
comment: This paper has been accepted for publication by the IEEE Transactions on Communications. Copyright may be transferred without notice, after which this version may no longer be accessible
Active risk aversion in SIS epidemics on networks
We present and analyze an actively controlled Susceptible-Infected-Susceptible (actSIS) model of interconnected populations to study how risk aversion strategies, such as social distancing, affect network epidemics. A population using a risk aversion strategy reduces its contact rate with other populations when it perceives an increase in infection risk. The network actSIS model relies on two distinct networks. One is a physical contact network that defines which populations come into contact with which other populations and thus how infection spreads. The other is a communication network, such as an online social network, that defines which populations observe the infection level of which other populations and thus how information spreads. We prove that the model, with these two networks and populations using risk aversion strategies, exhibits a transcritical bifurcation in which an endemic equilibrium emerges. For regular graphs, we prove that the endemic infection level is uniform across populations and reduced by the risk aversion strategy, relative to the network SIS endemic level. We show that when communication is sufficiently sparse, this initially stable equilibrium loses stability in a secondary bifurcation. Simulations show that a new stable solution emerges with nonuniform infection levels.
Systems and Control (EESS)
Circulating Currents in Windings: Fundamental Property
Circulating currents in windings refer to unwanted electrical currents flowing between the parallel conductors of a winding. These currents arise due to several phenomena such as asymmetries, imperfections in the winding layout, and differences in electric potential between the parallel conductors. This effect is typically visible in windings of transformers, motors, or generators. Under on-load conditions, this is equivalent to having a current unevenly distributed between parallel conductors. Circulating currents have two main drawbacks: increased losses in windings and potential degradation of insulation over time. The former is an intuitive property that is widely acknowledged in the literature. This paper presents a formal proof of this fundamental property, building upon the authors' previous work and embedding it within a rigorous mathematical framework. The mathematical definition of circulating currents is provided, along with a case application in an electric machine.
Zeroth-Order Feedback Optimization in Multi-Agent Systems: Tackling Coupled Constraints
This paper investigates distributed zeroth-order feedback optimization in multi-agent systems with coupled constraints, where each agent operates its local action vector and observes only zeroth-order information to minimize a global cost function subject to constraints in which the local actions are coupled. Specifically, we employ two-point zeroth-order gradient estimation with delayed information to construct stochastic gradients, and leverage the constraint extrapolation technique and the averaging consensus framework to effectively handle the coupled constraints. We also provide convergence rate and oracle complexity results for our algorithm, characterizing its computational efficiency and scalability by rigorous theoretical analysis. Numerical experiments are conducted to validate the algorithm's effectiveness.
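As a rough illustration of the two-point zeroth-order gradient estimation mentioned in the abstract above, the sketch below builds a stochastic gradient from two function evaluations along a random Gaussian direction. This is one common form of the estimator and is an assumption here; the delayed information, constraint extrapolation and consensus steps of the paper are not modeled, and the name `two_point_gradient` and the smoothing radius `delta` are hypothetical.

```python
import random
from typing import Callable, List, Optional


def two_point_gradient(
    f: Callable[[List[float]], float],
    x: List[float],
    delta: float = 1e-3,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """One stochastic gradient sample built from two evaluations of f.

    Uses g = (f(x + delta*u) - f(x - delta*u)) / (2*delta) * u with u ~ N(0, I).
    """
    rng = rng or random.Random()
    u = [rng.gauss(0.0, 1.0) for _ in x]
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u]


def sum_of_squares(x: List[float]) -> float:
    return sum(xi * xi for xi in x)


# Averaging many samples approximates the true gradient 2*x of the test function.
rng = random.Random(0)
point = [1.0, -2.0]
n_samples = 20000
avg = [0.0, 0.0]
for _ in range(n_samples):
    g = two_point_gradient(sum_of_squares, point, rng=rng)
    avg = [a + gi / n_samples for a, gi in zip(avg, g)]
```

A single sample is noisy; only the average over many samples is close to the true gradient, which is why algorithms of this kind are analyzed in terms of oracle complexity.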
A Communication Consistent Approach to Signal Temporal Logic Task Decomposition in Multi-Agent Systems
We consider the problem of decomposing a global task assigned to a multi-agent system, expressed as a formula within a fragment of Signal Temporal Logic (STL), under range-limited communication. Given a global task expressed as a conjunction of local tasks defined over the individual and relative states of agents in the system, we propose representing task dependencies among agents as edges of a suitably defined task graph. At the same time, range-limited communication naturally induces the definition of a communication graph that defines which agents have access to each other's states. Within these settings, inconsistencies arise when a task dependency between a pair of agents is not supported by a corresponding communication link due to the limited communication range. As a result, state feedback control laws previously derived to achieve the tasks' satisfaction cannot be leveraged. We propose a task decomposition mechanism to distribute tasks assigned to pairs of non-communicating agents in the system as conjunctions of tasks defined over the relative states of communicating agents, thus enforcing consistency between task and communication graphs. Assuming the super-level sets of the predicate functions composing the STL tasks are bounded polytopes, our task decomposition mechanism can be cast as a parameter optimization problem and solved via state-of-the-art decentralized convex optimization algorithms. To guarantee the soundness of our approach, we present various conditions under which the tasks defined in the applied STL fragment are unsatisfiable, and we show sufficient conditions such that our decomposition approach yields satisfiable global tasks after decomposition.
Optimal Network Expansion Planning Considering Uncertain Dynamic Thermal Line Rating
This paper examines the integrated generation and transmission expansion planning problem to address the growing challenges associated with increasing power network loads. The proposed approach optimizes the operation and investment costs for new generation units and transmission lines, while also considering the environmental benefits of integrating renewable energy sources (RES) and the impact of electric vehicle (EV) charging on the grid. The inherent uncertainties in demand, EV charging loads, and RES generation are managed using a hybrid stochastic-robust optimization approach. Additionally, the model integrates Dynamic Thermal Line Rating (DTLR) to improve the efficiency and resilience of transmission lines. The framework also tackles the uncertainty related to DTLR, incorporating a heuristic linearization technique to reduce model complexity. The effectiveness of the proposed model and techniques is evaluated through simulations conducted on two case studies: the modified IEEE 6-bus system and the IEEE 24-bus Reliability Test System.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Mean Field-based Dynamic Backoff Optimization for MIMO-enabled Grant-Free NOMA in Massive IoT Networks
In the 6G Internet of Things (IoT) paradigm, unprecedented challenges will be raised to provide massive connectivity, ultra-low latency, and energy efficiency for ultra-dense IoT devices. To address these challenges, we explore non-orthogonal multiple access (NOMA) based grant-free random access (GFRA) schemes in the cellular uplink to support massive IoT devices with high spectrum efficiency and low access latency. In particular, we focus on optimizing the backoff strategy of each device when transmitting time-sensitive data samples to a multiple-input multiple-output (MIMO)-enabled base station subject to energy constraints. To cope with the dynamically varying channel and the severe uplink interference due to uncoordinated grant-free access, we formulate the optimization problem as a multi-user non-cooperative dynamic stochastic game (MUN-DSG). To avoid the curse of dimensionality as the number of devices grows large, the optimization problem is transformed into a mean field game (MFG), and its Nash equilibrium can be achieved by solving the corresponding Hamilton-Jacobi-Bellman (HJB) and Fokker-Planck-Kolmogorov (FPK) equations. Thus, a Mean Field-based Dynamic Backoff (MFDB) scheme is proposed as the optimal GFRA solution for each device. Extensive simulations have been conducted to compare the proposed MFDB with contemporary random access approaches like access class barring (ACB), slotted-Additive Links On-line Hawaii Area (ALOHA), and minimum backoff (MB) under both static and dynamic channels, and the results show that MFDB achieves the lowest access delay and cumulative cost over multiple transmission frames. Keywords: 6G; Internet of Things; grant-free random access; NOMA; dynamic backoff
comment: 31 pages, 13 figures
Modeling, Prediction and Risk Management of Distribution System Voltages with Non-Gaussian Probability Distributions
High renewable energy penetration into power distribution systems causes a substantial risk of exceeding voltage security limits, which needs to be accurately assessed and properly managed. However, the existing methods usually rely on the joint probability models of power generation and loads provided by probabilistic prediction to quantify the voltage risks, where inaccurate prediction results could lead to over- or underestimated risks. This paper proposes an uncertain voltage component (UVC) prediction method for assessing and managing voltage risks. First, we define the UVC to evaluate voltage variations caused by the uncertainties associated with power generation and loads. Second, we propose a Gaussian mixture model-based probabilistic UVC prediction method to depict the non-Gaussian distribution of voltage variations. Then, we derive the voltage risk indices, including value-at-risk (VaR) and conditional value-at-risk (CVaR), based on the probabilistic UVC prediction model. Third, we investigate the mechanism of UVC-based voltage risk management and establish the voltage risk management problems, which are reformulated into linear programming or mixed-integer linear programming for convenient solutions. The proposed method is tested on power distribution systems with actual photovoltaic power and load data and compared with those considering probabilistic prediction of nodal power injections. Numerical results show that the proposed method is computationally efficient in assessing voltage risks and outperforms existing methods in managing voltage risks. The deviation of voltage risks obtained by the proposed method is only 15% of that by the methods based on probabilistic prediction of nodal power injections.
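The two risk indices named in the abstract above, value-at-risk (VaR) and conditional value-at-risk (CVaR), can be illustrated with a plain sample-based estimate. This is a generic sketch under one common empirical convention (the tail average includes the VaR sample); the paper derives these indices from a Gaussian mixture model instead, and the function name `var_cvar` and the sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def var_cvar(samples: Sequence[float], alpha: float = 0.95) -> Tuple[float, float]:
    """Empirical VaR and CVaR of a sample at confidence level alpha.

    VaR is the empirical alpha-quantile; CVaR is the mean of all samples
    at or above that quantile (one common convention).
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(samples)
    k = math.ceil(alpha * len(ordered)) - 1  # index of the empirical quantile
    tail = ordered[k:]
    return ordered[k], sum(tail) / len(tail)


# Hypothetical voltage deviations (arbitrary units), evaluated at the 75% level.
var_75, cvar_75 = var_cvar([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], alpha=0.75)
```

CVaR is never smaller than VaR at the same level, since it averages the tail beyond the quantile; this is what makes it sensitive to the heavy tails of non-Gaussian distributions.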
A Control Theoretic Study on Omnidirectional MAVs with Minimum Number of Actuators and No Internal Forces at Any Orientation
We propose a new class of multirotor aerial vehicle designs composed of a multi-body structure in which a main body is connected by passive joints to links equipped with propellers. We have investigated some instances of this class, some of which are shown to achieve omnidirectionality while having a minimum number of inputs equal to the main body's degrees of freedom (DoFs), only uni-directional positive thrust propellers, and no internal forces generated at steady state. After the dynamics are derived following the Euler-Lagrange approach, an I/O dynamic feedback linearization strategy is used to show the controllability of any desired pose with stable zero dynamics. We finally verify the developed controller with closed-loop simulations.
Design and Analysis of a Metamaterial-Inspired Absorber for data rate in 52% RF-to-DC conversion Efficiency Dual-band SWIPT system
This paper proposes a novel metamaterial-inspired absorber designed to enhance the data rate of a simultaneous wireless information and power transfer (SWIPT) system with 52% RF-to-DC conversion efficiency operating through biological tissue. The proposed absorber includes split-ring resonators (SRRs) and demonstrates significant permeability characteristics, with both the real and imaginary parts being negative and close to -1. It also improves isolation by around 5 dB at a WPT distance of 9 mm. A 5 mm thick phantom is used for biological tissue in this study. Experimental results show that, in a SWIPT system including a rectifier with 52% RF-to-DC conversion efficiency at a WPT distance of 9 mm, embedding this absorber between the power and signal ports at the Tx side results in a 5 dB improvement in isolation performance. The proposed absorber enables a 7 MB/s improvement in data rate and allows signals to be transmitted with 5 dBm less power than in the SWIPT system without the absorber.
AoI-Aware Resource Allocation for Smart Multi-QoS Provisioning
The Age of Information (AoI) has recently gained recognition as a critical quality-of-service (QoS) metric for quantifying the freshness of status updates, playing a crucial role in supporting massive ultra-reliable and low-latency communications (mURLLC) services. In mURLLC scenarios, due to the inherent system dynamics and varying environmental conditions, optimizing AoI under such multi-QoS constraints considering both delay and reliability often results in non-convex and computationally intractable problems. Motivated by the demonstrated efficacy of deep reinforcement learning (DRL) in addressing large-scale networking challenges, this work aims to apply DRL techniques to derive optimal resource allocation solutions in real time. Despite its potential, the effective integration of finite blocklength coding (FBC) in DRL-based AoI optimization remains underexplored, especially in addressing the challenge of simultaneously upper-bounding both delay and error-rate. To address these challenges, we propose a DRL-based framework for AoI-aware optimal resource allocation in mURLLC-driven multi-QoS schemes, leveraging AoI as a core metric within the finite blocklength regime. First, we design a wireless communication architecture and AoI-based modeling framework that incorporates FBC. Second, we derive upper-bounded peak AoI and delay violation probabilities using stochastic network calculus (SNC). Subsequently, we formulate an optimization problem aimed at minimizing the peak AoI violation probability through FBC. Third, we develop DRL algorithms to determine optimal resource allocation policies that meet statistical delay and error-rate requirements for mURLLC. Finally, to validate the effectiveness of the developed schemes, we conduct a series of simulations.
GAN Based Top-Down View Synthesis in Reinforcement Learning Environments
Human actions are based on the mental perception of the environment. Even when all the aspects of an environment are not visible, humans have an internal mental model that can generalize the partially visible scenes to fully constructed and connected views. This internal mental model uses learned abstract representations of spatial and temporal aspects of the environments encountered in the past. Artificial agents in reinforcement learning environments also benefit by learning a representation of the environment from experience. It provides the agent with viewpoints that are not directly visible to it, helping it make better policy decisions. It can also be used to predict the future state of the environment. This project explores learning the top-down view of an RL environment based on the artificial agent's first-person view observations with a generative adversarial network (GAN). The top-down view is useful as it provides a complete overview of the environment by building a map of the entire environment. It provides information about the objects' dimensions and shapes along with their positions relative to one another. Initially, when only a partial observation of the environment is visible to the agent, only a partial top-down view is generated. As the agent explores the environment through a set of actions, the generated top-down view becomes complete. This generated top-down view can assist the agent in making better policy decisions. The focus of the project is to learn the top-down view of an RL environment. It doesn't deal with any reinforcement learning task.
A Hierarchical DRL Approach for Resource Optimization in Multi-RIS Multi-Operator Networks
As reconfigurable intelligent surfaces (RIS) emerge as a pivotal technology in the upcoming sixth-generation (6G) networks, their deployment within practical multiple operator (OP) networks presents significant challenges, including the coordination of RIS configurations among OPs, interference management, and privacy maintenance. A promising strategy is to treat RIS as a public resource managed by an RIS provider (RP), which can enhance resource allocation efficiency by allowing dynamic access for multiple OPs. However, the intricate nature of coordinating management and optimizing RIS configurations significantly complicates the implementation process. In this paper, we propose a hierarchical deep reinforcement learning (HDRL) approach that decomposes the complicated RIS resource optimization problem into several subtasks. Specifically, a top-level RP-agent is responsible for RIS allocation, while low-level OP-agents control their assigned RISs and handle beamforming, RIS phase-shifts, and user association. By utilizing the semi-Markov decision process (SMDP) theory, we establish a sophisticated interaction mechanism between the RP and OPs, and introduce an advanced hierarchical proximal policy optimization (HPPO) algorithm. Furthermore, we propose an improved sequential-HPPO (S-HPPO) algorithm to address the curse of dimensionality encountered with a single RP-agent. Experimental results validate the stability of the HPPO algorithm across various environmental parameters, demonstrating its superiority over other benchmarks for joint resource optimization. Finally, we conduct a detailed comparative analysis between the proposed S-HPPO and HPPO algorithms, showcasing that the S-HPPO algorithm achieves faster convergence and improved performance in large-scale RIS allocation scenarios.
AI-Aided Kalman Filters
The Kalman filter (KF) and its variants are among the most celebrated algorithms in signal processing. These methods are used for state estimation of dynamic systems by relying on mathematical representations in the form of simple state-space (SS) models, which may be crude and inaccurate descriptions of the underlying dynamics. Emerging data-centric artificial intelligence (AI) techniques tackle these tasks using deep neural networks (DNNs), which are model-agnostic. Recent developments illustrate the possibility of fusing DNNs with classic Kalman-type filtering, obtaining systems that learn to track in partially known dynamics. This article provides a tutorial-style overview of design approaches for incorporating AI in aiding KF-type algorithms. We review both generic and dedicated DNN architectures suitable for state estimation, and provide a systematic presentation of techniques for fusing AI tools with KFs and for leveraging partial SS modeling and data, categorizing design approaches into task-oriented and SS model-oriented. The usefulness of each approach in preserving the individual strengths of model-based KFs and data-driven DNNs is investigated in a qualitative and quantitative study, whose code is publicly available, illustrating the gains of hybrid model-based/data-driven designs. We also discuss existing challenges and future research directions that arise from fusing AI and Kalman-type algorithms.
comment: Submitted to IEEE Signal Processing Magazine
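For reference, the model-based baseline discussed in the AI-Aided Kalman Filters abstract above can be sketched as a single predict/update cycle of the classic scalar Kalman filter. This shows only the textbook linear filter for a state-space model x_k = a*x_{k-1} + w, z_k = h*x_k + v; it does not reproduce any of the DNN-aided designs from the article, and the function and parameter names are illustrative.

```python
from typing import Tuple


def kalman_step(
    x: float,
    p: float,
    z: float,
    a: float = 1.0,
    h: float = 1.0,
    q: float = 0.0,
    r: float = 1.0,
) -> Tuple[float, float]:
    """One predict/update cycle of a scalar Kalman filter.

    x, p: previous state estimate and its variance
    z: new measurement
    a, h: state transition and observation coefficients
    q, r: process and measurement noise variances
    """
    # Predict: propagate the estimate and its uncertainty through the model.
    x_pred = a * x
    p_pred = a * p * a + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


# Equal prior and measurement variances: the estimate lands halfway between them.
x_est, p_est = kalman_step(x=0.0, p=1.0, z=2.0)
```

The gain depends entirely on the assumed noise variances q and r, which is where a crude state-space model hurts the classic filter.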
Using Intermittent Chaotic Clocks to Secure Cryptographic Chips
This letter proposes using intermittent chaotic clocks, generated from chaotic maps, to drive cryptographic chips running the Advanced Encryption Standard as a countermeasure against Correlation Power Analysis attacks. Five different chaotic maps -- namely: the Logistic map, the Bernoulli shift map, the Henon map, the Tent map, and the Ikeda map -- are used in this work to generate chaotic clocks. The performance of these chaotic clocks is evaluated in terms of timing overhead and the resilience of the driven chip against Correlation Power Analysis attacks. All proposed chaotic clocking schemes successfully protect the driven chip against attacks, with the clocks produced by the optimized Ikeda, Henon, and Logistic maps achieving the lowest timing overhead. These optimized maps, due to their intermittent chaotic behavior, exhibit lower timing overhead compared to previous work. Notably, the chaotic clock generated by the optimized Ikeda map approaches the theoretical limit of timing overhead, i.e., half the execution time of a reference periodic clock.
Vehicle Localization in GPS-Denied Scenarios Using Arc-Length-Based Map Matching
Automated driving systems face challenges in GPS-denied situations. To address this issue, kinematic dead reckoning is implemented using measurements from the steering angle, steering rate, yaw rate, and wheel speed sensors onboard the vehicle. However, dead reckoning methods suffer from drift. This paper provides an arc-length-based map matching method that uses a digital 2D map of the scenario in order to correct drift in the dead reckoning estimate. The kinematic model's prediction is used to introduce a temporal notion to the spatial information available in the map data. Results show reliable improvement in drift for all GPS-denied scenarios tested in this study. This innovative approach ensures that automated vehicles can maintain continuous and reliable navigation, significantly enhancing their safety and operational reliability in environments where GPS signals are compromised or unavailable.
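The drift that the map matching in the abstract above is meant to correct comes from integrating sensor readings over time. A minimal planar dead-reckoning step, assuming a simple unicycle-style kinematic model driven by wheel speed and yaw rate, might look as follows. The paper's kinematic model also uses steering angle and steering rate, which are omitted here; the function name and the numeric values are hypothetical.

```python
import math
from typing import Tuple


def dead_reckon_step(
    x: float,
    y: float,
    heading: float,
    speed: float,
    yaw_rate: float,
    dt: float,
) -> Tuple[float, float, float]:
    """Advance a planar pose by one time step using forward Euler integration.

    heading is in radians, speed in m/s, yaw_rate in rad/s, dt in seconds.
    """
    x_new = x + speed * math.cos(heading) * dt
    y_new = y + speed * math.sin(heading) * dt
    heading_new = heading + yaw_rate * dt
    return x_new, y_new, heading_new


# Start at the origin facing along +x, driving at 2 m/s while turning slowly.
pose = dead_reckon_step(0.0, 0.0, 0.0, speed=2.0, yaw_rate=0.1, dt=0.5)
```

Any bias in `speed` or `yaw_rate` is accumulated at every step, so the position error grows without bound unless an external reference such as a map corrects it.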
Towards Large Scale Atomic Manufacturing: Heterodyne Grating Interferometer with Zero Dead-Zone
This paper presents a novel heterodyne grating interferometer designed to meet the precise measurement requirements of next-generation lithography systems and large-scale atomic-level manufacturing. Utilizing a dual-frequency light source, the interferometer enables simultaneous measurement of three degrees of freedom. Key advancements include a compact zero Dead-Zone optical path configuration, significantly enhancing measurement reliability by mitigating the impact of light source fluctuations and air refractive index variations. A comprehensive crosstalk error analysis was conducted, resulting in a robust correction algorithm that reduces errors to below 5%. Performance testing of the prototype, measuring 90 mm × 90 mm × 40 mm, demonstrated exceptional resolution (0.25 nm in the XY-axis and 0.3 nm in the Z-axis), superior linearity (6.9e-5, 8.1e-5 and 16.2e-5 for the X, Y, and Z axes, respectively), high repeatability (0.8 nm/1000 nm for the three axes) and stability (20 nm for the XY-axis and 60 nm for the Z-axis over 1000 seconds). Comparative analysis with existing measurement sensors highlights the proposed method's significant advantages in integration and multidimensional capabilities, and the method is expected to be widely used in fields such as integrated circuits, atomic-level manufacturing and aerospace technology.
comment: 8 pages, 11 figures
RTI-NMPC for Control of Autonomous Vehicles Using Implicit Discretization Methods
Recent efforts in the development of autonomous driving technology have induced great advancements in perception, planning and control systems. Model predictive control is one of the most popular advanced control methods, but its application to nonlinear systems still depends on the development of computationally efficient methods. This work presents a nonlinear model predictive control formulation based on real-time iteration using an implicit discretization of the system's dynamics, with the objective of achieving greater prediction accuracy and lower computational cost when dealing with stiff dynamical systems, as is the case for vehicle dynamics. The proposed method is described and later evaluated on a simulation scenario considering modeling errors and external disturbances. The presented results demonstrate the effectiveness of the method when it comes to tracking a given trajectory and its low computational burden, measured in terms of execution time.
comment: This work was submitted, accepted and presented at the 2024 Simpósio Brasileiro de Automação Inteligente - SBAI
Augmented Intelligence in Smart Intersections: Local Digital Twins-Assisted Hybrid Autonomous Driving
Vehicle-road collaboration is a promising approach for enhancing the safety and efficiency of autonomous driving by extending the intelligence of onboard systems to smart roadside infrastructures. The introduction of digital twins (DTs), particularly local DTs (LDTs) at the edge, in smart mobility presents a new embodiment of augmented intelligence, which could enhance information exchange and extract human driving expertise to improve onboard intelligence. This paper presents a novel LDT-assisted hybrid autonomous driving system for improving safety and efficiency in traffic intersections. By leveraging roadside units (RSUs) equipped with sensory and computing capabilities, the proposed system continuously monitors traffic, extracts human driving knowledge, and generates intersection-specific local driving agents through an offline reinforcement learning (RL) framework. When connected and automated vehicles (CAVs) pass through RSU-equipped intersections, RSUs can provide local agents to support safe and efficient driving in local areas. Meanwhile, they provide real-time cooperative perception (CP) to broaden onboard sensory horizons. The proposed LDT-assisted hybrid system is implemented with state-of-the-art products, e.g., CAVs and RSUs, and technologies, e.g., millimeter-wave (mmWave) communications. Hardware-in-the-loop (HiL) simulations and proof-of-concept (PoC) tests validate system performance from two standpoints: (i) the peak latencies for CP and local agent downloading are 8.51 ms and 146 ms, respectively, aligning with 3GPP requirements for vehicle-to-everything (V2X) and model transfer use cases, and (ii) local driving agents can improve safety measures by 10% and reduce travel time by 15% compared with conventional onboard systems. The implemented prototype also demonstrates reliable real-time performance, fulfilling the targets of the proposed system design.
comment: 14 pages, 9 figures
When to Trust Your Data: Enhancing Dyna-Style Model-Based Reinforcement Learning With Data Filter
Reinforcement learning (RL) algorithms can be divided into two classes: model-free algorithms, which are sample-inefficient, and model-based algorithms, which suffer from model bias. Dyna-style algorithms combine these two approaches by using simulated data from an estimated environmental model to accelerate model-free training. However, their efficiency is compromised when the estimated model is inaccurate. Previous works address this issue by using model ensembles or pretraining the estimated model with data collected from the real environment, increasing computational and sample complexity. To tackle this issue, we introduce an out-of-distribution (OOD) data filter that removes simulated data from the estimated model that significantly diverges from data collected in the real environment. We show theoretically that this technique enhances the quality of simulated data. With the help of the OOD data filter, the data simulated from the estimated model better mimics the data collected by interacting with the real model. This improvement is evident in the critic updates compared to using the simulated data without the OOD data filter. Our experiment integrates the data filter into the model-based policy optimization (MBPO) algorithm. The results demonstrate that our method requires fewer interactions with the real environment to achieve a higher level of optimality than MBPO, even without a model ensemble.
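The filtering idea can be illustrated with a minimal sketch. The abstract does not state which divergence measure the filter uses, so the nearest-neighbor Euclidean rule, the function name `ood_filter`, and the `threshold` parameter below are all hypothetical choices made for illustration only.

```python
import math

def ood_filter(real, simulated, threshold):
    """Keep simulated samples that lie close to at least one real sample.

    real, simulated: lists of equal-length feature tuples (for example,
    concatenated state-action features). The nearest-neighbor Euclidean
    criterion is an illustrative assumption, not the paper's rule.
    """
    kept = []
    for sample in simulated:
        # Distance from this simulated sample to its closest real sample.
        nearest = min(math.dist(sample, r) for r in real)
        if nearest <= threshold:
            kept.append(sample)
    return kept

# Toy data: the second simulated sample is far from all real data.
real = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
simulated = [(0.1, 0.0), (5.0, 5.0), (0.9, 0.1)]
kept = ood_filter(real, simulated, threshold=0.5)
```

In a Dyna-style loop, only the samples in `kept` would be added to the buffer used for model-free updates; how the threshold is chosen is left open here.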
Design Space Exploration of Embedded SoC Architectures for Real-Time Optimal Control
Empowering resource-limited robots to execute computationally intensive tasks like model/learning-based algorithms is challenging. Due to the complexity of the workload characteristic, the bottlenecks in different systems can depend on application requirements, preventing a single hardware architecture from being adequate across all robotics applications. This project provides a comprehensive design space exploration to determine optimal hardware computation platforms and architectures suitable for robotic algorithms. We profile and optimize representative architectural designs across general-purpose cores and specialized accelerators. Specifically, we compare CPUs, vector machines, and domain-specialized accelerators with kernel-level benchmarks and end-to-end representative robotic workloads. Our exploration provides a quantitative performance, area, and utilization comparison and analyzes the trade-offs between these representative distinct architectural designs. We demonstrate that the variation of hardware architecture choices depends on workload characteristics and application requirements. Finally, we explore how architectural modifications and software ecosystem optimization can alleviate bottlenecks and enhance utilization.
BOXR: Body and head motion Optimization framework for eXtended Reality
The emergence of standalone XR systems has enhanced user mobility, accommodating both subtle, frequent head motions and substantial, less frequent body motions. However, the pervasively used M2D latency metric, which measures the delay between the most recent motion and its corresponding display update, only accounts for head motions. This oversight can leave users prone to motion sickness if significant body motion is involved. Although existing methods optimize M2D latency through asynchronous task scheduling and reprojection methods, they introduce challenges like resource contention between tasks and outdated pose data. These challenges are further complicated by user motion dynamics and scene changes during runtime. To address these issues, we introduce, for the first time, the C2D latency metric, which captures the delay caused by body motions, and present BOXR, a framework designed to co-optimize both body and head motion delays within an XR system. BOXR enhances the coordination between M2D and C2D latencies by efficiently scheduling tasks to avoid contentions while maintaining an up-to-date pose in the output frame. Moreover, BOXR incorporates a motion-driven visual inertial odometer to adjust to user motion dynamics and employs scene-dependent foveated rendering to manage changes in the scene effectively. Our evaluations show that BOXR significantly outperforms state-of-the-art solutions on 11 EuRoC MAV datasets across 4 XR applications and 3 hardware platforms. In controlled motion and scene settings, BOXR reduces M2D and C2D latencies by up to 63% and 27%, respectively, and increases frame rate by up to 43%. In practical deployments, BOXR achieves reductions of up to 42% in M2D latency and 31% in C2D latency in real-world scenarios while maintaining remarkably low miss rates of only 1.6% for M2D requirements and 1.0% for C2D requirements.
comment: Accepted to 45th IEEE Real-Time Systems Symposium (RTSS'24)
GyroCopter: Differential Bearing Measuring Trajectory Planner for Tracking and Localizing Radio Frequency Sources
Autonomous aerial vehicles can provide efficient and effective solutions for radio frequency (RF) source tracking and localizing problems with applications ranging from wildlife conservation to search and rescue operations. Existing lightweight, low-cost, bearing-measurement-based methods with single antenna-receiver sensor system configurations necessitate in situ rotations, leading to substantial measurement acquisition times that restrict the searchable area and the number of measurements. We propose a GyroCopter for the task. Our approach plans the trajectory of a multi-rotor unmanned aerial vehicle (UAV) whilst utilizing UAV flight dynamics to execute a constant gyration motion to derive "pseudo-bearing" measurements to track RF sources. The gyration-based pseudo-bearing approach: i) significantly reduces the limitations associated with in situ rotation bearing; while ii) capitalizing on the simplicity, affordability, and lightweight nature of signal strength measurement acquisition hardware to estimate bearings. This method distinguishes itself from other pseudo-bearing approaches by eliminating the need for additional hardware, maintaining simplicity, low weight, and cost-effectiveness. To validate our approach, we derived the optimal rotation speed and conducted extensive simulations and field missions with our GyroCopter to track and localize multiple RF sources. The results confirm the effectiveness of our method, highlighting its potential as a practical and rapid solution for RF source localization tasks.
comment: For a demonstration video, see https://youtu.be/OkmmQjD74Us
Cyber C2: Achieving Scrutability and Agency in Cyberspace Operations
Our thesis is that operating in cyberspace is challenging because cyberspace exhibits extreme variety, high malleability, and extreme velocity. These properties make cyberspace largely inscrutable and limit one's agency in cyberspace, where agency is the ability to exert influence to transform the state or behaviour of the environment. With this thesis, we explore the nature of cyberspace, command and control (C2), and diagnose the challenges for cyber C2, with treatment to follow in future work.
comment: 16 pages. Published in proceedings of the 29th International Command and Control Research Symposium (ICCRTS), London UK, 2024
Two-Timescale Linear Stochastic Approximation: Constant Stepsizes Go a Long Way
Previous studies on two-timescale stochastic approximation (SA) mainly focused on bounding mean-squared errors under diminishing stepsize schemes. In this work, we investigate {\it constant} stepsize schemes through the lens of Markov processes, proving that the iterates of both timescales converge to a unique joint stationary distribution in Wasserstein metric. We derive explicit geometric and non-asymptotic convergence rates, as well as the variance and bias introduced by constant stepsizes in the presence of Markovian noise. Specifically, with two constant stepsizes $\alpha < \beta$, we show that the biases scale linearly with both stepsizes as $\Theta(\alpha)+\Theta(\beta)$ up to higher-order terms, while the variance of the slower iterate (resp., faster iterate) scales only with its own stepsize as $O(\alpha)$ (resp., $O(\beta)$). Unlike previous work, our results require no additional assumptions such as $\beta^2 \ll \alpha$ nor extra dependence on dimensions. These fine-grained characterizations allow tail-averaging and extrapolation techniques to reduce variance and bias, improving the mean-squared error bound to $O(\beta^4 + \frac{1}{t})$ for both iterates.
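As a rough illustration of the setting, the sketch below runs a scalar two-timescale linear SA recursion with constant stepsizes and tail-averages the iterates. The coefficients, the function name `two_timescale_sa`, and the use of i.i.d. Gaussian noise are assumptions made for this toy example; the paper analyzes Markovian noise, which this sketch does not reproduce.

```python
import random

def two_timescale_sa(alpha, beta, steps, seed=0):
    """Scalar two-timescale linear SA with constant stepsizes alpha < beta.

    Toy system (hypothetical coefficients):
        x + 0.5*y = 1   (slow iterate x, stepsize alpha)
        0.5*x + y = 1   (fast iterate y, stepsize beta)
    whose solution is x = y = 2/3. Returns the tail averages of x and y
    over the second half of the run.
    """
    rng = random.Random(seed)
    x = y = 0.0
    x_sum = y_sum = 0.0
    tail_start = steps // 2
    for t in range(steps):
        noise_x = rng.gauss(0.0, 0.1)  # i.i.d. noise: a simplifying assumption
        noise_y = rng.gauss(0.0, 0.1)
        # Both updates use the previous values of x and y.
        x, y = (x + alpha * (1.0 - x - 0.5 * y + noise_x),
                y + beta * (1.0 - 0.5 * x - y + noise_y))
        if t >= tail_start:
            x_sum += x
            y_sum += y
    n = steps - tail_start
    return x_sum / n, y_sum / n

x_bar, y_bar = two_timescale_sa(alpha=0.01, beta=0.05, steps=20000)
```

With constant stepsizes the raw iterates keep fluctuating around the solution; averaging over the tail of the trajectory smooths out that fluctuation, which is the variance-reduction role the abstract assigns to tail-averaging.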
Low-Power Encoding for PAM-3 DRAM Bus
The 3-level pulse amplitude modulation (PAM-3) signaling is expected to be widely used in memory interfaces for its greater voltage margins compared to PAM-4. To maximize the benefit of PAM-3, we propose three low-power data encoding algorithms: PAM3-DBI, PAM3-MF, and PAM3-SORT. With the DRAM memory traces from the gem5 computer architecture simulator running benchmarks, we evaluate the energy efficiency of our three PAM-3 encoding techniques. The experimental results show the proposed algorithms can reduce termination power for high-speed memory links significantly by 41% to 90% for benchmark programs.
comment: To appear in Proceedings of the 20th International Conference on Synthesis, Modeling, Analysis and Simulation Methods, and Applications to Circuit Design (SMACD 2024)
Kapitza-Inspired Stabilization of Non-Foster Circuits via Time Modulations
With his formal analysis in 1951, the physicist Pyotr Kapitza demonstrated that an inverted pendulum with an externally vibrating base can be stable in its upper position, thus overcoming the force of gravity. Kapitza's work shows that an originally unstable system can become stable after a minor perturbation of its properties or initial conditions is applied. Inspired by his ideas, we show how non-Foster circuits can be stabilized with the application of external \textit{electrical vibration}, i.e., time modulations. Non-Foster circuits are highly appreciated in the engineering community since their bandwidth characteristics are not limited by passive-circuit bounds. Unfortunately, non-Foster circuits are usually unstable and they must be stabilized prior to operation. Here, we focus on the study of non-Foster $L(t)C$ circuits with time-varying inductors and time-invariant negative capacitors. We find an intrinsic connection between Kapitza's inverted pendulum and non-Foster $L(t)C$ resonators. Moreover, we show how positive time-varying modulations of $L(t)>0$ can overcome and stabilize non-Foster negative capacitances $C<0$. These findings open up an alternative manner of stabilizing electric circuits with the use of time modulations, and lay the groundwork for the application of what we coin \textit{Vibrational Electromagnetics} in more complex media.
comment: 10 pages (7 pages main text, 3 pages supplementary materials), 4 figures
Attitude Estimation via Matrix Fisher Distributions on SO(3) Using Non-Unit Vector Measurements
This note presents a novel Bayesian attitude estimator with the matrix Fisher distribution on the special orthogonal group, which can smoothly accommodate both unit and non-unit vector measurements. The posterior attitude distribution is proven to be a matrix Fisher distribution with the assumption that non-unit vector measurement errors follow the isotropic Gaussian distributions and unit vector measurements follow the von-Mises Fisher distributions. Next, a global unscented transformation is proposed to approximate the full likelihood distribution with a matrix Fisher distribution for more generic cases of vector measurement errors following the non-isotropic Gaussian distributions. Following these, a Bayesian attitude estimator with the matrix Fisher distribution is constructed. Numerical examples are then presented. The proposed estimator exhibits advantageous performance compared with the previous attitude estimator with matrix Fisher distributions and the classic multiplicative extended Kalman filter in the case of non-unit vector measurements.
comment: 10 pages, 4 figures
Robust co-design framework for buildings operated by predictive control
Cost-effective decarbonisation of the built environment is a stepping stone to achieving net-zero carbon emissions since buildings are responsible for more than a quarter of global energy-related CO$_2$ emissions. Improving energy utilization and decreasing costs naturally requires considering multiple domain-specific performance criteria. The resulting problem is often computationally infeasible. The paper proposes an approach based on decomposition and selection of significant operating conditions to achieve a formulation with reduced computational complexity. We present a robust framework to optimise the physical design, the controller, and the operation of residential buildings in an integrated fashion, considering external weather conditions and time-varying electricity prices. The framework explicitly includes operational constraints and increases the utilization of the energy generated by intermittent resources. A case study illustrates the potential of co-design in enhancing the reliability, flexibility and self-sufficiency of a system operating under different conditions. Specifically, numerical results demonstrate cost reductions of up to $30$% compared to a deterministic formulation. Furthermore, the proposed approach reduces computation time by a factor of at least $10$ compared to the original problem, with a performance deterioration of only 0.6%.
CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications
Autonomous robot operation in unstructured environments is often underpinned by spatial understanding through vision. Systems composed of multiple concurrently operating robots additionally require access to frequent, accurate and reliable pose estimates. In this work, we propose CoViS-Net, a decentralized visual spatial foundation model that learns spatial priors from data, enabling pose estimation as well as spatial comprehension. Our model is fully decentralized, platform-agnostic, executable in real-time using onboard compute, and does not require existing networking infrastructure. CoViS-Net provides relative pose estimates and a local bird's-eye-view (BEV) representation, even without camera overlap between robots (in contrast to classical methods). We demonstrate its use in a multi-robot formation control task across various real-world settings. We provide code, models and supplementary material online. https://proroklab.github.io/CoViS-Net/
Sensitivity analysis and experimental evaluation of PID-like continuous sliding mode control
Continuous higher order sliding mode (CHOSM) controllers represent an efficient tool for disturbance rejection. For systems with relative degree r, CHOSM approaches provide theoretically exact compensation of the matched Lipschitz perturbation, ensuring finite-time convergence to the (r+1)-th sliding-mode set, by using only information on the sliding output and its derivatives up to the order (r-1). In this paper, we investigate the disturbance rejection properties of a PID-like CHOSM controller, as the simplest and most intuitive example, which incorporates nonlinear actions on the output error, its derivative, and the integral of its sign. We use the harmonic balance approach and develop an analysis of the propagation of the matched Lipschitz perturbation through the control loop in the frequency domain. The resulting solution appears in the form of Bode-like loci which also depend on the amplitude of the harmonic disturbances. Such amplitude-frequency characteristics allow a certain comparability with standard disturbance sensitivity functions of a linear PID-controlled system in the frequency domain. A simple and straightforward design procedure for the robust linear PID controller targeting the second-order system plants under investigation is also provided for benchmarking. Additional (parasitic) actuator dynamics, which can lead to self-induced steady oscillations, i.e. chattering, are likewise taken into account. A detailed experimental case study, carried out on an electro-mechanical actuator in a laboratory setting, highlights the pros and cons of both PID and CHOSM controllers and makes them well comparable for broadband disturbance rejection.
comment: 16 pages, 11 figures
Nonlinear integral extension of PID control with improved convergence of perturbed second-order dynamic systems
Nonlinear extension of the integral part of a standard proportional-integral-derivative (PID) feedback control is proposed for the perturbed second-order systems. For the matched constant perturbations, the global asymptotic stability is shown, while for Lipschitz perturbations an ultimately bounded output error is guaranteed. It is shown that the proposed control is also applicable to second-order systems extended by additional (parasitic) actuator dynamics with low-pass characteristics, thus representing a frequently encountered application case. The proposed nonlinear control is proven to outperform its linear PID counterpart during the settling phase, i.e. at convergence of the residual output error. An experimental case study of the second-order system with an additional actuator dynamics and considerable perturbations is demonstrated to confirm and benchmark the control performance.
comment: 12 pages, 9 figures
Battlefield Transfers in Coalitional Blotto Games
In competitive resource allocation environments, agents often choose to form alliances; however, for some agents, doing so may not always be beneficial. Is there a method of forming alliances that always rewards each of their members? We study this question using the framework of the coalitional Blotto game, in which two players compete against a common adversary by allocating their budgeted resources across disjoint sets of valued battlefields. On any given battlefield, the agent that allocates a greater amount of resources wins the corresponding battlefield value. Existing work has shown the surprising result that in certain game instances, if one player donates a portion of their budget to the other player, then both players win larger amounts in their separate competitions against the adversary. However, this transfer-based method of alliance formation is not always mutually beneficial, which motivates the search for alternate strategies. In this vein, we study a new method of alliance formation referred to as a joint transfer, whereby players publicly transfer battlefields and budgets between one another before they engage in their separate competitions against the adversary. We show that in almost all game instances, there exists a mutually beneficial joint transfer that strictly increases the payoff of each player.
Learning Optimal Stable Matches in Decentralized Markets with Unknown Preferences
Matching algorithms have demonstrated great success in several practical applications, but they often require centralized coordination and plentiful information. In many modern online marketplaces, agents must independently seek out and match with one another using little to no information. For these kinds of settings, can we design decentralized, limited-information matching algorithms that preserve the desirable properties of standard centralized techniques? In this work, we constructively answer this question in the affirmative. We model a two-sided matching market as a game consisting of two disjoint sets of agents, referred to as proposers and acceptors, each of whom seeks to match with their most preferable partner on the opposite side of the market. However, each proposer has no knowledge of their own preferences, so they must learn their preferences while forming matches in the market. We present a simple online learning rule that guarantees a strong notion of probabilistic convergence to the welfare-maximizing equilibrium of the game, referred to as the proposer-optimal stable match. To the best of our knowledge, this represents the first completely decoupled, communication-free algorithm that guarantees probabilistic convergence to an optimal stable match, irrespective of the structure of the matching market.
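For reference, the proposer-optimal stable match that the learning rule targets is the outcome of the classical centralized deferred acceptance (Gale-Shapley) procedure when preferences are fully known. The sketch below shows that centralized baseline only; it is not the paper's decentralized learning rule, and the function name and toy preference lists are hypothetical. It assumes every acceptor ranks every proposer.

```python
def deferred_acceptance(proposer_prefs, acceptor_prefs):
    """Centralized proposer-proposing deferred acceptance.

    proposer_prefs[p]: list of acceptors, most preferred first.
    acceptor_prefs[a]: list of proposers, most preferred first.
    Returns a dict mapping each matched proposer to an acceptor.
    """
    # rank[a][p] is the position of proposer p in acceptor a's list.
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    match = {}  # acceptor -> currently held proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = match.get(a)
        if current is None:
            match[a] = p
        elif rank[a][p] < rank[a][current]:
            match[a] = p          # a trades up and releases its old partner
            free.append(current)
        else:
            free.append(p)        # a rejects p, who proposes again later
    return {p: a for a, p in match.items()}

# Toy market: both proposers prefer a1, and a1 prefers p2.
proposer_prefs = {"p1": ["a1", "a2"], "p2": ["a1", "a2"]}
acceptor_prefs = {"a1": ["p2", "p1"], "a2": ["p1", "p2"]}
matching = deferred_acceptance(proposer_prefs, acceptor_prefs)
```

In the setting of the abstract, proposers do not know `proposer_prefs` in advance and cannot rely on a central coordinator, which is what makes reaching the same outcome nontrivial.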
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair
Surface cracks in infrastructure can lead to significant deterioration and costly maintenance if not efficiently repaired. Manual repair methods are labor-intensive, time-consuming, and imprecise and thus difficult to scale to large areas. While advancements in robotic perception and manipulation have progressed autonomous crack repair, existing methods still face three key challenges: (i) accurate localization of cracks within the robot's coordinate frame, (ii) adaptability to varying crack depths and widths, and (iii) validation of the repair process under realistic conditions. This paper presents an adaptive, autonomous system for surface crack detection and repair using robotics with advanced sensing technologies to enhance precision and safety for humans. The system uses an RGB-D camera for crack detection, a laser scanner for precise measurement, and an extruder and pump for material deposition. To address the localization challenge, the laser scanner is used to refine the crack coordinates. Furthermore, our approach demonstrates that an adaptive crack-filling method is more efficient and effective than a fixed-speed approach, with experimental results confirming both precision and consistency. In addition, to ensure real-world applicability and testing repeatability, we introduce a novel validation procedure using 3D-printed crack specimens that accurately simulate real-world conditions. This research contributes to the evolving field of human-robot interaction in construction by demonstrating how adaptive robotic systems can reduce the need for manual labor, improve safety, and enhance the efficiency of maintenance operations, ultimately paving the way for more sophisticated and integrated construction robotics.
comment: 22 pages, 14 figures, submitted to Advanced Engineering Informatics
Wireless Resource Optimization in Hybrid Semantic/Bit Communication Networks
Recently, semantic communication (SemCom) has shown great potential in significant resource savings and efficient information exchanges, thus naturally introducing a novel and practical cellular network paradigm where two modes of SemCom and conventional bit communication (BitCom) coexist. Nevertheless, the involved wireless resource management becomes rather complicated and challenging, given the unique background knowledge matching and time-consuming semantic coding requirements in SemCom. To this end, this paper jointly investigates user association (UA), mode selection (MS), and bandwidth allocation (BA) problems in a hybrid semantic/bit communication network (HSB-Net). Concretely, we first identify a unified performance metric of message throughput for both SemCom and BitCom links. Next, we specially develop a knowledge matching-aware two-stage tandem packet queuing model and theoretically derive the average packet loss ratio and queuing latency. Combined with practical constraints, we then formulate a joint optimization problem for UA, MS, and BA to maximize the overall message throughput of HSB-Net. Afterward, we propose an optimal resource management strategy by utilizing a Lagrange primal-dual transformation method and a preference list-based heuristic algorithm with polynomial-time complexity. Numerical results not only demonstrate the accuracy of our analytical queuing model, but also validate the performance superiority of our proposed strategy compared with different benchmarks.
comment: This paper has been accepted for publication by the IEEE Transactions on Communications. Copyright may be transferred without notice, after which this version may no longer be accessible
Active risk aversion in SIS epidemics on networks
We present and analyze an actively controlled Susceptible-Infected-Susceptible (actSIS) model of interconnected populations to study how risk aversion strategies, such as social distancing, affect network epidemics. A population using a risk aversion strategy reduces its contact rate with other populations when it perceives an increase in infection risk. The network actSIS model relies on two distinct networks. One is a physical contact network that defines which populations come into contact with which other populations and thus how infection spreads. The other is a communication network, such as an online social network, that defines which populations observe the infection level of which other populations and thus how information spreads. We prove that the model, with these two networks and populations using risk aversion strategies, exhibits a transcritical bifurcation in which an endemic equilibrium emerges. For regular graphs, we prove that the endemic infection level is uniform across populations and reduced by the risk aversion strategy, relative to the network SIS endemic level. We show that when communication is sufficiently sparse, this initially stable equilibrium loses stability in a secondary bifurcation. Simulations show that a new stable solution emerges with nonuniform infection levels.
Robotics
Contrastive Touch-to-Touch Pretraining
Today's tactile sensors have a variety of different designs, making it challenging to develop general-purpose methods for processing touch signals. In this paper, we learn a unified representation that captures the shared information between different tactile sensors. Unlike current approaches that focus on reconstruction or task-specific supervision, we leverage contrastive learning to integrate tactile signals from two different sensors into a shared embedding space, using a dataset in which the same objects are probed with multiple sensors. We apply this approach to paired touch signals from GelSlim and Soft Bubble sensors. We show that our learned features provide strong pretraining for downstream pose estimation and classification tasks. We also show that our embedding enables models trained using one touch sensor to be deployed using another without additional training. Project details can be found at https://www.mmintlab.com/research/cttp/.
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
In reinforcement learning, off-policy actor-critic approaches like DDPG and TD3 are based on the deterministic policy gradient. Herein, the Q-function is trained from off-policy environment data and the actor (policy) is trained to maximize the Q-function via gradient ascent. We observe that in complex tasks like dexterous manipulation and restricted locomotion, the Q-value is a complex function of action, having several local optima or discontinuities. This poses a challenge for gradient ascent to traverse and makes the actor prone to get stuck at local optima. To address this, we introduce a new actor architecture that combines two simple insights: (i) use multiple actors and evaluate the Q-value maximizing action, and (ii) learn surrogates to the Q-function that are simpler to optimize with gradient-based methods. We evaluate tasks such as restricted locomotion, dexterous manipulation, and large discrete-action space recommender systems and show that our actor finds optimal actions more frequently and outperforms alternate actor architectures.
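Insight (i) can be sketched in a few lines: several actors each propose an action, and the critic picks the proposal with the highest Q-value. The toy Q-function, the constant "actors", and the name `select_action` are hypothetical stand-ins chosen to show why a single gradient-trained actor can get stuck; the surrogate learning in insight (ii) is not covered.

```python
import math

def q_value(state, action):
    """Toy Q-function with a local optimum near -1 and a global one near +1.

    The state argument is unused; it is kept only to mirror Q(s, a).
    """
    return (math.exp(-(action + 1.0) ** 2)
            + 2.0 * math.exp(-(action - 1.0) ** 2))

def select_action(q_fn, state, actors):
    """Evaluate every actor's proposal and return the Q-maximizing action."""
    candidates = [actor(state) for actor in actors]
    return max(candidates, key=lambda a: q_fn(state, a))

# Two stand-in actors, as if each had converged to a different optimum.
actors = [lambda s: -1.0, lambda s: 1.0]
best = select_action(q_value, None, actors)
```

An actor initialized near the left peak would stay there under gradient ascent; evaluating several proposals lets the policy escape that local optimum at the cost of extra Q-function evaluations.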
Learning Smooth Humanoid Locomotion through Lipschitz-Constrained Policies
Reinforcement learning combined with sim-to-real transfer offers a general framework for developing locomotion controllers for legged robots. To facilitate successful deployment in the real world, smoothing techniques, such as low-pass filters and smoothness rewards, are often employed to develop policies with smooth behaviors. However, because these techniques are non-differentiable and usually require tedious tuning of a large set of hyperparameters, they tend to require extensive manual tuning for each robotic platform. To address this challenge and establish a general technique for enforcing smooth behaviors, we propose a simple and effective method that imposes a Lipschitz constraint on a learned policy, which we refer to as Lipschitz-Constrained Policies (LCP). We show that the Lipschitz constraint can be implemented in the form of a gradient penalty, which provides a differentiable objective that can be easily incorporated with automatic differentiation frameworks. We demonstrate that LCP effectively replaces the need for smoothing rewards or low-pass filters and can be easily integrated into training frameworks for many distinct humanoid robots. We extensively evaluate LCP in both simulation and real-world humanoid robots, producing smooth and robust locomotion controllers. All simulation and deployment code, along with complete checkpoints, is available on our project page: https://lipschitz-constrained-policy.github.io.
comment: 8 pages
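A gradient penalty of the kind described in the LCP abstract can be sketched for a tiny policy whose input gradient is available in closed form. The policy form `a = tanh(W s)`, the function names, and the use of the squared Frobenius norm of the Jacobian are assumptions made for illustration; the abstract does not give the exact penalty, and a real implementation would rely on automatic differentiation rather than a hand-written Jacobian.

```python
import math

def policy_jacobian(W, s):
    """Jacobian of a = tanh(W s) with respect to the state s.

    W is a list of rows (action_dim x state_dim); s is a list of floats.
    Row i of the Jacobian is (1 - tanh(w_i . s)^2) * w_i.
    """
    rows = []
    for w_row in W:
        pre = sum(w * x for w, x in zip(w_row, s))
        slope = 1.0 - math.tanh(pre) ** 2
        rows.append([slope * w for w in w_row])
    return rows

def gradient_penalty(W, states):
    """Mean squared Frobenius norm of the policy Jacobian over a batch."""
    total = 0.0
    for s in states:
        jac = policy_jacobian(W, s)
        total += sum(v * v for row in jac for v in row)
    return total / len(states)

# Hypothetical usage: total_loss = task_loss + lam * gradient_penalty(W, batch)
W = [[1.0, 0.0], [0.0, 1.0]]
penalty = gradient_penalty(W, [[0.0, 0.0]])
```

Penalizing this quantity discourages large changes in the action for small changes in the state, which is the sense in which a gradient penalty acts as a soft Lipschitz constraint.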
Adaptive Ankle Torque Control for Bipedal Humanoid Walking on Surfaces with Unknown Horizontal and Vertical Motion
Achieving stable bipedal walking on surfaces with unknown motion remains a challenging control problem due to the hybrid, time-varying, partially unknown dynamics of the robot and the difficulty of accurate state and surface motion estimation. Surface motion imposes uncertainty on both system parameters and non-homogeneous disturbance in the walking robot dynamics. In this paper, we design an adaptive ankle torque controller to simultaneously address these two uncertainties and propose a step-length planner to minimize the required control torque. Typically, an adaptive controller is used for a continuous system. To apply adaptive control on a hybrid system such as a walking robot, an intermediate command profile is introduced to ensure a continuous error system. Simulations on a planar bipedal robot, along with comparisons against a baseline controller, demonstrate that the proposed approach effectively ensures stable walking and accurate tracking under unknown, time-varying disturbances.
OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation
We study the problem of teaching humanoid robots manipulation skills by imitating from single video demonstrations. We introduce OKAMI, a method that generates a manipulation plan from a single RGB-D video and derives a policy for execution. At the heart of our approach is object-aware retargeting, which enables the humanoid robot to mimic the human motions in an RGB-D video while adjusting to different object locations during deployment. OKAMI uses open-world vision models to identify task-relevant objects and retarget the body motions and hand poses separately. Our experiments show that OKAMI achieves strong generalizations across varying visual and spatial conditions, outperforming the state-of-the-art baseline on open-world imitation from observation. Furthermore, OKAMI rollout trajectories are leveraged to train closed-loop visuomotor policies, which achieve an average success rate of 79.2% without the need for labor-intensive teleoperation. More videos can be found on our website https://ut-austin-rpl.github.io/OKAMI/.
comment: Accepted for oral presentation at 8th Annual Conference on Robot Learning. Project website: https://ut-austin-rpl.github.io/OKAMI/
Latent BKI: Open-Dictionary Continuous Mapping in Visual-Language Latent Spaces with Quantifiable Uncertainty
This paper introduces a novel probabilistic mapping algorithm, Latent BKI, which enables open-vocabulary mapping with quantifiable uncertainty. Traditionally, semantic mapping algorithms focus on a fixed set of semantic categories which limits their applicability for complex robotic tasks. Vision-Language (VL) models have recently emerged as a technique to jointly model language and visual features in a latent space, enabling semantic recognition beyond a predefined, fixed set of semantic classes. Latent BKI recurrently incorporates neural embeddings from VL models into a voxel map with quantifiable uncertainty, leveraging the spatial correlations of nearby observations through Bayesian Kernel Inference (BKI). Latent BKI is evaluated against similar explicit semantic mapping and VL mapping frameworks on the popular MatterPort-3D and Semantic KITTI data sets, demonstrating that Latent BKI maintains the probabilistic benefits of continuous mapping with the additional benefit of open-dictionary queries. Real-world experiments demonstrate applicability to challenging indoor environments.
Octopus-Swimming-Like Robot with Soft Asymmetric Arms
Underwater vehicles have seen significant development over the past seventy years. However, bio-inspired propulsion robots are still in their early stages and require greater interdisciplinary collaboration between biologists and roboticists. The octopus, one of the most intelligent marine animals, exhibits remarkable abilities such as camouflaging, exploring, and hunting while swimming with its arms. Although bio-inspired robotics researchers have aimed to replicate these abilities, the complexity of designing an eight-arm bionic swimming platform has posed challenges from the beginning. In this work, we propose a novel bionic robot swimming platform that combines asymmetric passive morphing arms with an umbrella-like quick-return mechanism. Using only two simple constant-speed motors, this design achieves efficient swimming by replicating octopus-like arm movements and stroke time ratios. The robot reached a peak speed of 314 mm/s during its second power stroke. This design reduces the complexity of traditional octopus-like swimming robot actuation systems while maintaining good swimming performance. It offers a more achievable and efficient platform for biologists and roboticists conducting more profound octopus-inspired robotic and biological studies.
Latent Action Pretraining from Videos
We introduce Latent Action Pretraining for general Action models (LAPA), an unsupervised method for pretraining Vision-Language-Action (VLA) models without ground-truth robot action labels. Existing Vision-Language-Action models require action labels typically collected by human teleoperators during pretraining, which significantly limits possible data sources and scale. In this work, we propose a method to learn from internet-scale videos that do not have robot action labels. We first train an action quantization model leveraging a VQ-VAE-based objective to learn discrete latent actions between image frames, then pretrain a latent VLA model to predict these latent actions from observations and task descriptions, and finally finetune the VLA on small-scale robot manipulation data to map from latent to robot actions. Experimental results demonstrate that our method significantly outperforms existing techniques that train robot manipulation policies from large-scale videos. Furthermore, it outperforms the state-of-the-art VLA model trained with robotic action labels on real-world manipulation tasks that require language conditioning, generalization to unseen objects, and semantic generalization to unseen instructions. Training only on human manipulation videos also shows positive transfer, opening up the potential for leveraging web-scale data for robotics foundation models.
comment: Website: https://latentactionpretraining.github.io
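As a rough illustration of the quantization step in the first stage of LAPA, the sketch below maps a continuous feature vector (standing in for an encoding of the change between two frames) to the index of its nearest codebook entry, which is the core lookup in a VQ-VAE. The codebook values, the 2-D features, and the function names are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(feature: Sequence[float], codebook: List[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to `feature`.

    In a VQ-VAE this index plays the role of the discrete latent code;
    here it stands in for a "latent action" between two frames.
    """
    return min(range(len(codebook)),
               key=lambda i: squared_distance(feature, codebook[i]))


# Hypothetical 4-entry codebook over 2-D features (assumed for illustration).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A feature close to the second entry is assigned latent action 1.
latent_action = quantize([0.9, 0.2], codebook)
print(latent_action)
```

The discrete indices produced this way are what a latent VLA model could be trained to predict in place of real robot action labels.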
Generalizable Spacecraft Trajectory Generation via Multimodal Learning with Transformers
Effective trajectory generation is essential for reliable on-board spacecraft autonomy. Among other approaches, learning-based warm-starting represents an appealing paradigm for solving the trajectory generation problem, effectively combining the benefits of optimization- and data-driven methods. Current approaches for learning-based trajectory generation often focus on fixed, single-scenario environments, where key scene characteristics, such as obstacle positions or final-time requirements, remain constant across problem instances. However, practical trajectory generation requires the scenario to be frequently reconfigured, making the single-scenario approach a potentially impractical solution. To address this challenge, we present a novel trajectory generation framework that generalizes across diverse problem configurations, by leveraging high-capacity transformer neural networks capable of learning from multimodal data sources. Specifically, our approach integrates transformer-based neural network models into the trajectory optimization process, encoding both scene-level information (e.g., obstacle locations, initial and goal states) and trajectory-level constraints (e.g., time bounds, fuel consumption targets) via multimodal representations. The transformer network then generates near-optimal initial guesses for non-convex optimization problems, significantly enhancing convergence speed and performance. The framework is validated through extensive simulations and real-world experiments on a free-flyer platform, achieving up to 30% cost improvement and 80% reduction in infeasible cases with respect to traditional approaches, and demonstrating robust generalization across diverse scenario variations.
comment: 8 pages, 6 figures, submitted to 2025 American Control Conference (ACC)
Robotic Arm Platform for Multi-View Image Acquisition and 3D Reconstruction in Minimally Invasive Surgery
Minimally invasive surgery (MIS) offers significant benefits such as reduced recovery time and minimised patient trauma, but poses challenges in visibility and access, making accurate 3D reconstruction a significant tool in surgical planning and navigation. This work introduces a robotic arm platform for efficient multi-view image acquisition and precise 3D reconstruction in MIS settings. We adapted a laparoscope to a robotic arm and captured ex-vivo images of several ovine organs across varying lighting conditions (operating room and laparoscopic) and trajectories (spherical and laparoscopic). We employed recently released learning-based feature matchers combined with COLMAP to produce our reconstructions. The reconstructions were evaluated against high-precision laser scans for quantitative evaluation. Our results show that whilst reconstructions suffer most under realistic MIS lighting and trajectory, many versions of our pipeline achieve close to sub-millimetre accuracy with an average of 1.05 mm Root Mean Squared Error and 0.82 mm Chamfer distance. Our best reconstruction results occur with operating room lighting and spherical trajectories. Our robotic platform provides a tool for controlled, repeatable multi-view data acquisition for 3D generation in MIS environments which we hope leads to new datasets for training learning-based models.
comment: 8 pages, 5 figures, 3 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
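The two error measures quoted in the abstract above can be sketched as follows. This is a minimal brute-force version under stated assumptions: RMSE is computed over nearest-neighbour distances from the reconstruction to the reference scan, and the Chamfer distance uses one common convention (the average of the two directed mean nearest-neighbour distances). The paper may use a different convention, and the point clouds here are made up.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def nearest_distance(p: Point, cloud: List[Point]) -> float:
    """Euclidean distance from p to its nearest neighbour in cloud."""
    return min(math.dist(p, q) for q in cloud)


def rmse_to_reference(recon: List[Point], reference: List[Point]) -> float:
    """Root mean squared nearest-neighbour distance, reconstruction to reference."""
    squared = [nearest_distance(p, reference) ** 2 for p in recon]
    return math.sqrt(sum(squared) / len(squared))


def chamfer_distance(a: List[Point], b: List[Point]) -> float:
    """Symmetric Chamfer distance: mean of the two directed mean distances."""
    a_to_b = sum(nearest_distance(p, b) for p in a) / len(a)
    b_to_a = sum(nearest_distance(q, a) for q in b) / len(b)
    return 0.5 * (a_to_b + b_to_a)


# Toy data (assumed): one reconstructed point is 1 unit off the reference.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
recon = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]

rmse = rmse_to_reference(recon, reference)
chamfer = chamfer_distance(recon, reference)
print(rmse, chamfer)
```

For real scans with millions of points, a spatial index such as a k-d tree would replace the quadratic nearest-neighbour search.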
Safety Filtering While Training: Improving the Performance and Sample Efficiency of Reinforcement Learning Agents
Reinforcement learning (RL) controllers are flexible and performant but rarely guarantee safety. Safety filters impart hard safety guarantees to RL controllers while maintaining flexibility. However, safety filters can cause undesired behaviours due to the separation between the controller and the safety filter, often degrading performance and robustness. In this paper, we propose several modifications that incorporate the safety filter into the training of RL controllers rather than solely applying it during evaluation. The modifications allow the RL controller to learn to account for the safety filter, improving performance. Additionally, our modifications significantly improve sample efficiency and eliminate training-time constraint violations. We verified the proposed modifications in simulated and real experiments with a Crazyflie 2.0 drone. In experiments, we show that the proposed training approaches require significantly fewer environment interactions and improve performance by up to 20% compared to standard RL training.
comment: 8 pages, 9 figures. Code is publicly available at https://github.com/Federico-PizarroBejarano/safe-control-gym/tree/training_rl_paper
Robust Manipulation Primitive Learning via Domain Contraction
Contact-rich manipulation plays an important role in human daily activities, but uncertain parameters pose significant challenges for robots to achieve comparable performance through planning and control. To address this issue, domain adaptation and domain randomization have been proposed for robust policy learning. However, they either lose the generalization ability across diverse instances or perform conservatively due to neglecting instance-specific information. In this paper, we propose a bi-level approach to learn robust manipulation primitives, including parameter-augmented policy learning using multiple models, and parameter-conditioned policy retrieval through domain contraction. This approach unifies domain randomization and domain adaptation, providing optimal behaviors while keeping generalization ability. We validate the proposed method on three contact-rich manipulation primitives: hitting, pushing, and reorientation. The experimental results showcase the superior performance of our approach in generating robust policies for instances with diverse physical parameters.
comment: Conference on Robot Learning (CoRL), 2024
DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment
In recent years, imitation learning has made progress in the field of robotic manipulation. However, it still faces challenges when dealing with complex long-horizon deformable object tasks, such as high-dimensional state spaces, complex dynamics, and multimodal action distributions. Traditional imitation learning methods often require a large amount of data and encounter distributional shifts and accumulative errors in these tasks. To address these issues, we propose a data-efficient general learning framework (DeformPAM) based on preference learning and reward-guided action selection. DeformPAM decomposes long-horizon tasks into multiple action primitives, utilizes 3D point cloud inputs and diffusion models to model action distributions, and trains an implicit reward model using human preference data. During the inference phase, the reward model scores multiple candidate actions, selecting the optimal action for execution, thereby reducing the occurrence of anomalous actions and improving task completion quality. Experiments conducted on three challenging real-world long-horizon deformable object manipulation tasks demonstrate the effectiveness of this method. Results show that DeformPAM improves both task completion quality and efficiency compared to baseline methods even with limited data. Code and data will be available at https://deform-pam.robotflow.ai.
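The inference-time selection step described in the DeformPAM abstract above (score several candidate actions with a reward model, then execute the best one) can be sketched as below. The reward function here is a hand-written stand-in, and the candidate list replaces samples that would come from a diffusion policy; both are assumptions for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Action = Tuple[float, float]


def select_action(candidates: Sequence[Action],
                  reward_model: Callable[[Action], float]) -> Action:
    """Score every candidate with the reward model and return the best one."""
    scores: List[float] = [reward_model(a) for a in candidates]
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index]


def toy_reward(action: Action) -> float:
    """Hypothetical reward: prefers actions close to the point (0.5, 0.5)."""
    return -((action[0] - 0.5) ** 2 + (action[1] - 0.5) ** 2)


# Candidates that would normally be sampled from the learned action distribution.
candidates = [(0.0, 0.0), (0.4, 0.6), (1.0, 1.0)]
chosen = select_action(candidates, toy_reward)
print(chosen)
```

Sampling several candidates and filtering them by a learned score is one way to suppress occasional outlier actions from a multimodal policy.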
SDS -- See it, Do it, Sorted: Quadruped Skill Synthesis from Single Video Demonstration
In this paper, we present SDS ("See it. Do it. Sorted."), a novel pipeline for intuitive quadrupedal skill learning from a single demonstration video. Leveraging the visual capabilities of GPT-4o, SDS processes input videos through our novel chain-of-thought prompting technique (SUS) and generates executable reward functions (RFs) that drive the imitation of locomotion skills by learning a Proximal Policy Optimization (PPO)-based Reinforcement Learning (RL) policy, using environment information from the NVIDIA IsaacGym simulator. SDS autonomously evaluates the RFs by monitoring the individual reward components and supplying training footage and fitness metrics back into GPT-4o, which is then prompted to evolve the RFs to achieve higher task fitness at each iteration. We validate our method on the Unitree Go1 robot, demonstrating its ability to execute variable skills such as trotting, bounding, pacing and hopping, achieving high imitation fidelity and locomotion stability. SDS shows improvements over SOTA methods in task adaptability and reduced dependence on domain-specific knowledge, and it bypasses the need for labor-intensive reward engineering and large-scale training datasets. Additional information and the open-sourced code can be found at: https://rpl-cs-ucl.github.io/SDSweb
A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction
The development of autonomous driving has boosted the research on autonomous racing. However, existing local trajectory planning methods have difficulty planning trajectories with optimal velocity profiles at racetracks with sharp corners, thus weakening the performance of autonomous racing. To address this problem, we propose a local trajectory planning method that integrates Velocity Prediction based on Model Predictive Contour Control (VPMPCC). The optimal parameters of VPMPCC are learned through Bayesian Optimization (BO) based on a proposed novel Objective Function adapted to Racing (OFR). Specifically, VPMPCC achieves velocity prediction by encoding the racetrack as a reference velocity profile and incorporating it into the optimization problem. This method optimizes the velocity profile of local trajectories, especially at corners with significant curvature. The proposed OFR balances racing performance with vehicle safety, ensuring safe and efficient BO training. In the simulation, the number of training iterations for OFR-based BO is reduced by 42.86% compared to the state-of-the-art method. The optimal simulation-trained parameters are then applied to a real-world F1TENTH vehicle without retraining. During prolonged racing on a custom-built racetrack featuring significant sharp corners, the mean velocity of VPMPCC reaches 93.18% of the vehicle's handling limits. The released code is available at https://github.com/zhouhengli/VPMPCC.
PAVLM: Advancing Point Cloud based Affordance Understanding Via Vision-Language Model
Affordance understanding, the task of identifying actionable regions on 3D objects, plays a vital role in allowing robotic systems to engage with and operate within the physical world. Although Visual Language Models (VLMs) have excelled in high-level reasoning and long-horizon planning for robotic manipulation, they still fall short in grasping the nuanced physical properties required for effective human-robot interaction. In this paper, we introduce PAVLM (Point cloud Affordance Vision-Language Model), an innovative framework that utilizes the extensive multimodal knowledge embedded in pre-trained language models to enhance 3D affordance understanding of point clouds. PAVLM integrates a geometric-guided propagation module with hidden embeddings from large language models (LLMs) to enrich visual semantics. On the language side, we prompt Llama-3.1 models to generate refined context-aware text, augmenting the instructional input with deeper semantic cues. Experimental results on the 3D-AffordanceNet benchmark demonstrate that PAVLM outperforms baseline methods for both full and partial point clouds, particularly excelling in its generalization to novel open-world affordance tasks of 3D objects. For more information, visit our project site: pavlm-source.github.io.
LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images
Visual localization involves estimating a query image's 6-DoF (degrees of freedom) camera pose, which is a fundamental component in various computer vision and robotic tasks. This paper presents LoGS, a vision-based localization pipeline utilizing the 3D Gaussian Splatting (GS) technique as scene representation. This novel representation allows high-quality novel view synthesis. During the mapping phase, structure-from-motion (SfM) is applied first, followed by the generation of a GS map. During localization, the initial position is obtained through image retrieval and local feature matching coupled with a PnP solver, and then a high-precision pose is obtained in an analysis-by-synthesis manner on the GS map. Experimental results on four large-scale datasets demonstrate the proposed approach's SoTA accuracy in estimating camera poses and robustness under challenging few-shot conditions.
comment: 8 pages
NavTopo: Leveraging Topological Maps For Autonomous Navigation Of a Mobile Robot
Autonomous navigation of a mobile robot is a challenging task which requires the abilities of mapping, localization, path planning and path following. Conventional mapping methods build a dense metric map like an occupancy grid, which is affected by odometry error accumulation and consumes a lot of memory and computation in large environments. Another approach to mapping is the usage of topological properties, e.g. adjacency of locations in the environment. Topological maps are less prone to odometry error accumulation and high resource consumption, and also enable fast path planning because of the graph sparsity. Based on this idea, we propose NavTopo, a full navigation pipeline based on a topological map and two-level path planning. The pipeline localizes in the graph by matching neural network descriptors and 2D projections of the input point clouds, which significantly reduces memory consumption compared to metric and topological point cloud-based approaches. We test our approach in a large indoor photo-realistic simulated environment and compare it to a metric map-based approach based on the popular metric mapping method RTAB-MAP. The experimental results show that our topological approach significantly outperforms the metric one in terms of performance while keeping proper navigational efficiency.
comment: This paper is published in proceedings of the 9th International Conference "Interactive Collaborative Robotics" (ICR 2024)
M2Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes
Recent advances in diffusion models have opened new avenues for research into embodied AI agents and robotics. Despite significant achievements in complex robotic locomotion and skills, mobile manipulation, a capability that requires the coordination of navigation and manipulation, remains a challenge for generative AI techniques. This is primarily due to the high-dimensional action space, extended motion trajectories, and interactions with the surrounding environment. In this paper, we introduce M2Diffuser, a diffusion-based, scene-conditioned generative model that directly generates coordinated and efficient whole-body motion trajectories for mobile manipulation based on robot-centric 3D scans. M2Diffuser first learns trajectory-level distributions from mobile manipulation trajectories provided by an expert planner. Crucially, it incorporates an optimization module that can flexibly accommodate physical constraints and task objectives, modeled as cost and energy functions, during the inference process. This enables the reduction of physical violations and execution errors at each denoising step in a fully differentiable manner. Through benchmarking on three types of mobile manipulation tasks across over 20 scenes, we demonstrate that M2Diffuser outperforms state-of-the-art neural planners and successfully transfers the generated trajectories to a real-world robot. Our evaluations underscore the potential of generative AI to enhance the generalization of traditional planning and learning-based robotic methods, while also highlighting the critical role of enforcing physical constraints for safe and robust execution.
LLM2Swarm: Robot Swarms that Responsively Reason, Plan, and Collaborate through LLMs NeurIPS 2024
Robot swarms are composed of many simple robots that communicate and collaborate to fulfill complex tasks. Robot controllers usually need to be specified by experts on a case-by-case basis via programming code. This process is time-consuming, prone to errors, and unable to take into account all situations that may be encountered during deployment. On the other hand, recent Large Language Models (LLMs) have demonstrated reasoning and planning capabilities, introduced new ways to interact with and program machines, and represent domain and commonsense knowledge. Hence, we propose to address the aforementioned challenges by integrating LLMs with robot swarms and show the potential in proofs of concept (showcases). For this integration, we explore two approaches. The first approach is 'indirect integration,' where LLMs are used to synthesize and validate the robot controllers. This approach may reduce development time and human error before deployment. Moreover, during deployment, it could be used for on-the-fly creation of new robot behaviors. The second approach is 'direct integration,' where each robot locally executes a separate LLM instance during deployment for robot-robot collaboration and human-swarm interaction. These local LLM instances enable each robot to reason, plan, and collaborate using natural language. To enable further research on our mainly conceptual contribution, we release the software and videos for our LLM2Swarm system: https://github.com/Pold87/LLM2Swarm.
comment: Accepted at NeurIPS 2024 Workshop on Open-World Agents
Towards Local Minima-free Robotic Navigation: Model Predictive Path Integral Control via Repulsive Potential Augmentation
Model-based control is a crucial component of robotic navigation. However, it often struggles with entrapment in local minima due to its inherent nature as a finite, myopic optimization procedure. Previous studies have addressed this issue but sacrificed either solution quality due to their reactive nature or computational efficiency in generating explicit paths for proactive guidance. To this end, we propose a motion planning method that proactively avoids local minima without any guidance from global paths. The key idea is repulsive potential augmentation, integrating high-level directional information into the Model Predictive Path Integral control as a single repulsive term through an artificial potential field. We evaluate our method through theoretical analysis and simulations in environments with obstacles that induce local minima. Results show that our method guarantees the avoidance of local minima and outperforms existing methods in terms of global optimality without decreasing computational efficiency.
comment: 7 pages, 8 figures, Under review for IEEE/SICE International Symposium on System Integration, 2025
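To make the idea of adding a single repulsive term to an MPPI cost more concrete, the sketch below shows the generic MPPI importance-weighting step applied to trajectory costs that include an inverse-distance repulsive potential. The form of the potential, the placement of the repulsive source, and all function names are assumptions chosen for illustration; the paper's actual term and its guarantees are not reproduced here.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def repulsive_potential(position: Point, source: Point,
                        gain: float = 1.0, eps: float = 1e-6) -> float:
    """Assumed inverse-distance potential pushing the robot away from source."""
    return gain / (math.dist(position, source) + eps)


def trajectory_cost(trajectory: Sequence[Point], goal: Point,
                    source: Point, gain: float = 1.0) -> float:
    """Terminal distance to goal plus the summed repulsive potential."""
    terminal = math.dist(trajectory[-1], goal)
    repulsion = sum(repulsive_potential(p, source, gain) for p in trajectory)
    return terminal + repulsion


def mppi_weights(costs: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Standard MPPI weights: softmax of negative cost, shifted for stability."""
    best = min(costs)
    unnormalized = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two sampled rollouts: one passes next to the repulsive source, one detours.
goal, source = (2.0, 0.0), (1.0, 0.1)
rollouts = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 1.0)]]
costs = [trajectory_cost(r, goal, source) for r in rollouts]
weights = mppi_weights(costs)
print(costs, weights)
```

In a full controller, these weights would average the sampled control perturbations to update the nominal control sequence.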
A Framework for Adapting Human-Robot Interaction to Diverse User Groups
To facilitate natural and intuitive interactions with diverse user groups in real-world settings, social robots must be capable of addressing the varying requirements and expectations of these groups while adapting their behavior based on user feedback. While previous research often focuses on specific demographics, we present a novel framework for adaptive Human-Robot Interaction (HRI) that tailors interactions to different user groups and enables individual users to modulate interactions through both minor and major interruptions. Our primary contributions include the development of an adaptive, ROS-based HRI framework with an open-source code base. This framework supports natural interactions through advanced speech recognition and voice activity detection, and leverages a large language model (LLM) as a dialogue bridge. We validate the efficiency of our framework through module tests and system trials, demonstrating its high accuracy in age recognition and its robustness to repeated user inputs and plan changes.
comment: Accepted at the 16th International Conference on Social Robotics (ICSR) 2024
DODT: Enhanced Online Decision Transformer Learning through Dreamer's Actor-Critic Trajectory Forecasting
Advancements in reinforcement learning have led to the development of sophisticated models capable of learning complex decision-making tasks. However, efficiently integrating world models with decision transformers remains a challenge. In this paper, we introduce a novel approach that combines the Dreamer algorithm's ability to generate anticipatory trajectories with the adaptive learning strengths of the Online Decision Transformer. Our methodology enables parallel training where Dreamer-produced trajectories enhance the contextual decision-making of the transformer, creating a bidirectional enhancement loop. We empirically demonstrate the efficacy of our approach on a suite of challenging benchmarks, achieving notable improvements in sample efficiency and reward maximization over existing methods. Our results indicate that the proposed integrated framework not only accelerates learning but also showcases robustness in diverse and dynamic scenarios, marking a significant step forward in model-based reinforcement learning.
GSORB-SLAM: Gaussian Splatting SLAM benefits from ORB features and Transmittance information
The emergence of 3D Gaussian Splatting (3DGS) has recently sparked a renewed wave of dense visual SLAM research. However, current methods face challenges such as sensitivity to artifacts and noise, sub-optimal selection of training viewpoints, and a lack of lightweight global optimization. In this paper, we propose a dense SLAM system that tightly couples 3DGS with ORB features. We design a joint optimization approach for robust tracking that effectively reduces the impact of noise and artifacts. This involves combining novel geometric observations, derived from accumulated transmittance, with ORB features extracted from pixel data. Furthermore, to improve mapping quality, we propose an adaptive Gaussian expansion and regularization method that enables Gaussian primitives to represent the scene compactly. This is coupled with a viewpoint selection strategy based on the hybrid graph to mitigate over-fitting effects and enhance convergence quality. As a result, our approach achieves compact and high-quality scene representations and accurate localization. GSORB-SLAM has been evaluated on different datasets, demonstrating outstanding performance. The code will be available.
Visual Manipulation with Legs
Animals use limbs for both locomotion and manipulation. We aim to equip quadruped robots with similar versatility. This work introduces a system that enables quadruped robots to interact with objects using their legs, inspired by non-prehensile manipulation. The system has two main components: a visual manipulation policy module and a loco-manipulator module. The visual manipulation policy, trained with reinforcement learning (RL) using point cloud observations and object-centric actions, decides how the leg should interact with the object. The loco-manipulator controller manages leg movements and body pose adjustments, based on impedance control and Model Predictive Control (MPC). Besides manipulating objects with a single leg, the system can select from the left or right leg based on critic maps and move objects to distant goals through base adjustment. Experiments evaluate the system on object pose alignment tasks in both simulation and the real world, demonstrating more versatile object manipulation skills with legs than previous work.
Using Zone Inflation and Volume Transfer to Design a Fabric-based Pneumatic Exosuit with both Efficiency and Wearability
Fabric-based pneumatic exosuits have broad application prospects due to their good human-machine interaction performance, but their structural design paradigm has not yet been settled and requires in-depth research. This paper proposes the concepts of zone inflation and volume transfer for the design of a fabric-based pneumatic exosuit with both efficiency and wearability. Zone inflation divides the inflation area of the pneumatic exosuit into an inflation-deflation zone and an inflation-holding zone, which reduces the consumption of compressed air and improves efficiency. Volume transfer, a strategic distribution method of inflatable regions inside the garment, can effectively enhance the wearability of the exosuit. Using inexpensive thermoplastic polyurethane film and clothing fabric, the exosuit is made by heat pressing and sewing. The exosuit has a response time of 0.5 s, a stress area of 1500 mm², and a profile of only 32 mm, so it can be hidden inside common clothing. A mathematical model is developed to predict the output torque of the exosuit with an error of 3.6%. Mechanical experiments show that the exosuit outputs a torque of 9.1 N·m at a pressure of 100 kPa. Surface electromyography experiments show that the exosuit can assist users in moving from sitting to standing, with an average reduction in electromyography signals of 14.95%. The exosuit designed using these methods combines efficiency and wearability and is expected to be an ideal paradigm for fabric-based pneumatic exosuits.
DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation
We propose a novel offline reinforcement learning (offline RL) approach, introducing the Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation (DIAR) framework. We address two key challenges in offline RL: out-of-distribution samples and long-horizon problems. We leverage diffusion models to learn state-action sequence distributions and incorporate value functions for more balanced and adaptive decision-making. DIAR introduces an Adaptive Revaluation mechanism that dynamically adjusts decision lengths by comparing current and future state values, enabling flexible long-term decision-making. Furthermore, we address Q-value overestimation by combining Q-network learning with a value function guided by a diffusion model. The diffusion model generates diverse latent trajectories, enhancing policy robustness and generalization. As demonstrated in tasks like Maze2D, AntMaze, and Kitchen, DIAR consistently outperforms state-of-the-art algorithms in long-horizon, sparse-reward environments.
comment: Preprint, under review. Comments welcome
Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning NeurIPS 2024
A hallmark of intelligent agents is the ability to learn reusable skills purely from unsupervised interaction with the environment. However, existing unsupervised skill discovery methods often learn entangled skills where one skill variable simultaneously influences many entities in the environment, making downstream skill chaining extremely challenging. We propose Disentangled Unsupervised Skill Discovery (DUSDi), a method for learning disentangled skills that can be efficiently reused to solve downstream tasks. DUSDi decomposes skills into disentangled components, where each skill component only affects one factor of the state space. Importantly, these skill components can be concurrently composed to generate low-level actions, and efficiently chained to tackle downstream tasks through hierarchical Reinforcement Learning. DUSDi defines a novel mutual-information-based objective to enforce disentanglement between the influences of different skill components, and utilizes value factorization to optimize this objective efficiently. Evaluated in a set of challenging environments, DUSDi successfully learns disentangled skills, and significantly outperforms previous skill discovery methods when it comes to applying the learned skills to solve downstream tasks. Code and skills visualization at jiahenghu.github.io/DUSDi-site/.
comment: NeurIPS 2024
Biologically Inspired Swarm Dynamic Target Tracking and Obstacle Avoidance
This study proposes a novel artificial intelligence (AI) driven flight computer, integrating an online free-retraining-prediction model, swarm control, and an obstacle avoidance strategy, to track dynamic targets using a distributed drone swarm for military applications. To enable dynamic target tracking, the swarm requires a trajectory prediction capability to achieve intercept, allowing it to track rapid maneuvers and movements while maintaining efficient path planning. Traditional predictive methods such as curve fitting or Long Short-Term Memory (LSTM) have low robustness and struggle with dynamic target tracking in the short term due to the slow convergence of single-agent-based trajectory prediction, and they often require extensive offline training or tuning to be effective. Consequently, this paper introduces a novel robust adaptive bidirectional fuzzy brain emotional learning prediction (BFBEL-P) methodology to address these challenges. The controller integrates a fuzzy interface, a neural network enabling rapid adaptation, predictive capability, and multi-agent solving, which allows multiple solutions to be aggregated to achieve rapid convergence and high accuracy in both the short and long term. This was verified through numerical simulations in which complex trajectories were predicted and tracked by a swarm of drones. These simulations show improved adaptability and accuracy compared to state-of-the-art methods in the short term and strong results over long time domains, enabling accurate swarm target tracking and predictive capability.
comment: 18pages, 33 figures
Routing and Scheduling Optimization for Urban Air Mobility Fleet Management using Quantum Annealing
The growing integration of urban air mobility (UAM) for urban transportation and delivery has accelerated due to increasing traffic congestion and its environmental and economic repercussions. Efficiently managing the anticipated high-density air traffic in cities is critical to ensure safe and effective operations. In this study, we propose a routing and scheduling framework to address the needs of a large fleet of UAM vehicles operating in urban areas. Using mathematical optimization techniques, we plan efficient and deconflicted routes for a fleet of vehicles. Formulating route planning as a maximum weighted independent set problem enables us to utilize various algorithms and specialized optimization hardware, such as quantum annealers, which have seen substantial progress in recent years. Our method is validated using a traffic management simulator tailored for the airspace in Singapore. Our approach enhances airspace utilization by distributing traffic throughout a region. This study broadens the potential applications of optimization techniques in UAM traffic management.
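As an illustration of the maximum weighted independent set (MWIS) formulation mentioned in the abstract above, the following minimal sketch treats each candidate route as a graph node with a weight and each pair of incompatible routes as an edge. The route names, weights, and conflict pairs are hypothetical, and the exhaustive solver is only a stand-in for the algorithms and annealing hardware the paper refers to; it is not the authors' implementation.

```python
from itertools import combinations


def max_weight_independent_set(weights, conflicts):
    """Exhaustively search for a maximum weighted independent set.

    weights:   dict mapping node name -> non-negative weight
    conflicts: iterable of (u, v) pairs that may not both be selected
    Runtime is exponential in the number of nodes, so this is only
    suitable for tiny illustrative instances.
    """
    nodes = sorted(weights)
    conflict_set = {frozenset(edge) for edge in conflicts}
    best_set, best_weight = frozenset(), 0.0
    for mask in range(1 << len(nodes)):
        chosen = [n for i, n in enumerate(nodes) if (mask >> i) & 1]
        # Skip any selection that contains two conflicting nodes.
        if any(frozenset(pair) in conflict_set
               for pair in combinations(chosen, 2)):
            continue
        total = sum(weights[n] for n in chosen)
        if total > best_weight:
            best_set, best_weight = frozenset(chosen), total
    return best_set, best_weight


# Hypothetical candidate routes: "A1" is route option 1 for vehicle A, etc.
weights = {"A1": 5.0, "A2": 4.0, "B1": 6.0, "B2": 3.0}
conflicts = [
    ("A1", "A2"),  # a vehicle can fly only one of its candidate routes
    ("B1", "B2"),
    ("A1", "B1"),  # assumed spatio-temporal conflict between these routes
]
selected, total = max_weight_independent_set(weights, conflicts)
```

In this toy instance the highest-weight conflict-free selection assigns route A2 to vehicle A and route B1 to vehicle B.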
Self-Supervised Learning For Robust Robotic Grasping In Dynamic Environment
Dynamic environments pose challenges for robotic grasping, including unpredictable object motion and interference with the grasp. In such conditions, traditional supervised and reinforcement learning approaches are ill-suited because they rely on large amounts of labelled data and a predefined reward signal. In this paper, we introduce a self-supervised learning (SSL) framework that uses RGB-D sensor data and proprioceptive data from robot hands to allow robots to learn and improve their grasping strategies in real time. The invariant SSL framework overcomes the deficiencies of fixed labelling by adapting to changes in the objects' behavior, improving performance in dynamic situations. The proposed method was tested in various simulations and real-world trials, obtaining grasp success rates enhanced by 15% over existing methods, especially in dynamic scenarios. Tests of adaptation time also confirmed that the system adapts faster, making it applicable to real-world uses such as industrial automation and service robotics. In future work, the proposed approach will be extended to more complex tasks, such as multi-object manipulation and operation in cluttered environments, to cover a broader range of robotic tasks.
RPCBF: Constructing Safety Filters Robust to Model Error and Disturbances via Policy Control Barrier Functions ICRA 2025
Control Barrier Functions (CBFs) have proven to be an effective tool for performing safe control synthesis for nonlinear systems. However, guaranteeing safety in the presence of disturbances and input constraints for high relative degree systems is a difficult problem. In this work, we propose the Robust Policy CBF (RPCBF), a practical method of constructing CBF approximations that is easy to implement and robust to disturbances via the estimation of a value function. We demonstrate the effectiveness of our method in simulation on a variety of high relative degree input-constrained systems. Finally, we demonstrate the benefits of RPCBF in compensating for model errors on a hardware quadcopter platform by treating the model errors as disturbances. The project page can be found at https://oswinso.xyz/rpcbf.
comment: Submitted to ICRA 2025. The project page can be found at https://oswinso.xyz/rpcbf
Motion Planning for Automata-based Objectives using Efficient Gradient-based Methods IROS 2024
In recent years, there has been increasing interest in using formal methods-based techniques to safely achieve temporal tasks, such as timed sequences of goals or patrolling objectives. Such tasks are often expressed in real-time logics such as Signal Temporal Logic (STL), whereby the logical specification is encoded into an optimization problem. Such approaches usually involve optimizing over the quantitative semantics, or robustness degree, of the logic over bounded horizons: the semantics can be encoded as mixed-integer linear constraints or into smooth approximations of the robustness degree. A major limitation of this approach is that it faces scalability challenges with respect to temporal complexity: for example, encoding long-term tasks requires storing the entire history of the system. In this paper, we present a quantitative generalization of such tasks in the form of symbolic automata objectives. Specifically, we show that symbolic automata can be expressed as matrix operators that lend themselves to automatic differentiation, allowing for the use of off-the-shelf gradient-based optimizers. We show how this helps solve the need to store arbitrarily long system trajectories, while efficiently leveraging the task structure encoded in the automaton.
comment: The paper has been accepted to IROS 2024
Latent-Predictive Empowerment: Measuring Empowerment without a Simulator
Empowerment has the potential to help agents learn large skillsets, but is not yet a scalable solution for training general-purpose agents. Recent empowerment methods learn diverse skillsets by maximizing the mutual information between skills and states; however, these approaches require a model of the transition dynamics, which can be challenging to learn in realistic settings with high-dimensional and stochastic observations. We present Latent-Predictive Empowerment (LPE), an algorithm that can compute empowerment in a more practical manner. LPE learns large skillsets by maximizing an objective that is a principled replacement for the mutual information between skills and states and that only requires a simpler latent-predictive model rather than a full simulator of the environment. We show empirically in a variety of settings--including ones with high-dimensional observations and highly stochastic transition dynamics--that our empowerment objective (i) learns similar-sized skillsets as the leading empowerment algorithm that assumes access to a model of the transition dynamics and (ii) outperforms other model-based approaches to empowerment.
Affordance-Centric Policy Learning: Sample Efficient and Generalisable Robot Policy Learning using Affordance-Centric Task Frames
Affordances are central to robotic manipulation, where most tasks can be simplified to interactions with task-specific regions on objects. By focusing on these key regions, we can abstract away task-irrelevant information, simplifying the learning process, and enhancing generalisation. In this paper, we propose an affordance-centric policy-learning approach that centres and appropriately orients a task frame on these affordance regions, allowing us to achieve both intra-category invariance -- where policies can generalise across different instances within the same object category -- and spatial invariance -- which enables consistent performance regardless of object placement in the environment. We propose a method to leverage existing generalist large vision models to extract and track these affordance frames, and demonstrate that our approach can learn manipulation tasks using behaviour cloning from as few as 10 demonstrations, with equivalent generalisation to an image-based policy trained on 305 demonstrations. We provide video demonstrations on our project site: https://affordance-policy.github.io.
comment: Video can be found on our project website: https://affordance-policy.github.io
A Novel Twisted-Winching String Actuator for Robotic Applications: Design and Validation
This paper presents a novel actuator system combining a twisted string actuator (TSA) with a winch mechanism. Relative to traditional hydraulic and pneumatic systems in robotics, TSAs are compact and lightweight but face limitations in stroke length and force-transmission ratios. Our integrated TSA-winch system overcomes these constraints by providing variable transmission ratios through dynamic adjustment. It increases actuator stroke by winching instead of overtwisting, and it improves force output by twisting. The design features a rotating turret that houses a winch, which is mounted on a bevel gear assembly driven by a through-hole drive shaft. Mathematical models are developed for the combined displacement and velocity control of this system. Experimental validation demonstrates the actuator's ability to achieve a wide range of transmission ratios and precise movement control. We present performance data on movement precision and generated forces, discussing the results in the context of existing literature. This research contributes to the development of more versatile and efficient actuation systems for advanced robotic applications and improved automation solutions.
comment: 7 pages, 11 figures, submitted to 2025 IEEE International Conference on Robotics & Automation
V3D-SLAM: Robust RGB-D SLAM in Dynamic Environments with 3D Semantic Geometry Voting
Simultaneous localization and mapping (SLAM) in highly dynamic environments is challenging due to the correlation complexity between moving objects and the camera pose. Many methods have been proposed to deal with this problem; however, the moving properties of dynamic objects with a moving camera remain unclear. Therefore, to improve SLAM's performance, minimizing disruptive events of moving objects with a physical understanding of 3D shapes and dynamics of objects is needed. In this paper, we propose a robust method, V3D-SLAM, to remove moving objects via two lightweight re-evaluation stages, including identifying potentially moving and static objects using a spatial-reasoned Hough voting mechanism and refining static objects by detecting dynamic noise caused by intra-object motions using Chamfer distances as similarity measurements. Our experiment on the TUM RGB-D benchmark on dynamic sequences with ground-truth camera trajectories showed that our methods outperform the most recent state-of-the-art SLAM methods. Our source code is available at https://github.com/tuantdang/v3d-slam.
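The V3D-SLAM abstract above uses Chamfer distances as a similarity measure between object point sets. A minimal pure-Python sketch of one common symmetric form of the Chamfer distance follows. It averages unsquared nearest-neighbour distances in both directions, which is an assumption, since the exact variant used by the authors is not stated in the abstract. The point sets are hypothetical.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty 3D point sets.

    For each point in one set, take the Euclidean distance to its nearest
    neighbour in the other set, average these, and sum both directions.
    Brute-force O(len(a) * len(b)); fine for small illustrative inputs.
    """
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(points_a, points_b) + one_way(points_b, points_a)


# Hypothetical object observed in two frames: unchanged vs. shifted 0.5 in z.
frame_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_2_static = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_2_moved = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]

d_static = chamfer_distance(frame_1, frame_2_static)
d_moved = chamfer_distance(frame_1, frame_2_moved)
```

A distance near zero suggests the object's points did not move between frames, while a larger value hints at motion; any threshold for that decision would be application-specific.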
MFC-EQ: Mean-Field Control with Envelope Q-Learning for Moving Decentralized Agents in Formation IROS 2024
We study a decentralized version of Moving Agents in Formation (MAiF), a variant of Multi-Agent Path Finding aiming to plan collision-free paths for multiple agents with the dual objectives of reaching their goals quickly while maintaining a desired formation. The agents must balance these objectives under conditions of partial observation and limited communication. The formation maintenance depends on the joint state of all agents, whose dimensionality increases exponentially with the number of agents, rendering the learning process intractable. Additionally, learning a single policy that can accommodate different linear preferences for these two objectives presents a significant challenge. In this paper, we propose Mean-Field Control with Envelope $Q$-learning (MFC-EQ), a scalable and adaptable learning framework for this bi-objective multi-agent problem. We approximate the dynamics of all agents using mean-field theory while learning a universal preference-agnostic policy through envelope $Q$-learning. Our empirical evaluation of MFC-EQ across numerous instances shows that it outperforms state-of-the-art centralized MAiF baselines. Furthermore, MFC-EQ effectively handles more complex scenarios where the desired formation changes dynamically -- a challenge that existing MAiF planners cannot address.
comment: Accepted to IROS 2024
A Lyapunov-Based Switching Scheme for Selecting the Stable Closed-Loop Fixed Attitude-Error Quaternion During Flight
We present a switching scheme, which uses both the attitude-error quaternion (AEQ) and the angular-velocity error, for controlling the rotational degrees of freedom of an uncrewed aerial vehicle (UAV) during flight. In this approach, the proposed controller continually selects the stable closed-loop (CL) equilibrium AEQ corresponding to the smallest cost between those computed with two energy-based Lyapunov functions. To analyze and enforce the stability of the CL switching dynamics, we use basic nonlinear theory. This research problem is relevant because the selection of the stable CL equilibrium AEQ directly determines the power and energy requirements of the controlled UAV during flight. To test and demonstrate the implementation, suitability, functionality, and performance of the proposed approach, we present experimental results obtained using a 31-gram quadrotor, which was controlled to execute high-speed yaw maneuvers in flight. These flight tests show that the proposed switching controller can respectively reduce the control effort and rotational power by as much as 49.75 % and 28.14 %, on average, compared to those corresponding to an often-used benchmark controller.
comment: 8 pages, 5 figures, 2024 7th Iberian Robotics Conference (ROBOT)
Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation
Enabling mobile robots to perform long-term tasks in dynamic real-world environments is a formidable challenge, especially when the environment changes frequently due to human-robot interactions or the robot's own actions. Traditional methods typically assume static scenes, which limits their applicability in the continuously changing real world. To overcome these limitations, we present DovSG, a novel mobile manipulation framework that leverages dynamic open-vocabulary 3D scene graphs and a language-guided task planning module for long-term task execution. DovSG takes RGB-D sequences as input and utilizes vision-language models (VLMs) for object detection to obtain high-level object semantic features. Based on the segmented objects, a structured 3D scene graph is generated for low-level spatial relationships. Furthermore, an efficient mechanism for locally updating the scene graph allows the robot to adjust parts of the graph dynamically during interactions without the need for full scene reconstruction. This mechanism is particularly valuable in dynamic environments, enabling the robot to continually adapt to scene changes and effectively support the execution of long-term tasks. We validated our system in real-world environments with varying degrees of manual modifications, demonstrating its effectiveness and superior performance in long-term tasks. Our project page is available at: https://BJHYZJ.github.io/DoviSG.
comment: 8 pages, 5 figures
An Online Self-learning Graph-based Lateral Controller for Self-Driving Cars
The hype around self-driving cars has been growing over the past years and has sparked much research. Several modules in self-driving cars are thoroughly investigated to ensure safety, comfort, and efficiency, among which the controller is crucial. The controller module can be categorized into longitudinal and lateral controllers in which the task of the former is to follow the reference velocity, and the latter is to reduce the lateral displacement error from the reference path. Generally, a tuned controller is not sufficient to perform in all environments. Thus, a controller that can adapt to changing conditions is necessary for autonomous driving. Furthermore, these controllers often depend on vehicle models that also need to adapt over time due to varying environments. This paper uses graphs to present novel techniques to learn the vehicle model and the lateral controller online. First, a heterogeneous graph is presented depicting the current states of and inputs to the vehicle. The vehicle model is then learned online using known physical constraints in conjunction with the processing of the graph through a Graph Neural Network structure. Next, another heterogeneous graph - depicting the transition from current to desired states - is processed through another Graph Neural Network structure to generate the steering command on the fly. Finally, the performance of this self-learning model-based lateral controller is evaluated and shown to be satisfactory on an open-source autonomous driving platform called CARLA.
comment: The article has been published in the early access area on IEEE Xplore for the IEEE Transactions on Intelligent Vehicles (2024). This is the accepted version. Number of pages: 12 pages, Number of figures: 10
Making a Complete Mess and Getting Away with it: Traveling Salesperson Problems with Circle Placement Variants
This paper explores a variation of the Traveling Salesperson Problem, where the agent places a circular obstacle next to each node once it visits it. Referred to as the Traveling Salesperson Problem with Circle Placement (TSP-CP), the aim is to maximize the obstacle radius for which a valid closed tour exists and then minimize the tour cost. The TSP-CP finds relevance in various real-world applications, such as harvesting, quarrying, and open-pit mining. We propose several novel solvers to address the TSP-CP, its variant tailored for Dubins vehicles, and a crucial subproblem known as the Traveling Salesperson Problem on self-deleting graphs (TSP-SD). Our extensive experimental results show that the proposed solvers outperform the current state-of-the-art on related problems in solution quality.
comment: 8 pages, 7 figures, accepted to IEEE Robotics and Automation Letters in August 2024
Autonomous Improvement of Instruction Following Skills via Foundation Models
Intelligent instruction-following robots capable of improving from autonomously collected experience have the potential to transform robot learning: instead of collecting costly teleoperated demonstration data, large-scale deployment of fleets of robots can quickly collect larger quantities of autonomous data that can collectively improve their performance. However, autonomous improvement requires solving two key problems: (i) fully automating a scalable data collection procedure that can collect diverse and semantically meaningful robot data and (ii) learning from non-optimal, autonomous data with no human annotations. To this end, we propose a novel approach that addresses these challenges, allowing instruction-following policies to improve from autonomously collected data without human supervision. Our framework leverages vision-language models to collect and evaluate semantically meaningful experiences in new environments, and then utilizes a decomposition of instruction following tasks into (semantic) language-conditioned image generation and (non-semantic) goal reaching, which makes it significantly more practical to improve from this autonomously collected data without any human annotations. We carry out extensive experiments in the real world to demonstrate the effectiveness of our approach, and find that in a suite of unseen environments, the robot policy can be improved 2x with autonomously collected data. We open-source the code for our semantic autonomous improvement pipeline, as well as our autonomous dataset of 30.5K trajectories collected across five tabletop environments.
comment: 2024 Conference on Robot Learning (CoRL)
LoRD: Adapting Differentiable Driving Policies to Distribution Shifts
Distribution shifts between operational domains can severely affect the performance of learned models in self-driving vehicles (SDVs). While this is a well-established problem, prior work has mostly explored naive solutions such as fine-tuning, focusing on the motion prediction task. In this work, we explore novel adaptation strategies for differentiable autonomy stacks consisting of prediction, planning, and control, perform evaluation in closed-loop, and investigate the often-overlooked issue of catastrophic forgetting. Specifically, we introduce two simple yet effective techniques: a low-rank residual decoder (LoRD) and multi-task fine-tuning. Through experiments across three models conducted on two real-world autonomous driving datasets (nuPlan, exiD), we demonstrate the effectiveness of our methods and highlight a significant performance gap between open-loop and closed-loop evaluation in prior approaches. Our approach improves forgetting by up to 23.33% and the closed-loop OOD driving score by 8.83% in comparison to standard fine-tuning.
comment: Under Review
DextrAH-G: Pixels-to-Action Dexterous Arm-Hand Grasping with Geometric Fabrics
A pivotal challenge in robotics is achieving fast, safe, and robust dexterous grasping across a diverse range of objects, an important goal within industrial applications. However, existing methods often have very limited speed, dexterity, and generality, along with limited or no hardware safety guarantees. In this work, we introduce DextrAH-G, a depth-based dexterous grasping policy trained entirely in simulation that combines reinforcement learning, geometric fabrics, and teacher-student distillation. We address key challenges in joint arm-hand policy learning, such as high-dimensional observation and action spaces, the sim2real gap, collision avoidance, and hardware constraints. DextrAH-G enables a 23-motor arm-hand robot to safely and continuously grasp and transport a large variety of objects at high speed using multi-modal inputs including depth images, allowing generalization across object geometry. Videos at https://sites.google.com/view/dextrah-g.
Prompt a Robot to Walk with Large Language Models
Large language models (LLMs) pre-trained on vast internet-scale data have showcased remarkable capabilities across diverse domains. Recently, there has been escalating interest in deploying LLMs for robotics, aiming to harness the power of foundation models in real-world settings. However, this approach faces significant challenges, particularly in grounding these models in the physical world and in generating dynamic robot motions. To address these issues, we introduce a novel paradigm in which we use few-shot prompts collected from the physical environment, enabling the LLM to autoregressively generate low-level control commands for robots without task-specific fine-tuning. Experiments across various robots and environments validate that our method can effectively prompt a robot to walk. We thus illustrate how LLMs can proficiently function as low-level feedback controllers for dynamic motion control even in high-dimensional robotic systems. The project website and source code can be found at: https://prompt2walk.github.io/ .
comment: Conference on Decision and Control (CDC), 2024
LAP, Using Action Feasibility for Improved Uncertainty Alignment of Large Language Model Planners
Large language models (LLMs) showcase many desirable traits for intelligent and helpful robots. However, they are also known to hallucinate predictions. This issue is exacerbated in robotics where LLM hallucinations may result in robots confidently executing plans that are contrary to user goals, relying more frequently on human assistance, or preventing the robot from asking for help at all. In this work, we present LAP, a novel approach for utilizing off-the-shelf LLMs, alongside a novel Action feasibility metric, in robotic Planners that minimize harmful hallucinations and human intervention. Our key finding is that calculating and leveraging a new metric, which we call A-Feasibility, a measure of whether a given action is possible and safe in the provided scene, helps to mitigate hallucinations in LLM predictions and better align the LLM's confidence measure with the probability of success. We specifically propose an A-Feasibility metric that combines scene context with prompting an LLM to determine if a given action is possible and safe in the scene, using the LLM's response to compute the score. Through experiments in both simulation and the real world on tasks with a variety of ambiguities, we show that LAP significantly increases success rate and decreases the amount of human intervention required relative to prior art. For example, in our real-world testing paradigm, LAP decreases the human help rate of previous methods by over 33% at a success rate of 70%.
Shelf-Supervised Cross-Modal Pre-Training for 3D Object Detection
State-of-the-art 3D object detectors are often trained on massive labeled datasets. However, annotating 3D bounding boxes remains prohibitively expensive and time-consuming, particularly for LiDAR. Instead, recent works demonstrate that self-supervised pre-training with unlabeled data can improve detection accuracy with limited labels. Contemporary methods adapt best practices for self-supervised learning from the image domain to point clouds (such as contrastive learning). However, publicly available 3D datasets are considerably smaller and less diverse than those used for image-based self-supervised learning, limiting their effectiveness. We do note, however, that such 3D data is naturally collected in a multimodal fashion, often paired with images. Rather than pre-training with only self-supervised objectives, we argue that it is better to bootstrap point cloud representations using image-based foundation models trained on internet-scale data. Specifically, we propose a shelf-supervised approach (i.e., supervised with off-the-shelf image foundation models) for generating zero-shot 3D bounding boxes from paired RGB and LiDAR data. Pre-training 3D detectors with such pseudo-labels yields significantly better semi-supervised detection accuracy than prior self-supervised pretext tasks. Importantly, we show that image-based shelf-supervision is helpful for training LiDAR-only, RGB-only and multi-modal (RGB + LiDAR) detectors. We demonstrate the effectiveness of our approach on nuScenes and WOD, significantly improving over prior work in limited data settings. Our code is available at https://github.com/meharkhurana03/cm3d
comment: The first two authors contributed equally. This work has been accepted to the Conference on Robot Learning (CoRL) 2024
Equivariant Diffusion Policy
Recent work has shown diffusion models are an effective approach to learning the multimodal distributions arising from demonstration data in behavior cloning. However, a drawback of this approach is the need to learn a denoising function, which is significantly more complex than learning an explicit policy. In this work, we propose Equivariant Diffusion Policy, a novel diffusion policy learning method that leverages domain symmetries to obtain better sample efficiency and generalization in the denoising function. We theoretically analyze the $\mathrm{SO}(2)$ symmetry of full 6-DoF control and characterize when a diffusion model is $\mathrm{SO}(2)$-equivariant. We furthermore evaluate the method empirically on a set of 12 simulation tasks in MimicGen, and show that it obtains a success rate that is, on average, 21.9% higher than the baseline Diffusion Policy. We also evaluate the method on a real-world system to show that effective policies can be learned with relatively few training samples, whereas the baseline Diffusion Policy cannot.
comment: Conference on Robot Learning 2024, Oral Presentation
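To make the $\mathrm{SO}(2)$-equivariance property from the Equivariant Diffusion Policy abstract above concrete, the sketch below numerically checks whether a planar policy $f$ satisfies $f(R_\theta x) = R_\theta f(x)$. Both policies are hypothetical toy functions chosen for illustration; they are unrelated to the paper's denoising network, and the check covers only 2D vectors rather than full 6-DoF actions.

```python
import math


def rotate(theta, v):
    """Rotate a 2D vector v counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = v
    return (c * x - s * y, s * x + c * y)


def equivariance_error(policy, obs, theta):
    """Distance between policy(R obs) and R policy(obs); zero if equivariant."""
    rotated_input = policy(rotate(theta, obs))
    rotated_output = rotate(theta, policy(obs))
    return math.dist(rotated_input, rotated_output)


def move_to_origin(obs):
    """Hypothetical equivariant policy: step halfway toward the origin."""
    x, y = obs
    return (-0.5 * x, -0.5 * y)


def move_along_x(obs):
    """Hypothetical non-equivariant policy: ignores the y coordinate."""
    x, _ = obs
    return (x, 0.0)


obs = (1.0, 1.0)
theta = math.pi / 2
err_equivariant = equivariance_error(move_to_origin, obs, theta)
err_broken = equivariance_error(move_along_x, obs, theta)
```

The first policy commutes with rotation up to floating-point error, while the second does not, because it singles out a preferred axis.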
Learning Quadruped Locomotion Using Differentiable Simulation
This work explores the potential of using differentiable simulation for learning quadruped locomotion. Differentiable simulation promises fast convergence and stable training by computing low-variance first-order gradients using robot dynamics. However, its usage for legged robots is still limited to simulation. The main challenge lies in the complex optimization landscape of robotic tasks due to discontinuous dynamics. This work proposes a new differentiable simulation framework to overcome these challenges. Our approach combines a high-fidelity, non-differentiable simulator for forward dynamics with a simplified surrogate model for gradient backpropagation. This approach maintains simulation accuracy by aligning the robot states from the surrogate model with those of the precise, non-differentiable simulator. Our framework enables learning quadruped walking in simulation in minutes without parallelization. When augmented with GPU parallelization, our approach allows the quadruped robot to master diverse locomotion skills on challenging terrains in minutes. We demonstrate that differentiable simulation outperforms a reinforcement learning algorithm (PPO) by achieving significantly better sample efficiency while maintaining its effectiveness in handling large-scale environments. Our method represents one of the first successful applications of differentiable simulation to real-world quadruped locomotion, offering a compelling alternative to traditional RL methods.
comment: 8th Annual Conference on Robot Learning (CoRL)
Explicit Contact Optimization in Whole-Body Contact-Rich Manipulation
Humans can exploit contacts anywhere on their body surface to manipulate large and heavy items, objects normally out of reach or multiple objects at once. However, such manipulation through contacts using the whole surface of the body remains extremely challenging to achieve on robots. This can be labelled as the Whole-Body Contact-Rich Manipulation (WBCRM) problem. In addition to the high-dimensionality of the Contact-Rich Manipulation problem due to the combinatorics of contact modes, admitting contact creation anywhere on the body surface adds complexity, which hinders planning of manipulation within a reasonable time. We address this computational problem by formulating the contact and motion planning of planar WBCRM as hierarchical continuous optimization problems. To enable this formulation, we propose a novel continuous explicit representation of the robot surface, which we believe to be foundational for future research using continuous optimization for WBCRM. Our results demonstrate a significant improvement in convergence, planning time and feasibility - with, on average, 99% fewer iterations and a 96% reduction in time to find a solution over the considered scenarios, without recourse to prone-to-failure trajectory refinement steps.
AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation
The ability to reflect on and correct failures is crucial for robotic systems to interact stably with real-life objects. Observing the generalization and reasoning capabilities of Multimodal Large Language Models (MLLMs), previous approaches have aimed to utilize these models to enhance robotic systems accordingly. However, these methods typically focus on high-level planning corrections using an additional MLLM, with limited use of failed samples to correct low-level contact pose errors, which are particularly prone to occur during articulated object manipulation. To address this gap, we propose an Autonomous Interactive Correction (AIC) MLLM, which makes use of previous low-level interaction experiences to correct SE(3) pose predictions for articulated objects. Specifically, AIC MLLM is initially fine-tuned to acquire both pose prediction and feedback prompt comprehension abilities. We design two types of prompt instructions for interactions with objects: 1) visual masks to highlight unmovable parts for position correction, and 2) textual descriptions to indicate potential directions for rotation correction. During inference, a Feedback Information Extraction module is introduced to recognize the failure cause, allowing AIC MLLM to adaptively correct the pose prediction using the corresponding prompts. To further enhance manipulation stability, we devise a Test Time Adaptation strategy that enables AIC MLLM to better adapt to the current scene configuration. Finally, extensive experiments are conducted in both simulated and real-world environments to evaluate the proposed method. The results demonstrate that our AIC MLLM can efficiently correct failure samples by leveraging interaction experience prompts. Our project website is https://sites.google.com/view/aic-mllm.
Optimizing Structured Data Processing through Robotic Process Automation
Robotic Process Automation (RPA) has emerged as a game-changing technology in data extraction, revolutionizing the way organizations process and analyze large volumes of documents such as invoices, purchase orders, and payment advices. This study investigates the use of RPA for structured data extraction and evaluates its advantages over manual processes. By comparing human-performed tasks with those executed by RPA software bots, we assess efficiency and accuracy in data extraction from invoices, focusing on the effectiveness of the RPA system. Through four distinct scenarios involving varying numbers of invoices, we measure efficiency in terms of time and effort required for task completion, as well as accuracy by comparing error rates between manual and RPA processes. Our findings highlight the significant efficiency gains achieved by RPA, with bots completing tasks in significantly less time compared to manual efforts across all cases. Moreover, the RPA system consistently achieves perfect accuracy, mitigating the risk of errors and enhancing process reliability. These results underscore the transformative potential of RPA in optimizing operational efficiency, reducing human labor costs, and improving overall business performance.
Reasoning Grasping via Multimodal Large Language Model
Despite significant progress in robotic systems for operation within human-centric environments, existing models still heavily rely on explicit human commands to identify and manipulate specific objects. This limits their effectiveness in environments where understanding and acting on implicit human intentions are crucial. In this study, we introduce a novel task: reasoning grasping, where robots need to generate grasp poses based on indirect verbal instructions or intentions. To accomplish this, we propose an end-to-end reasoning grasping model that integrates a multimodal Large Language Model (LLM) with a vision-based robotic grasping framework. In addition, we present the first reasoning grasping benchmark dataset, generated from the GraspNet-1Billion dataset and incorporating implicit instructions for object-level and part-level grasping. Our results show that directly integrating CLIP or LLaVA with the grasp detection model performs poorly on the challenging reasoning grasping tasks, while our proposed model demonstrates significantly enhanced performance both in the reasoning grasping benchmark and in real-world experiments.
comment: CoRL 2024
Ego-to-Exo: Interfacing Third Person Visuals from Egocentric Views in Real-time for Improved ROV Teleoperation
Underwater ROVs (Remotely Operated Vehicles) are unmanned submersible vehicles designed for exploring and operating in the depths of the ocean. Despite using high-end cameras, typical teleoperation engines based on first-person (egocentric) views limit a surface operator's ability to maneuver the ROV in complex deep-water missions. In this paper, we present an interactive teleoperation interface that enhances the operational capabilities via increased situational awareness. This is accomplished by (i) offering on-demand "third"-person (exocentric) visuals from past egocentric views, and (ii) facilitating enhanced peripheral information with augmented ROV pose information in real-time. We achieve this by integrating a 3D geometry-based Ego-to-Exo view synthesis algorithm into a monocular SLAM system for accurate trajectory estimation. The proposed closed-form solution only uses past egocentric views from the ROV and a SLAM backbone for pose estimation, which makes it portable to existing ROV platforms. Unlike data-driven solutions, it is invariant to applications and waterbody-specific scenes. We validate the geometric accuracy of the proposed framework through extensive experiments of 2-DOF indoor navigation and 6-DOF underwater cave exploration in challenging low-light conditions. A subjective evaluation on 15 human teleoperators further confirms the effectiveness of the integrated features for improved teleoperation. We demonstrate the benefits of dynamic Ego-to-Exo view generation and real-time pose rendering for remote ROV teleoperation by following navigation guides such as cavelines inside underwater caves. This new way of interactive ROV teleoperation opens up promising opportunities for future research in subsea telerobotics.
comment: V3, 9 pages
Learning to Singulate Objects in Packed Environments using a Dexterous Hand
Robotic object singulation, where a robot must isolate, grasp, and retrieve a target object in a cluttered environment, is a fundamental challenge in robotic manipulation. This task is difficult due to occlusions and because other objects act as obstacles for manipulation. A robot must also reason about the effect of object-object interactions as it tries to singulate the target. Prior work has explored object singulation in scenarios where there is enough free space to perform relatively long pushes to separate objects, in contrast to when space is tight and objects have little separation from each other. In this paper, we propose the Singulating Objects in Packed Environments (SOPE) framework, a novel method that involves a displacement-based state representation and a multi-phase reinforcement learning procedure that enables singulation using the 16-DOF Allegro Hand. We demonstrate extensive experiments in Isaac Gym simulation, showing the ability of our system to singulate a target object in clutter. We directly transfer the policy trained in simulation to the real world. Over 250 physical robot manipulation trials, our method obtains a success rate of 79.2%, outperforming alternative learning and non-learning methods.
Harmonic Mobile Manipulation
Recent advancements in robotics have enabled robots to navigate complex scenes or manipulate diverse objects independently. However, robots still struggle with many household tasks requiring coordinated behaviors, such as opening doors. The factorization of navigation and manipulation, while effective for some tasks, fails in scenarios requiring coordinated actions. To address this challenge, we introduce HarmonicMM, an end-to-end learning method that optimizes both navigation and manipulation, showing notable improvement over existing techniques in everyday tasks. This approach is validated in simulated and real-world environments and adapts to novel unseen settings without additional tuning. Our contributions include a new benchmark for mobile manipulation and the successful deployment with only RGB visual observation in a real unseen apartment, demonstrating the potential for practical indoor robot deployment in daily life. More results are on our project site: https://rchalyang.github.io/HarmonicMM/
M3Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes
We propose M^3Bench, a new benchmark of whole-body motion generation for mobile manipulation tasks. Given a 3D scene context, M^3Bench requires an embodied agent to understand its configuration, environmental constraints and task objectives, then generate coordinated whole-body motion trajectories for object rearrangement tasks. M^3Bench features 30k object rearrangement tasks across 119 diverse scenes, providing expert demonstrations generated by our newly developed M^3BenchMaker. This automatic data generation tool produces coordinated whole-body motion trajectories from high-level task instructions, requiring only basic scene and robot information. Our benchmark incorporates various task splits to assess generalization across different dimensions and leverages realistic physics simulation for trajectory evaluation. Through extensive experimental analyses, we reveal that state-of-the-art models still struggle with coordinated base-arm motion while adhering to environment-context and task-specific constraints, highlighting the need to develop new models that address this gap. Through M^3Bench, we aim to facilitate future robotics research towards more adaptive and capable mobile manipulation in diverse, real-world environments.
comment: Code and data set will be released after acceptance
OrbitGrasp: $SE(3)$-Equivariant Grasp Learning
While grasp detection is an important part of any robotic manipulation pipeline, reliable and accurate grasp detection in $SE(3)$ remains a research challenge. Many robotics applications in unstructured environments such as the home or warehouse would benefit a lot from better grasp performance. This paper proposes a novel framework for detecting $SE(3)$ grasp poses based on point cloud input. Our main contribution is to propose an $SE(3)$-equivariant model that maps each point in the cloud to a continuous grasp quality function over the 2-sphere $S^2$ using a spherical harmonic basis. Compared with reasoning about a finite set of samples, this formulation improves the accuracy and efficiency of our model when a large number of samples would otherwise be needed. In order to accomplish this, we propose a novel variation on EquiFormerV2 that leverages a UNet-style encoder-decoder architecture to enlarge the number of points the model can handle. Our resulting method, which we name $\textit{OrbitGrasp}$, significantly outperforms baselines in both simulation and physical experiments.
comment: Conference on Robot Learning 2024
Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for various sequential decision-making and control tasks. Unlike their single-agent counterparts, multi-agent systems necessitate successful cooperation among the agents. The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals. These challenges become more pronounced under partial observability and the lack of prior knowledge about agent heterogeneity. While notable studies use intrinsic motivation (IM) to address reward sparsity or cooperation in decentralized settings, those dealing with heterogeneity typically assume centralized training, parameter sharing, and agent indexing. To overcome these limitations, we propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies in decentralized settings, under the challenges of partial observability and reward sparsity. Evaluation of CoHet in the Multi-agent Particle Environment (MPE) and Vectorized Multi-Agent Simulator (VMAS) benchmarks demonstrates superior performance compared to the state-of-the-art in a range of cooperative multi-agent scenarios. Our research is supplemented by an analysis of the impact of the agent dynamics model on the intrinsic motivation module, insights into the performance of different CoHet variants, and its robustness to an increasing number of heterogeneous agents.
comment: 9 pages, 5 figures
Dynamic Open Vocabulary Enhanced Safe-landing with Intelligence (DOVESEI) IROS 2023
This work targets what we consider to be the foundational step for urban airborne robots, a safe landing. Our attention is directed toward what we deem the most crucial aspect of the safe landing perception stack: segmentation. We present a streamlined reactive UAV system that employs visual servoing by harnessing the capabilities of open vocabulary image segmentation. This approach can adapt to various scenarios with minimal adjustments, bypassing the necessity for extensive data accumulation for refining internal models, thanks to its open vocabulary methodology. Given the limitations imposed by local authorities, our primary focus centers on operations originating from altitudes of 100 meters. This choice is deliberate, as numerous preceding works have dealt with altitudes up to 30 meters, aligning with the capabilities of small stereo cameras. Consequently, we leave the remaining 20 meters to be navigated using conventional 3D path planning methods. Utilizing monocular cameras and image segmentation, our findings demonstrate the system's capability to successfully execute landing maneuvers at altitudes as low as 20 meters. However, this approach is vulnerable to intermittent and occasionally abrupt fluctuations in the segmentation between frames in a video stream. To address this challenge, we enhance the image segmentation output by introducing what we call a dynamic focus: a masking mechanism that self-adjusts according to the current landing stage. This dynamic focus guides the control system to avoid regions beyond the drone's safety radius projected onto the ground, thus mitigating the problems with fluctuations. With this supplementary layer, our experiments improved the landing success rate almost tenfold compared to global segmentation. All the source code is open source and available online (github.com/MISTLab/DOVESEI).
comment: IROS 2023 The Last-Mile Robotics Workshop
FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation
We introduce a novel approach for manipulating articulated objects that are visually ambiguous, such as doors that are symmetric or heavily occluded. These ambiguities can cause uncertainty over different possible articulation modes: for instance, when the articulation direction (e.g. push, pull, slide) or location (e.g. left side, right side) of a fully closed door is uncertain, or when distinguishing features like the plane of the door are occluded due to the viewing angle. To tackle these challenges, we propose a history-aware diffusion network that can model multi-modal distributions over articulation modes for articulated objects; our method further uses observation history to distinguish between modes and make stable predictions under occlusions. Experiments and analysis demonstrate that our method achieves state-of-the-art performance on articulated object manipulation and dramatically improves performance for articulated objects containing visual ambiguities. Our project website is available at https://flowbothd.github.io/.
comment: Accepted to CoRL 2024
Multiagent Systems
G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks
Recent advancements in large language model (LLM)-based agents have demonstrated that collective intelligence can significantly surpass the capabilities of individual agents, primarily due to well-crafted inter-agent communication topologies. Despite the diverse and high-performing designs available, practitioners often face confusion when selecting the most effective pipeline for their specific task: which topology is the best choice for my task, avoiding unnecessary communication token overhead while ensuring a high-quality solution? In response to this dilemma, we introduce G-Designer, an adaptive, efficient, and robust solution for multi-agent deployment, which dynamically designs task-aware, customized communication topologies. Specifically, G-Designer models the multi-agent system as a multi-agent network, leveraging a variational graph auto-encoder to encode both the nodes (agents) and a task-specific virtual node, and decodes a task-adaptive and high-performing communication topology. Extensive experiments on six benchmarks showcase that G-Designer is: (1) high-performing, achieving superior results on MMLU with accuracy at $84.50\%$ and on HumanEval with pass@1 at $89.90\%$; (2) task-adaptive, architecting communication protocols tailored to task difficulty, reducing token consumption by up to $95.33\%$ on HumanEval; and (3) adversarially robust, defending against agent adversarial attacks with merely $0.3\%$ accuracy drop.
Improve Value Estimation of Q Function and Reshape Reward with Monte Carlo Tree Search
Reinforcement learning has achieved remarkable success in perfect information games such as Go and Atari, enabling agents to compete at the highest levels against human players. However, research in reinforcement learning for imperfect information games has been relatively limited due to the more complex game structures and randomness. Traditional methods face challenges in training and improving performance in imperfect information games due to issues like inaccurate Q value estimation and reward sparsity. In this paper, we focus on Uno, an imperfect information game, and aim to address these problems by reducing Q value overestimation and reshaping the reward function. We propose a novel algorithm that utilizes Monte Carlo Tree Search to improve the value estimation in the Q function. Even though we choose Double Deep Q Learning as the foundational framework in this paper, our method can be generalized and used in any algorithm that needs Q value estimation, such as Actor-Critic. Additionally, we employ Monte Carlo Tree Search to reshape the reward structure in the game environment. We compared our algorithm with several traditional methods applied to games, such as Double Deep Q Learning, Deep Monte Carlo, and Neural Fictitious Self Play, and the experiments demonstrate that our algorithm consistently outperforms these approaches, especially as the number of players in Uno increases, indicating a higher level of difficulty.
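For reference, the Double Deep Q Learning baseline named in the abstract above selects the next action with the online network and evaluates that action with the target network, which is the standard way it limits overestimation. The sketch below shows only that standard target computation. It does not include the paper's Monte Carlo Tree Search correction, and the function and argument names are illustrative assumptions.

```python
from typing import Sequence


def double_q_target(reward: float,
                    next_q_online: Sequence[float],
                    next_q_target: Sequence[float],
                    gamma: float = 0.99,
                    done: bool = False) -> float:
    """Standard Double Q-learning bootstrap target for one transition.

    next_q_online / next_q_target are the Q-values of every action in the
    next state, as estimated by the online and target networks
    (hypothetical inputs; any function approximator could produce them).
    """
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    # Action selection uses the online estimate...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...while action evaluation uses the target estimate.
    return reward + gamma * next_q_target[best_action]
```

With online values `[1.0, 3.0, 2.0]` and target values `[10.0, 20.0, 30.0]`, the online network picks action 1, so the target bootstraps from 20.0 rather than from the target network's own maximum of 30.0.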
Agent-Based Modelling of Older Adult Needs for Autonomous Mobility-on-Demand: A Case Study in Winnipeg, Canada
As populations continue to age across many nations, ensuring accessible and efficient transportation options for older adults has become an increasingly important concern. Autonomous Mobility-on-Demand (AMoD) systems have emerged as a potential solution to address the needs faced by older adults in their daily mobility. However, estimation of older adult mobility needs, and how they vary over space and time, is crucial for effective planning and implementation of such a service, and conventional four-step approaches lack the granularity to fully account for these needs. To address this challenge, we propose an agent-based model of older adults' mobility demand in Winnipeg, Canada. The model is built for 2022 using primarily open data, and is implemented in the Multi-Agent Transport Simulation (MATSim) toolkit. After calibration to accurately reproduce observed travel behaviors, a new AMoD service is tested in simulation and its potential adoption among Winnipeg older adults is explored. The model can help policy makers estimate the needs of elderly populations for door-to-door transportation and can guide the design of AMoD transport systems.
MFC-EQ: Mean-Field Control with Envelope Q-Learning for Moving Decentralized Agents in Formation IROS 2024
We study a decentralized version of Moving Agents in Formation (MAiF), a variant of Multi-Agent Path Finding aiming to plan collision-free paths for multiple agents with the dual objectives of reaching their goals quickly while maintaining a desired formation. The agents must balance these objectives under conditions of partial observation and limited communication. The formation maintenance depends on the joint state of all agents, whose dimensionality increases exponentially with the number of agents, rendering the learning process intractable. Additionally, learning a single policy that can accommodate different linear preferences for these two objectives presents a significant challenge. In this paper, we propose Mean-Field Control with Envelope $Q$-learning (MFC-EQ), a scalable and adaptable learning framework for this bi-objective multi-agent problem. We approximate the dynamics of all agents using mean-field theory while learning a universal preference-agnostic policy through envelope $Q$-learning. Our empirical evaluation of MFC-EQ across numerous instances shows that it outperforms state-of-the-art centralized MAiF baselines. Furthermore, MFC-EQ effectively handles more complex scenarios where the desired formation changes dynamically -- a challenge that existing MAiF planners cannot address.
comment: Accepted to IROS 2024
Analyzing Incentives and Fairness in Ordered Weighted Average for Facility Location Games
Facility location games provide an abstract model of mechanism design. In such games, a mechanism takes a profile of $n$ single-peaked preferences over an interval as an input and determines the location of a facility on the interval. In this paper, we restrict our attention to distance-based single-peaked preferences and focus on a well-known class of parameterized mechanisms called ordered weighted average methods, which was proposed by Yager in 1988 and contains several practical implementations such as the standard average and the Olympic average. We comprehensively analyze their performance in terms of both incentives and fairness. More specifically, we provide necessary and sufficient conditions on their parameters to achieve strategy-proofness, non-obvious manipulability, individual fair share, and proportional fairness, respectively.
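To make the mechanism class above concrete, an ordered weighted average sorts the reported peaks and then applies a fixed weight to each rank, so the weights attach to positions in the ordering and not to particular agents. The sketch below is a minimal illustration under assumed conventions: ascending sort order, non-negative weights summing to one, and hypothetical helper names. The standard average and the Olympic average (drop the lowest and highest report, average the rest) are two weight vectors within this class.

```python
from typing import List, Sequence


def owa_location(reports: Sequence[float], weights: Sequence[float]) -> float:
    """Facility location chosen by an ordered weighted average of the reports."""
    if len(reports) != len(weights):
        raise ValueError("need exactly one weight per report")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(reports)  # weights apply to ranks, not to agents
    return sum(w * x for w, x in zip(weights, ordered))


def average_weights(n: int) -> List[float]:
    """Equal weight on every rank: the standard average."""
    return [1.0 / n] * n


def olympic_weights(n: int) -> List[float]:
    """Zero weight on the smallest and largest report, equal weight elsewhere."""
    if n < 3:
        raise ValueError("the Olympic average needs at least three reports")
    return [0.0] + [1.0 / (n - 2)] * (n - 2) + [0.0]
```

For the reports `[0.1, 0.2, 0.4, 1.0, 0.9]`, the standard average places the facility near 0.52 and the Olympic average near 0.5, since the latter ignores the extreme reports 0.1 and 1.0.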
Plurals: A System for Guiding LLMs Via Simulated Social Ensembles
Recent debates raised concerns that language models may favor certain viewpoints. But what if the solution is not to aim for a 'view from nowhere' but rather to leverage different viewpoints? We introduce Plurals, a system and Python library for pluralistic AI deliberation. Plurals consists of Agents (LLMs, optionally with personas) which deliberate within customizable Structures, with Moderators overseeing deliberation. Plurals is a generator of simulated social ensembles. Plurals integrates with government datasets to create nationally representative personas, includes deliberation templates inspired by democratic deliberation theory, and allows users to customize both information-sharing structures and deliberation behavior within Structures. Six case studies demonstrate fidelity to theoretical constructs and efficacy. Three randomized experiments show simulated focus groups produced output resonant with an online sample of the relevant audiences (chosen over zero-shot generation in 75% of trials). Plurals is both a paradigm and a concrete system for pluralistic AI. The Plurals library is available at https://github.com/josh-ashkinaze/plurals and will be continually updated.
Toward Universal and Interpretable World Models for Open-ended Learning Agents
We introduce a generic, compositional and interpretable class of generative world models that supports open-ended learning agents. This is a sparse class of Bayesian networks capable of approximating a broad range of stochastic processes, which provide agents with the ability to learn world models in a manner that may be both interpretable and computationally scalable. This approach integrating Bayesian structure learning and intrinsically motivated (model-based) planning enables agents to actively develop and refine their world models, which may lead to developmental learning and more robust, adaptive behavior.
comment: 4 pages including appendix, 6 including appendix and references; 2 figures
Agent Planning with World Knowledge Model NeurIPS 2024
Recent endeavors towards directly using large language models (LLMs) as agent models to execute interactive planning tasks have shown commendable results. Despite their achievements, however, they still struggle with blind trial-and-error in global planning and generating hallucinatory actions in local planning due to their poor understanding of the "real" physical world. Imitating humans' mental world knowledge model, which provides global prior knowledge before the task and maintains local dynamic knowledge during the task, in this paper we introduce a parametric World Knowledge Model (WKM) to facilitate agent planning. Concretely, we steer the agent model to self-synthesize knowledge from both expert and sampled trajectories. Then we develop WKM, providing prior task knowledge to guide the global planning and dynamic state knowledge to assist the local planning. Experimental results on three complex real-world simulated datasets with three state-of-the-art open-source LLMs, Mistral-7B, Gemma-7B, and Llama-3-8B, demonstrate that our method can achieve superior performance compared to various strong baselines. Our analysis further illustrates that our WKM can effectively alleviate the blind trial-and-error and hallucinatory action issues, providing strong support for the agent's understanding of the world. Other interesting findings include: 1) our instance-level task knowledge can generalize better to unseen tasks, 2) weak WKM can guide strong agent model planning, and 3) unified WKM training has promising potential for further development. The code is available at https://github.com/zjunlp/WKM.
comment: NeurIPS 2024
The Condorcet Dimension of Metric Spaces
A Condorcet winning set is a set of candidates such that no other candidate is preferred by at least half the voters over all members of the set. The Condorcet dimension, which is the minimum cardinality of a Condorcet winning set, is known to be at most logarithmic in the number of candidates. We study the case of elections where voters and candidates are located in a $2$-dimensional space with preferences based upon proximity voting. Our main result is that the Condorcet dimension is at most $3$, under both the Manhattan norm and the infinity norm, natural measures in electoral systems.
comment: 9 pages
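The definition in the abstract above can be checked by brute force on small instances. The sketch below assumes points in the plane, proximity-based preferences, and strict preference (a voter who is equidistant from two candidates counts as not preferring the challenger); the function names are illustrative. It tests whether a set is Condorcet winning and searches for a smallest such set, under either the Manhattan or the infinity norm.

```python
from itertools import combinations
from typing import Callable, Sequence, Set, Tuple

Point = Tuple[float, float]


def manhattan(p: Point, q: Point) -> float:
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def infinity_norm(p: Point, q: Point) -> float:
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def is_condorcet_winning_set(chosen: Set[int],
                             candidates: Sequence[Point],
                             voters: Sequence[Point],
                             dist: Callable[[Point, Point], float] = manhattan) -> bool:
    """True if no outside candidate is preferred to every member of `chosen`
    by at least half of the voters."""
    members = [candidates[i] for i in chosen]
    for j, challenger in enumerate(candidates):
        if j in chosen:
            continue
        # Voters strictly closer to the challenger than to every chosen candidate.
        supporters = sum(
            1 for v in voters
            if all(dist(v, challenger) < dist(v, m) for m in members)
        )
        if 2 * supporters >= len(voters):
            return False
    return True


def condorcet_dimension(candidates: Sequence[Point],
                        voters: Sequence[Point],
                        dist: Callable[[Point, Point], float] = manhattan):
    """Smallest size of a Condorcet winning set, with one witness set (brute force)."""
    n = len(candidates)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if is_condorcet_winning_set(set(subset), candidates, voters, dist):
                return k, subset
    raise ValueError("no candidates given")
```

As a usage example, candidates at `(0, 0)`, `(4, 0)`, `(2, 4)` with voters at `(1, 0)`, `(4, 2)`, `(1, 4)` form a preference cycle under the Manhattan norm: every single candidate is beaten by another one for two of the three voters, so no singleton is winning, while the first two candidates together form a winning set.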
AI, Pluralism, and (Social) Compensation
One strategy in response to pluralistic values in a user population is to personalize an AI system: if the AI can adapt to the specific values of each individual, then we can potentially avoid many of the challenges of pluralism. Unfortunately, this approach creates a significant ethical issue: if there is an external measure of success for the human-AI team, then the adaptive AI system may develop strategies (sometimes deceptive) to compensate for its human teammate. This phenomenon can be viewed as a form of social compensation, where the AI makes decisions based not on predefined goals but on its human partner's deficiencies in relation to the team's performance objectives. We provide a practical ethical analysis of the conditions in which such compensation may nonetheless be justifiable.
comment: 10 pages
Towards Rationality in Language and Multimodal Agents: A Survey
Rationality is the quality of being guided by reason, characterized by decision-making that aligns with evidence and logical principles. It plays a crucial role in reliable problem-solving by ensuring well-grounded and consistent solutions. While large language models (LLMs) have made significant progress in generating human-like text, they still exhibit limitations such as bounded knowledge space and inconsistent outputs. In response, recent efforts have shifted toward developing multimodal and multi-agent systems, as well as integrating modules like external tools, programming code, symbolic reasoners, utility functions, and conformal risk controls rather than relying solely on a single LLM for decision-making. This paper surveys the state-of-the-art advancements in language and multimodal agents, evaluates how they contribute to making intelligent agents more rational, and identifies open challenges and future research directions. We maintain an open repository at https://github.com/bowen-upenn/Agent_Rationality.
Systems and Control (CS)
PD-Based and SINDy Nonlinear Dynamics Identification of UAVs for MPC Design
This paper presents a comprehensive approach to nonlinear dynamics identification for UAVs using a combination of data-driven techniques and theoretical modeling. Two key methodologies are explored: Proportional-Derivative (PD) approximation and Sparse Identification of Nonlinear Dynamics (SINDy). The UAV dynamics are first modeled using the Euler-Lagrange formulation, providing a set of generalized coordinates. However, platform constraints limit the control inputs to attitude angles, and linear and angular velocities along the z-axis. To accommodate these limitations, thrust and torque inputs are approximated using a PD controller, serving as the foundation for nonlinear system identification. In parallel, SINDy, a data-driven method, is employed to derive a compact and interpretable model of the UAV dynamics from experimental data. Both identified models are then integrated into a Model Predictive Control (MPC) framework for accurate trajectory tracking, where model accuracy, informed by data-driven insights, plays a critical role in optimizing control performance. This fusion of data-driven approaches and theoretical modeling enhances the system's robustness and adaptability in real-world conditions, offering a detailed analysis of the UAV's dynamic behavior.
Technical Report of 1:10 Scale Autonomous Vehicle Robot
This paper presents Auriga Robotics' autonomous vehicle, developed at Shahid Beheshti University's Robotics and Intelligent Automation Lab, as part of the team's entry for the 2024 RoboCup IranOpen competition. The vehicle is a 1:10 scale car equipped with a custom-designed chassis, a stepper motor for precision, and a range of sensors for autonomous navigation. Key hardware includes ESP32 microcontrollers that manage motor control and sensor data acquisition. The software system integrates computer vision, including YOLOv8 for sign detection and PiNet for lane detection, combined with control algorithms such as the Stanley, PID, and Pure Pursuit controllers. The vehicle's design emphasizes real-time decision-making, environmental mapping, and efficient localization, ensuring its ability to navigate complex driving scenarios.
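Of the controllers listed in the abstract above, Pure Pursuit is the simplest to state: it steers a bicycle-model vehicle along the circular arc that passes through a look-ahead point on the path. The sketch below is the textbook form of that law, not the team's implementation. The rear-axle reference point, the argument names, and the wheelbase value in the example are assumptions.

```python
import math


def pure_pursuit_steering(x: float, y: float, yaw: float,
                          target_x: float, target_y: float,
                          wheelbase: float) -> float:
    """Steering angle (rad, positive = left) toward a look-ahead point.

    (x, y, yaw) is the rear-axle pose in the world frame and
    (target_x, target_y) is the look-ahead point on the reference path.
    """
    dx, dy = target_x - x, target_y - y
    lookahead = math.hypot(dx, dy)
    if lookahead == 0.0:
        raise ValueError("look-ahead point coincides with the vehicle")
    # Angle between the vehicle heading and the line to the target.
    alpha = math.atan2(dy, dx) - yaw
    # Textbook pure pursuit: delta = atan(2 * L * sin(alpha) / l_d).
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)
```

A target straight ahead gives zero steering, and mirrored targets to the left and right give steering angles of equal magnitude and opposite sign. In practice the look-ahead distance is usually scaled with speed, which this sketch leaves to the caller.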
A study on applications of various Energy Generation in pure Electric Vehicles: progress towards sustainability
The present work is an attempt to understand and review existing methods of energy generation in electric vehicles in the modern-day context. Previous works in the field have proposed various mechanisms of energy generation that are readily adaptable to commercial-scale use and can serve as alternative power sources for electric vehicles with nil or very low environmental impact. The paper discusses strategies such as photovoltaic cell systems, regenerative braking, fuel cells, thermoelectric generators, and micro wind turbines, with guidance for selecting among them on the basis of their suitability. The document also includes important formulas that can be used for individual modeling and design. The paper emphasizes mechanisms that can be introduced as assistive mechanisms or secondary sources so that the range and other parameters are not compromised.
Robust control of Z-source inverter operated BLDC motor using Sliding Mode Control for Electric Vehicle applications
The rapid development and expansion of the EV market, marked by the advent of the third decade of the 21st century, has improved the possibility of a sustainable automotive future. The present EV drivetrain run by a BLDC motor has become increasingly complicated, thus requiring efficient and accurate controls. The paper begins by discussing the problems in existing models. The research then focuses on increasing the robustness of the system towards disturbances and uncertainties by using Sliding Mode Control (SMC) to control the Z-source inverter (ZSI), which has been chosen as the main power converter topology in place of VSI or CSI. The introduction of SMC has improved the performance of the drivetrain when applied with vehicle dynamics over a drive cycle.
Improving the Accuracy of DC Optimal Power Flow Formulations via Parameter Optimization
DC Optimal Power Flow (DC-OPF) problems optimize the generators' active power setpoints while satisfying constraints based on the DC power flow linearization. The computational tractability advantages of DC-OPF problems come at the expense of inaccuracies relative to AC Optimal Power Flow (AC-OPF) problems which accurately model the nonlinear steady-state behavior of power grids. This paper proposes an algorithm that significantly improves the accuracy of the generators' active power setpoints from DC-OPF problems with respect to the corresponding AC-OPF problems over a specified range of operating conditions. Using sensitivity information in a machine learning-inspired methodology, this algorithm tunes coefficient and bias parameters in the DC power flow approximation to improve the accuracy of the resulting DC-OPF solutions. Employing the Truncated Newton Conjugate-Gradient (TNC) method -- a Quasi-Newton optimization technique -- this parameter tuning occurs during an offline training phase, with the resulting parameters then used in online computations. Numerical results underscore the algorithm's efficacy with accuracy improvements in squared two-norm and $\infty$-norm losses of up to $90\%$ and $79\%$, respectively, relative to traditional DC-OPF formulations.
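For context, the DC power flow linearization that the abstract builds on can be sketched as a single linear solve. This is the standard, untuned approximation on a hypothetical 3-bus network; the coefficient and bias parameters that the paper optimizes are not modeled here, and the function name and data are assumptions.

```python
import numpy as np

def dc_power_flow(lines, injections, slack=0):
    """Solve B * theta = P for bus angles, then compute line flows.

    lines: list of (from_bus, to_bus, susceptance) in per unit
    injections: net active power injection at each bus (per unit)
    """
    n = len(injections)
    B = np.zeros((n, n))
    for i, j, b in lines:                    # assemble the susceptance matrix
        B[i, i] += b
        B[j, j] += b
        B[i, j] -= b
        B[j, i] -= b
    keep = [k for k in range(n) if k != slack]
    theta = np.zeros(n)                      # slack bus angle fixed at zero
    p = np.asarray(injections, dtype=float)
    theta[keep] = np.linalg.solve(B[np.ix_(keep, keep)], p[keep])
    flows = [b * (theta[i] - theta[j]) for i, j, b in lines]
    return theta, flows

# Hypothetical 3-bus triangle: 1 p.u. generated at bus 0, consumed at bus 2
lines = [(0, 1, 10.0), (1, 2, 10.0), (0, 2, 10.0)]
theta, flows = dc_power_flow(lines, [1.0, 0.0, -1.0])
```

In this example two thirds of the power takes the direct line from bus 0 to bus 2 and one third takes the two-line path through bus 1.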
Safety Filtering While Training: Improving the Performance and Sample Efficiency of Reinforcement Learning Agents
Reinforcement learning (RL) controllers are flexible and performant but rarely guarantee safety. Safety filters impart hard safety guarantees to RL controllers while maintaining flexibility. However, safety filters can cause undesired behaviours due to the separation between the controller and the safety filter, often degrading performance and robustness. In this paper, we propose several modifications that incorporate the safety filter into the training of RL controllers rather than solely applying it during evaluation. The modifications allow the RL controller to learn to account for the safety filter, improving performance. Additionally, our modifications significantly improve sample efficiency and eliminate training-time constraint violations. We verified the proposed modifications in simulated and real experiments with a Crazyflie 2.0 drone. In experiments, we show that the proposed training approaches require significantly fewer environment interactions and improve performance by up to 20% compared to standard RL training.
comment: 8 pages, 9 figures. Code is publicly available at https://github.com/Federico-PizarroBejarano/safe-control-gym/tree/training_rl_paper
A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction
The development of autonomous driving has boosted the research on autonomous racing. However, existing local trajectory planning methods have difficulty planning trajectories with optimal velocity profiles at racetracks with sharp corners, thus weakening the performance of autonomous racing. To address this problem, we propose a local trajectory planning method that integrates Velocity Prediction based on Model Predictive Contour Control (VPMPCC). The optimal parameters of VPMPCC are learned through Bayesian Optimization (BO) based on a proposed novel Objective Function adapted to Racing (OFR). Specifically, VPMPCC achieves velocity prediction by encoding the racetrack as a reference velocity profile and incorporating it into the optimization problem. This method optimizes the velocity profile of local trajectories, especially at corners with significant curvature. The proposed OFR balances racing performance with vehicle safety, ensuring safe and efficient BO training. In the simulation, the number of training iterations for OFR-based BO is reduced by 42.86% compared to the state-of-the-art method. The optimal simulation-trained parameters are then applied to a real-world F1TENTH vehicle without retraining. During prolonged racing on a custom-built racetrack featuring significant sharp corners, the mean velocity of VPMPCC reaches 93.18% of the vehicle's handling limits. The released code is available at https://github.com/zhouhengli/VPMPCC.
Attitude Estimation via Matrix Fisher Distributions on SO(3) Using Non-Unit Vector Measurements
This note presents a novel Bayesian attitude estimator with the matrix Fisher distribution on the special orthogonal group, which can smoothly accommodate both unit and non-unit vector measurements. The posterior attitude distribution is proven to be a matrix Fisher distribution with the assumption that non-unit vector measurement errors follow the isotropic Gaussian distributions and unit vector measurements follow the von-Mises Fisher distributions. Next, a global unscented transformation is proposed to approximate the full likelihood distribution with a matrix Fisher distribution for more generic cases of vector measurement errors following the non-isotropic Gaussian distributions. Following these, a Bayesian attitude estimator with the matrix Fisher distribution is constructed. Numerical examples are then presented. The proposed estimator exhibits advantageous performance compared with the previous attitude estimator with matrix Fisher distributions and the classic multiplicative extended Kalman filter in the case of non-unit vector measurements.
comment: 10 pages, 4 figures
Demo: Testing AI-driven MAC Learning in Autonomic Networks
6G networks will be highly dynamic, re-configurable, and resilient. To enable and support such features, employing AI has been suggested. Integrating AI in networks will likely require distributed AI deployments with resilient connectivity, e.g., for communication between RL agents and environment. Such approaches need to be validated in realistic network environments. In this demo, we use ContainerNet to emulate AI-capable and autonomic networks that employ the routing protocol KIRA to provide resilient connectivity and service discovery. As an example AI application, we train and infer deep RL agents learning medium access control (MAC) policies for a wireless network environment in the emulated network.
comment: Accepted for presentation in the Demo Session at the IEEE International Conference on Network Protocols (ICNP), 2024
Optimizing Version Innovation Age for Monitoring Markovian Source in Energy-Harvesting Systems
We study the real-time remote tracking of a two-state Markov process by an energy harvesting source. The source decides whether to transmit over an unreliable channel based on the state. We formulate this scenario as a Markov decision process (MDP) to determine the optimal transmission policy that minimizes the average Version Innovation Age (VIA) as a performance metric. We demonstrate that the optimal transmission policy is threshold-based, determined by the battery level, source state, and VIA value. We numerically verify the analytical structure of the optimal policy and compare the performance of our proposed policy against two baseline policies across various system parameters, establishing the superior performance of our approach.
Survey on Neighbor Discovery and Beam Alignment in mmWave-Enabled UAV Swarm Networks
Millimeter wave (mmWave)-enabled unmanned aerial vehicle (UAV) swarm networks (UAVSNs) can utilize a large spectrum of resources to provide low latency and high data transmission rates. Additionally, owing to the short wavelength, UAVs equipped with large antenna arrays can form secure, narrow directive beams to establish communication with less interference. However, due to the high UAV mobility, limited beam coverage, beam misalignment, and high path loss, it is very challenging to adopt mmWave communication in UAVSNs. In this article, we present a comprehensive survey on neighbor discovery and beam alignment techniques for directional communication in mmWave-enabled UAVSNs. The existing techniques are reviewed and compared with each other. We also discuss key open issues and challenges along with potential research directions.
Quantification of Non-stationary Power Quality Events: A New Index Based on $\ell_p$-norm of Energy
The present study proposes a new index to quantify the severity of non-stationary power quality (PQ) disturbance events. In particular, the severity of PQ events is estimated from their energy distribution in temporal-frequency space. The index essentially measures the $\ell_p$-norm between the energy distributions of an event and the nominal voltage signal. The efficacy of the new index is demonstrated considering a wide class of major non-stationary PQ events, including sag, swell, interruptions, oscillatory transients, and simultaneous events. The results of this investigation, with simulated, real and experimental data, convincingly demonstrate that the proposed index is generic, monotonic, easy to interpret, and can accurately quantify the severity of non-stationary events.
comment: 15 pages
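The idea of the index described above, an $\ell_p$-norm between the time-frequency energy distributions of an event and of the nominal voltage, can be sketched as follows. The abstract does not state the transform or normalization, so this sketch assumes a plain framewise FFT energy map; the frame length, sampling rate, and sag signals are hypothetical test data.

```python
import numpy as np

def tf_energy(signal, frame=64):
    """Energy per (time frame, frequency bin) from non-overlapping FFT frames."""
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def lp_index(event, nominal, p=2, frame=64):
    """l_p norm of the difference between event and nominal energy maps."""
    diff = tf_energy(event, frame) - tf_energy(nominal, frame)
    return np.sum(np.abs(diff) ** p) ** (1.0 / p)

# Hypothetical 50 Hz signal sampled at 3200 Hz, so one cycle spans one 64-sample frame
fs, f0 = 3200, 50
t = np.arange(640) / fs
nominal = np.sin(2 * np.pi * f0 * t)
sag = nominal.copy()
sag[192:384] *= 0.5          # sag to 50 % of nominal for three cycles
deep_sag = nominal.copy()
deep_sag[192:384] *= 0.2     # deeper sag to 20 % of nominal

idx_none = lp_index(nominal, nominal)
idx_sag = lp_index(sag, nominal)
idx_deep = lp_index(deep_sag, nominal)
```

In this toy case the value is zero for an undisturbed signal and grows with sag depth, which mirrors the monotonicity property claimed in the abstract.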
Hessian-Informed Flow Matching
Modeling complex systems that evolve toward equilibrium distributions is important in various physical applications, including molecular dynamics and robotic control. These systems often follow the stochastic gradient descent of an underlying energy function, converging to stationary distributions around energy minima. The local covariance of these distributions is shaped by the energy landscape's curvature, often resulting in anisotropic characteristics. While flow-based generative models have gained traction in generating samples from equilibrium distributions in such applications, they predominately employ isotropic conditional probability paths, limiting their ability to capture such covariance structures. In this paper, we introduce Hessian-Informed Flow Matching (HI-FM), a novel approach that integrates the Hessian of an energy function into conditional flows within the flow matching framework. This integration allows HI-FM to account for local curvature and anisotropic covariance structures. Our approach leverages the linearization theorem from dynamical systems and incorporates additional considerations such as time transformations and equivariance. Empirical evaluations on the MNIST and Lennard-Jones particles datasets demonstrate that HI-FM improves the likelihood of test samples.
comment: In submission
pycvxset: A Python package for convex set manipulation
This paper introduces pycvxset, a new Python package to manipulate and visualize convex sets. We support polytopes and ellipsoids, and provide user-friendly methods to perform a variety of set operations. For polytopes, pycvxset supports the standard halfspace/vertex representation as well as the constrained zonotope representation. The main advantage of constrained zonotope representations over standard halfspace/vertex representations is that constrained zonotopes admit closed-form expressions for several set operations. pycvxset uses CVXPY to solve various convex programs arising in set operations, and uses pycddlib to perform vertex-halfspace enumeration. We demonstrate the use of pycvxset in analyzing and controlling dynamical systems in Python. pycvxset is available at https://github.com/merlresearch/pycvxset under the AGPL-3.0-or-later license, along with documentation and examples.
comment: 8 pages, 10 figures
FBC-Enhanced ε-Effective Capacity Optimization for NOMA
The advent of massive ultra-reliable and low-latency communications (mURLLC) has introduced a critical class of time- and reliability-sensitive services within next-generation wireless networks. This shift has attracted significant research attention, driven by the need to meet stringent quality-of-service (QoS) requirements. In this context, non-orthogonal multiple access (NOMA) systems have emerged as a promising solution to enhance mURLLC performance by providing substantial enhancements in both spectral efficiency and massive connectivity, particularly through the development of finite blocklength coding (FBC) techniques. Nevertheless, owing to the dynamic nature of wireless network environments and the complex architecture of FBC-enhanced NOMA systems, the research on the efficient design of optimizing the system performance for maximizing system capacity while guaranteeing the tail distributions in terms of new statistical QoS constraints for delay and error-rate is still in its infancy. In an effort to address these challenges, we put forth the formulation and solution of ε-effective capacity problems tailored for uplink FBC-enhanced NOMA systems, specifically catering to ensure statistical delay and error-rate bounded QoS requirements. In particular, we establish uplink two-user FBC-enhanced NOMA system models by applying the hybrid successive interference cancellation (SIC). We also develop the concept of the ε-effective capacity and propose the optimal power allocation policies to maximize the ε-effective capacity and ε-effective energy efficiency while upper-bounding both delay and error-rate. We conduct a set of simulations to validate and evaluate our developed optimization schemes over FBC-enhanced NOMA systems.
Communication-Control Codesign for Large-Scale Wireless Networked Control Systems
Wireless Networked Control Systems (WNCSs) are essential to Industry 4.0, enabling flexible control in applications, such as drone swarms and autonomous robots. The interdependence between communication and control requires integrated design, but traditional methods treat them separately, leading to inefficiencies. Current codesign approaches often rely on simplified models, focusing on single-loop or independent multi-loop systems. However, large-scale WNCSs face unique challenges, including coupled control loops, time-correlated wireless channels, trade-offs between sensing and control transmissions, and significant computational complexity. To address these challenges, we propose a practical WNCS model that captures correlated dynamics among multiple control loops with spatially distributed sensors and actuators sharing limited wireless resources over multi-state Markov block-fading channels. We formulate the codesign problem as a sequential decision-making task that jointly optimizes scheduling and control inputs across estimation, control, and communication domains. To solve this problem, we develop a Deep Reinforcement Learning (DRL) algorithm that efficiently handles the hybrid action space, captures communication-control correlations, and ensures robust training despite sparse cross-domain variables and floating control inputs. Extensive simulations show that the proposed DRL approach outperforms benchmarks and solves the large-scale WNCS codesign problem, providing a scalable solution for industrial automation.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Strategic and Fair Aggregator Interactions in Energy Markets: Multi-agent Dynamics and Quasiconcave Games
The introduction of aggregator structures has proven effective in bringing fairness to energy resource allocation by negotiating for more resources and economic surplus on behalf of users. This paper extends the fair energy resource allocation problem to a multi-agent setting, focusing on interactions among multiple aggregators in an electricity market. We prove that the strategic optimization problems faced by the aggregators form a quasiconcave game, ensuring the existence of a Nash equilibrium. This resolves complexities related to market price dependencies on total purchases and balancing fairness and efficiency in energy allocation. In addition, we design simulations to characterize the equilibrium points of the induced game, demonstrating how aggregators stabilize market outcomes, ensure fair resource distribution, and optimize user surplus. Our findings offer a robust framework for understanding strategic interactions among aggregators, contributing to more efficient and equitable energy markets.
Multi-Objective-Optimization Multi-AUV Assisted Data Collection Framework for IoUT Based on Offline Reinforcement Learning
The Internet of Underwater Things (IoUT) offers significant potential for ocean exploration but encounters challenges due to dynamic underwater environments and severe signal attenuation. Current methods relying on Autonomous Underwater Vehicles (AUVs) based on online reinforcement learning (RL) lead to high computational costs and low data utilization. To address these issues and the constraints of turbulent ocean environments, we propose a multi-AUV assisted data collection framework for IoUT based on multi-agent offline RL. This framework maximizes data rate and the value of information (VoI), minimizes energy consumption, and ensures collision avoidance by utilizing environmental and equipment status data. We introduce a semi-communication decentralized training with decentralized execution (SC-DTDE) paradigm and a multi-agent independent conservative Q-learning algorithm (MAICQL) to effectively tackle the problem. Extensive simulations demonstrate the high applicability, robustness, and data collection efficiency of the proposed framework.
Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples ICML 2024
A driving force behind the diverse applicability of modern machine learning is the ability to extract meaningful features across many sources. However, many practical domains involve data that are non-identically distributed across sources, and statistically dependent within its source, violating vital assumptions in existing theoretical studies. Toward addressing these issues, we establish statistical guarantees for learning general $\textit{nonlinear}$ representations from multiple data sources that admit different input distributions and possibly dependent data. Specifically, we study the sample-complexity of learning $T+1$ functions $f_\star^{(t)} \circ g_\star$ from a function class $\mathcal F \times \mathcal G$, where $f_\star^{(t)}$ are task specific linear functions and $g_\star$ is a shared nonlinear representation. A representation $\hat g$ is estimated using $N$ samples from each of $T$ source tasks, and a fine-tuning function $\hat f^{(0)}$ is fit using $N'$ samples from a target task passed through $\hat g$. We show that when $N \gtrsim C_{\mathrm{dep}} (\mathrm{dim}(\mathcal F) + \mathrm{C}(\mathcal G)/T)$, the excess risk of $\hat f^{(0)} \circ \hat g$ on the target task decays as $\nu_{\mathrm{div}} \big(\frac{\mathrm{dim}(\mathcal F)}{N'} + \frac{\mathrm{C}(\mathcal G)}{N T} \big)$, where $C_{\mathrm{dep}}$ denotes the effect of data dependency, $\nu_{\mathrm{div}}$ denotes an (estimatable) measure of $\textit{task-diversity}$ between the source and target tasks, and $\mathrm C(\mathcal G)$ denotes the complexity of the representation class $\mathcal G$. In particular, our analysis reveals: as the number of tasks $T$ increases, both the sample requirement and risk bound converge to that of $r$-dimensional regression as if $g_\star$ had been given, and the effect of dependency only enters the sample requirement, leaving the risk bound matching the iid setting.
comment: Appeared at ICML 2024
EFILN: The Electric Field Inversion-Localization Network for High-Precision Underwater Positioning
Accurate underwater target localization is essential for underwater exploration. To improve accuracy and efficiency in complex underwater environments, we propose the Electric Field Inversion-Localization Network (EFILN), a deep feedforward neural network that reconstructs position coordinates from underwater electric field signals. By assessing whether the neural network's input-output values satisfy the Coulomb law, the error between the network's inversion solution and the equation's exact solution can be determined. The Adam optimizer was employed first, followed by the L-BFGS optimizer, to progressively improve the output precision of EFILN. A series of noise experiments demonstrated the robustness and practical utility of the proposed method, while small sample data experiments validated its strong small-sample learning (SSL) capabilities. To accelerate relevant research, we have made the codes available as open-source.
Reinforcement Learning Based Bidding Framework with High-dimensional Bids in Power Markets
Over the past decade, bidding in power markets has attracted widespread attention. Reinforcement Learning (RL) has been widely used for power market bidding as a powerful AI tool to make decisions under real-world uncertainties. However, current RL methods mostly employ low-dimensional bids, which significantly diverge from the N price-power pairs commonly used in current power markets. The N-pair bidding format is denoted as High Dimensional Bids (HDBs), which have not been fully integrated into existing RL-based bidding methods. The loss of flexibility in current RL bidding methods could greatly limit bidding profits and make it difficult to tackle the rising uncertainties brought by renewable energy generation. In this paper, we propose a framework to fully utilize HDBs for RL-based bidding methods. First, we employ a special type of neural network called Neural Network Supply Functions (NNSFs) to generate HDBs in the form of N price-power pairs. Second, we embed the NNSF into a Markov Decision Process (MDP) to make it compatible with most existing RL methods. Finally, experiments on Energy Storage Systems (ESSs) in the PJM Real-Time (RT) power market show that the proposed bidding method with HDBs can significantly improve bidding flexibility, thereby improving the profit of the state-of-the-art RL bidding methods.
A Lyapunov-Based Switching Scheme for Selecting the Stable Closed-Loop Fixed Attitude-Error Quaternion During Flight
We present a switching scheme, which uses both the attitude-error quaternion (AEQ) and the angular-velocity error, for controlling the rotational degrees of freedom of an uncrewed aerial vehicle (UAV) during flight. In this approach, the proposed controller continually selects the stable closed-loop (CL) equilibrium AEQ corresponding to the smallest cost between those computed with two energy-based Lyapunov functions. To analyze and enforce the stability of the CL switching dynamics, we use basic nonlinear theory. This research problem is relevant because the selection of the stable CL equilibrium AEQ directly determines the power and energy requirements of the controlled UAV during flight. To test and demonstrate the implementation, suitability, functionality, and performance of the proposed approach, we present experimental results obtained using a 31-gram quadrotor, which was controlled to execute high-speed yaw maneuvers in flight. These flight tests show that the proposed switching controller can respectively reduce the control effort and rotational power by as much as 49.75 % and 28.14 %, on average, compared to those corresponding to an often-used benchmark controller.
comment: 8 pages, 5 figures, 2024 7th Iberian Robotics Conference (ROBOT)
System-Level Analysis of Module Uncertainty Quantification in the Autonomy Pipeline
We present a novel perspective on the design, use, and role of uncertainty measures for learned modules in an autonomous system. While in the current literature uncertainty measures are produced for standalone modules without considering the broader system context, in our work we explicitly consider the role of decision-making under uncertainty in illuminating how "good" an uncertainty measure is. Our insights are centered around quantifying the ways in which being uncertainty-aware makes a system more robust. Firstly, we use level set generation tools to produce a measure for system robustness and use this measure to compare system designs, thus placing uncertainty quantification in the context of system performance and evaluation metrics. Secondly, we use the concept of specification generation from systems theory to produce a formulation under which a designer can simultaneously constrain the properties of an uncertainty measure and analyze the efficacy of the decision-making-under-uncertainty algorithm used by the system. We apply our analyses to two real-world and complex autonomous systems, one for autonomous driving and another for aircraft runway incursion detection, helping to form a toolbox for an uncertainty-aware system designer to produce more effective and robust systems.
Parallel Batch Scheduling With Incompatible Job Families Via Constraint Programming
This paper addresses the incompatible case of parallel batch scheduling, where compatible jobs belong to the same family, and jobs from different families cannot be processed together in the same batch. Existing constraint programming (CP) models for this problem fail to synchronize the processing of the jobs within their batch, resulting in batch interruptions. In the context of the diffusion area in the semiconductor manufacturing process, these interrupted solutions would disrupt the thermal stability required for a uniform dopant distribution on the wafers. This paper proposes three new CP models that directly tackle these interruptions in the formulation, including two adaptions of existing models and a novel Redundant Synchronized (RS) model. These existing and novel models are compared on standard test cases, demonstrating the superiority of the RS model in finding optimal or near-optimal solutions quickly.
comment: 11 pages, 6 figures
Physical Informed-Inspired Deep Reinforcement Learning Based Bi-Level Programming for Microgrid Scheduling
To coordinate the interests of the operator and users in a microgrid under complex and changeable operating conditions, this paper proposes a microgrid scheduling model considering the thermal flexibility of thermostatically controlled loads and demand response by leveraging physical informed-inspired deep reinforcement learning (DRL) based bi-level programming. To overcome the non-convex limitations of Karush-Kuhn-Tucker (KKT)-based methods, a novel optimization solution method based on DRL theory is proposed to handle the bi-level programming through alternate iterations between levels. Specifically, an improved A3C algorithm, called AutoML-PER-A3C, is designed to solve the upper-level problem by combining a DRL algorithm named asynchronous advantage actor-critic (A3C) with an automated machine learning-prioritized experience replay (AutoML-PER) strategy that improves the generalization performance of A3C, while the DOCPLEX optimizer is adopted to address the lower-level problem. In this solution process, AutoML is used to automatically optimize hyperparameters and PER improves learning efficiency and quality by extracting the most valuable samples. The test results demonstrate that the presented approach manages to reconcile the interests of multiple stakeholders in the microgrid (MG) by fully exploiting various flexibility resources. Furthermore, in terms of economic viability and computational efficiency, the proposal vastly exceeds other advanced reinforcement learning methods.
comment: Accepted by IEEE Transactions on Industry Applications (Paper Id: 2023-KDSEM-1058)
Marine spatial planning techniques with a case study on wave-powered offshore aquaculture farms
As emerging marine technologies lead to the development of new infrastructure across the ocean, they enter an environment that existing ecosystems and industries already rely on. Although necessary to provide sustainable sources of energy and food, careful planning will be important to make informed decisions and avoid conflicts. This paper examines several techniques used for marine spatial planning, an approach for analyzing and planning the use of marine resources. Using open source software including QGIS and Python, the potential for developing wave-powered offshore aquaculture farms using the RM3 wave energy converter along the Northeast coast of the United States is assessed and several feasible sites are identified. The optimal site, located at 43.7°N, 68.9°W along the coast of Maine, has a total cost for a 5-pen farm of $56.8M, annual fish yield of 676 tonnes, and a levelized cost of fish of $9.23 per kilogram. Overall trends indicate that the cost greatly decreases with distance to shore due to the greater availability of wave energy and that conflicts and environmental constraints significantly limit the number of feasible sites in this region.
Mindalogue: LLM-Powered Nonlinear Interaction for Effective Learning and Task Exploration
Current generative AI models like ChatGPT, Claude, and Gemini are widely used for knowledge dissemination, task decomposition, and creative thinking. However, their linear interaction methods often force users to repeatedly compare and copy contextual information when handling complex tasks, increasing cognitive load and operational costs. Moreover, the ambiguity in model responses requires users to refine and simplify the information further. To address these issues, we developed "Mindalogue", a system using a non-linear interaction model based on "nodes + canvas" to enhance user efficiency and freedom while generating structured responses. A formative study with 11 users informed the design of Mindalogue, which was then evaluated through a study with 16 participants. The results showed that Mindalogue significantly reduced task steps and improved users' comprehension of complex information. This study highlights the potential of non-linear interaction in improving AI tool efficiency and user experience in the HCI field.
comment: 17 pages, 9 figures
Prompt a Robot to Walk with Large Language Models
Large language models (LLMs) pre-trained on vast internet-scale data have showcased remarkable capabilities across diverse domains. Recently, there has been escalating interest in deploying LLMs for robotics, aiming to harness the power of foundation models in real-world settings. However, this approach faces significant challenges, particularly in grounding these models in the physical world and in generating dynamic robot motions. To address these issues, we introduce a novel paradigm in which we use few-shot prompts collected from the physical environment, enabling the LLM to autoregressively generate low-level control commands for robots without task-specific fine-tuning. Experiments across various robots and environments validate that our method can effectively prompt a robot to walk. We thus illustrate how LLMs can proficiently function as low-level feedback controllers for dynamic motion control even in high-dimensional robotic systems. The project website and source code can be found at: https://prompt2walk.github.io/ .
comment: Conference on Decision and Control (CDC), 2024
Mobile Edge Generation-Enabled Digital Twin: Architecture Design and Research Opportunities
A novel paradigm of mobile edge generation (MEG)-enabled digital twin (DT) is proposed, which enables distributed on-device generation at mobile edge networks for real-time DT applications. First, an MEG-DT architecture is put forward to decentralize generative artificial intelligence (GAI) models onto edge servers (ESs) and user equipments (UEs), which has the advantages of low latency, privacy preservation, and individual-level customization. Then, various single-user and multi-user generation mechanisms are conceived for MEG-DT, which strike trade-offs between generation latency, hardware costs, and device coordination. Furthermore, to perform efficient distributed generation, two operating protocols are explored for transmitting interpretable and latent features between ESs and UEs, namely sketch-based generation and seed-based generation, respectively. Based on the proposed protocols, the convergence between MEG and DT is highlighted. Considering the seed-based image generation scenario, numerical case studies are provided to reveal the superiority of MEG-DT over centralized generation. Finally, promising applications and research opportunities are identified. Code is available at https://github.com/xiaoxiaxusummer/MEG_DT
comment: Accepted by IEEE Communications Magazine
The Reachability Problem for Neural-Network Control Systems
A control system consists of a plant component and a controller which periodically computes a control input for the plant. We consider systems where the controller is implemented by a feedforward neural network with ReLU activations. The reachability problem asks, given a set of initial states, whether a set of target states can be reached. We show that this problem is undecidable even for trivial plants and fixed-depth neural networks with three inputs and outputs. We also show that the problem becomes semi-decidable when the plant as well as the input and target sets are given by automata over infinite words.
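The closed loop described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical one-dimensional plant x' = x + u and a hand-picked two-neuron ReLU controller that computes u = -0.5x; none of these values come from the paper. Simulating forward from a single initial state for a bounded number of steps is only a partial check and is not a decision procedure for the general problem, which the abstract states is undecidable.

```python
def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def controller(state, w1, b1, w2, b2):
    """One-hidden-layer feedforward ReLU network mapping a state to a control input."""
    hidden = relu([sum(w * s for w, s in zip(row, state)) + b
                   for row, b in zip(w1, b1)])
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def steps_to_target(x0, plant, ctrl, in_target, max_steps):
    """Simulate the closed loop from x0; return the first step index at which
    the state is in the target set, or None if that does not happen within
    max_steps controller updates."""
    x = list(x0)
    for step in range(max_steps + 1):
        if in_target(x):
            return step
        x = plant(x, ctrl(x))
    return None


# Hypothetical example values (assumptions, not taken from the paper).
W1, B1 = [[1.0], [-1.0]], [0.0, 0.0]   # hidden layer: relu(x), relu(-x)
W2, B2 = [[-0.5, 0.5]], [0.0]          # output: u = -0.5*relu(x) + 0.5*relu(-x) = -0.5*x


def plant(x, u):
    """Trivial discrete-time integrator plant: x' = x + u."""
    return [x[0] + u[0]]


def ctrl(x):
    return controller(x, W1, B1, W2, B2)


def in_target(x):
    """Target set: |x| <= 1."""
    return abs(x[0]) <= 1.0


if __name__ == "__main__":
    print(steps_to_target([8.0], plant, ctrl, in_target, max_steps=10))
```

A result of None from this sketch says only that the target was not reached within the step budget from that one initial state; it says nothing about unreachability in general.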
On Adaptive Frequency Sampling for Data-driven MOR Applied to Antenna Responses
Frequency domain sweeps of array antennas are well-known to be time-intensive, and different surrogate models have been used to improve the performance. Data-driven model order reduction algorithms, such as the Loewner framework and vector fitting, can be integrated with these adaptive error estimates, in an iterative algorithm, to reduce the number of full-wave simulations required to accurately capture the requested frequency behavior of multiport array antennas. In this work, we propose two novel adaptive methods exploiting a block matrix function which is a key part of the Loewner framework generating system approach. The first algorithm leverages an inherent matrix parameter freedom in the block matrix function to identify frequency points with large errors, whereas the second utilizes the condition number of the block matrix function. Both methods effectively provide frequency domain error estimates, which are essential for improved performance. Numerical experiments on multiport array antenna S-parameters demonstrate the effectiveness of our proposed algorithms within the Loewner framework.
comment: 10 pages, 12 figures
MERIT: Multimodal Wearable Vital Sign Waveform Monitoring
Cardiovascular disease (CVD) is the leading cause of death and premature mortality worldwide, with occupational environments significantly influencing CVD risk, underscoring the need for effective cardiac monitoring and early warning systems. Existing methods of monitoring vital signs require subjects to remain stationary, which is impractical for daily monitoring as individuals are often in motion. To address this limitation, we propose MERIT, a multimodality-based wearable system designed for precise ECG waveform monitoring without movement restrictions. Daily activities, involving frequent arm movements, can significantly affect sensor data and complicate the reconstruction of accurate ECG signals. To mitigate motion impact and enhance ECG signal reconstruction, we introduce a deep independent component analysis (Deep-ICA) module and a multimodal fusion module. We conducted experiments with 15 subjects. Our results, compared with commercial wearable devices and existing methods, demonstrate that MERIT accurately reconstructs ECG waveforms during various office activities, offering a reliable solution for fine-grained cardiac monitoring in dynamic environments.
comment: 8 pages, 10 figures
Environmental management and restoration under unified risk and uncertainty using robustified dynamic Orlicz risk
Environmental management and restoration should be designed such that the risk and uncertainty owing to nonlinear stochastic systems can be successfully addressed. We apply the robustified dynamic Orlicz risk to the modeling and analysis of environmental management and restoration to consider both the risk and uncertainty within a unified theory. We focus on the control of a jump-driven hybrid stochastic system that represents macrophyte dynamics. The dynamic programming equation based on the Orlicz risk is first obtained heuristically, from which the associated Hamilton-Jacobi-Bellman (HJB) equation is derived. In the proposed Orlicz risk, the risk aversion of the decision-maker is represented by a power coefficient that resembles a certainty equivalence, whereas the uncertainty aversion is represented by the Kullback-Leibler divergence, in which the risk and uncertainty are handled consistently and separately. The HJB equation includes a new state-dependent discount factor that arises from the uncertainty aversion, which leads to a unique, nonlinear, and nonlocal term. The link between the proposed and classical stochastic control problems is discussed with a focus on control-dependent discount rates. We propose a finite difference method for computing the HJB equation. Finally, the proposed model is applied to an optimal harvesting problem for macrophytes in a brackish lake that contains both growing and drifting populations.
Deep Learning based Performance Testing for Analog Integrated Circuits
In this paper, we propose a deep learning based performance testing framework to minimize the number of required test modules while guaranteeing the accuracy requirement, where a test module corresponds to a combination of one circuit and one stimulus. First, we apply a deep neural network (DNN) to establish the mapping from the response of the circuit under test (CUT) in each module to all specifications to be tested. Then, the required test modules are selected by solving a 0-1 integer programming problem. Finally, the predictions from the selected test modules are combined by a DNN to form the specification estimations. The simulation results validate the proposed approach in terms of testing accuracy and cost.
A game-theoretic, market-based approach to extract flexibility from distributed energy resources
We propose a market designed using game theory to optimally utilize the flexibility of distributed energy resources (DERs) like solar, batteries, electric vehicles, and flexible loads. Market agents perform multiperiod optimization to determine their feasible flexibility limits for power injections while satisfying all constraints of their DERs. This is followed by a Stackelberg game between the market operator and agents. The market operator as the leader aims to regulate the aggregate power injection around a desired value by leveraging the flexibility of their agents, and computes optimal prices for both electricity and flexibility services. The agents follow by optimally bidding their desired flexible power injections in response to these prices. We show the existence and uniqueness of a Nash equilibrium among all the agents and a Stackelberg equilibrium between all agents and the operator. In addition to deriving analytical closed-form solutions, we provide simulation results for a small example system to illustrate our approach.
comment: Accepted to the 5th IFAC Workshop on Cyber-Physical Human Systems
Systems and Control (EESS)
PD-Based and SINDy Nonlinear Dynamics Identification of UAVs for MPC Design
This paper presents a comprehensive approach to nonlinear dynamics identification for UAVs using a combination of data-driven techniques and theoretical modeling. Two key methodologies are explored: Proportional-Derivative (PD) approximation and Sparse Identification of Nonlinear Dynamics (SINDy). The UAV dynamics are first modeled using the Euler-Lagrange formulation, providing a set of generalized coordinates. However, platform constraints limit the control inputs to attitude angles, and linear and angular velocities along the z-axis. To accommodate these limitations, thrust and torque inputs are approximated using a PD controller, serving as the foundation for nonlinear system identification. In parallel, SINDy, a data-driven method, is employed to derive a compact and interpretable model of the UAV dynamics from experimental data. Both identified models are then integrated into a Model Predictive Control (MPC) framework for accurate trajectory tracking, where model accuracy, informed by data-driven insights, plays a critical role in optimizing control performance. This fusion of data-driven approaches and theoretical modeling enhances the system's robustness and adaptability in real-world conditions, offering a detailed analysis of the UAV's dynamic behavior.
Technical Report of 1:10 Scale Autonomous Vehicle Robot
This paper presents Auriga Robotics' autonomous vehicle, developed at Shahid Beheshti University's Robotics and Intelligent Automation Lab, as part of the team's entry for the 2024 RoboCup IranOpen competition. The vehicle is a 1:10 scale car equipped with a custom-designed chassis, a stepper motor for precision, and a range of sensors for autonomous navigation. Key hardware includes ESP32 microcontrollers that manage motor control and sensor data acquisition. The software system integrates computer vision, including YOLOv8 for sign detection and PiNet for lane detection, combined with control algorithms such as the Stanley, PID, and Pure Pursuit controllers. The vehicle's design emphasizes real-time decision-making, environmental mapping, and efficient localization, ensuring its ability to navigate complex driving scenarios.
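Of the controllers named in this abstract, Pure Pursuit has a compact closed form that a short sketch can show. The version below is the textbook bicycle-model law, delta = atan(2 L sin(alpha) / l_d), and is not the team's implementation. The function name, the argument names, and the 0.26 m wheelbase in the example are illustrative assumptions.

```python
import math


def pure_pursuit_steering(x, y, yaw, target_x, target_y, wheelbase):
    """Textbook Pure Pursuit steering angle in radians.

    (x, y, yaw) is the pose of the rear axle; (target_x, target_y) is the
    look-ahead point on the path; wheelbase is the axle-to-axle distance.
    """
    dx = target_x - x
    dy = target_y - y
    lookahead = math.hypot(dx, dy)
    if lookahead == 0.0:
        # Already at the look-ahead point: no steering correction is defined.
        return 0.0
    # Angle between the vehicle heading and the line to the look-ahead point.
    alpha = math.atan2(dy, dx) - yaw
    # delta = atan(2 * L * sin(alpha) / l_d)
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


if __name__ == "__main__":
    # Hypothetical wheelbase of 0.26 m, chosen only for illustration.
    print(pure_pursuit_steering(0.0, 0.0, 0.0, 1.0, 1.0, 0.26))
```

A positive angle steers left and a negative angle steers right under the usual counter-clockwise yaw convention. In practice the look-ahead distance is often scaled with speed, which this sketch leaves to the caller.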
A study on applications of various Energy Generation in pure Electric Vehicles: progress towards sustainability
The present work is an attempt to understand and review existing methods of energy generation in electric vehicles in the modern-day context. Previous works in the field have proposed various mechanisms of energy generation that are well suited to commercial-scale use and can serve as alternative power sources for electric vehicles with little or no environmental impact. The paper discusses strategies such as photovoltaic cell systems, regenerative braking, fuel cells, thermoelectric generators and micro wind-turbines, with guidance on selecting them according to their suitability. The document also includes important formulas that can be used for individual modeling and designing. The paper emphasises mechanisms that can be introduced as assistive or secondary sources so that the range and other parameters are not compromised.
Robust control of Z-source inverter operated BLDC motor using Sliding Mode Control for Electric Vehicle applications
The rapid development and expansion of the EV market, marked by the advent of the third decade of the 21st century, has improved the possibility of a sustainable automotive future. The present EV drivetrain run by a BLDC motor has become increasingly complicated, thus requiring efficient and accurate controls. The paper begins by discussing the problems in existing models. The research then focuses on increasing the robustness of the system towards disturbances and uncertainties by using Sliding Mode Control (SMC) to control the Z-source inverter (ZSI), which has been chosen as the main power converter topology in place of a VSI or CSI. The introduction of SMC has improved the performance of the drivetrain when applied with vehicle dynamics over a drive cycle.
Improving the Accuracy of DC Optimal Power Flow Formulations via Parameter Optimization
DC Optimal Power Flow (DC-OPF) problems optimize the generators' active power setpoints while satisfying constraints based on the DC power flow linearization. The computational tractability advantages of DC-OPF problems come at the expense of inaccuracies relative to AC Optimal Power Flow (AC-OPF) problems which accurately model the nonlinear steady-state behavior of power grids. This paper proposes an algorithm that significantly improves the accuracy of the generators' active power setpoints from DC-OPF problems with respect to the corresponding AC-OPF problems over a specified range of operating conditions. Using sensitivity information in a machine learning-inspired methodology, this algorithm tunes coefficient and bias parameters in the DC power flow approximation to improve the accuracy of the resulting DC-OPF solutions. Employing the Truncated Newton Conjugate-Gradient (TNC) method -- a Quasi-Newton optimization technique -- this parameter tuning occurs during an offline training phase, with the resulting parameters then used in online computations. Numerical results underscore the algorithm's efficacy with accuracy improvements in squared two-norm and $\infty$-norm losses of up to $90\%$ and $79\%$, respectively, relative to traditional DC-OPF formulations.
Safety Filtering While Training: Improving the Performance and Sample Efficiency of Reinforcement Learning Agents
Reinforcement learning (RL) controllers are flexible and performant but rarely guarantee safety. Safety filters impart hard safety guarantees to RL controllers while maintaining flexibility. However, safety filters can cause undesired behaviours due to the separation between the controller and the safety filter, often degrading performance and robustness. In this paper, we propose several modifications that incorporate the safety filter into the training of RL controllers rather than applying it solely during evaluation. The modifications allow the RL controller to learn to account for the safety filter, improving performance. Additionally, our modifications significantly improve sample efficiency and eliminate training-time constraint violations. We verified the proposed modifications in simulated and real experiments with a Crazyflie 2.0 drone. In experiments, we show that the proposed training approaches require significantly fewer environment interactions and improve performance by up to 20% compared to standard RL training.
comment: 8 pages, 9 figures. Code is publicly available at https://github.com/Federico-PizarroBejarano/safe-control-gym/tree/training_rl_paper
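The idea of a safety filter in the abstract above can be sketched on a toy problem. The example assumes a hypothetical one-dimensional integrator x' = x + u*dt with a box constraint on the state, for which the minimally invasive filter reduces to clipping the proposed action. This is a simplification for illustration only: the function names and dynamics are assumptions, and the paper's filter and training modifications for the Crazyflie are not reproduced here.

```python
def safety_filter(x, u_proposed, dt, x_min, x_max):
    """Return the action closest to u_proposed that keeps the next state
    x + u*dt inside [x_min, x_max] (1-D integrator assumption)."""
    u_low = (x_min - x) / dt
    u_high = (x_max - x) / dt
    return min(max(u_proposed, u_low), u_high)


def filtered_step(x, u_proposed, dt, x_min, x_max):
    """Apply the filter, advance the toy plant, and report the correction
    size so that a training loop could, for example, observe or penalize it."""
    u_safe = safety_filter(x, u_proposed, dt, x_min, x_max)
    x_next = x + u_safe * dt
    correction = abs(u_safe - u_proposed)
    return x_next, u_safe, correction


if __name__ == "__main__":
    # The proposed action 1.0 would push the state past x_max = 1.0, so it is clipped.
    print(filtered_step(0.9, 1.0, 0.5, -1.0, 1.0))
```

Running the filter inside the training loop, rather than only at evaluation, means the learner's experience contains the filtered actions and their corrections, which is the general setting the abstract describes.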
A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction
The development of autonomous driving has boosted the research on autonomous racing. However, existing local trajectory planning methods have difficulty planning trajectories with optimal velocity profiles at racetracks with sharp corners, thus weakening the performance of autonomous racing. To address this problem, we propose a local trajectory planning method that integrates Velocity Prediction based on Model Predictive Contour Control (VPMPCC). The optimal parameters of VPMPCC are learned through Bayesian Optimization (BO) based on a proposed novel Objective Function adapted to Racing (OFR). Specifically, VPMPCC achieves velocity prediction by encoding the racetrack as a reference velocity profile and incorporating it into the optimization problem. This method optimizes the velocity profile of local trajectories, especially at corners with significant curvature. The proposed OFR balances racing performance with vehicle safety, ensuring safe and efficient BO training. In the simulation, the number of training iterations for OFR-based BO is reduced by 42.86% compared to the state-of-the-art method. The optimal simulation-trained parameters are then applied to a real-world F1TENTH vehicle without retraining. During prolonged racing on a custom-built racetrack featuring significant sharp corners, the mean velocity of VPMPCC reaches 93.18% of the vehicle's handling limits. The released code is available at https://github.com/zhouhengli/VPMPCC.
Attitude Estimation via Matrix Fisher Distributions on SO(3) Using Non-Unit Vector Measurements
This note presents a novel Bayesian attitude estimator with the matrix Fisher distribution on the special orthogonal group, which can smoothly accommodate both unit and non-unit vector measurements. The posterior attitude distribution is proven to be a matrix Fisher distribution under the assumption that non-unit vector measurement errors follow isotropic Gaussian distributions and unit vector measurements follow von Mises-Fisher distributions. Next, a global unscented transformation is proposed to approximate the full likelihood distribution with a matrix Fisher distribution for more generic cases of vector measurement errors following non-isotropic Gaussian distributions. Following these, a Bayesian attitude estimator with the matrix Fisher distribution is constructed. Numerical examples are then presented. The proposed estimator exhibits advantageous performance compared with the previous attitude estimator with matrix Fisher distributions and the classic multiplicative extended Kalman filter in the case of non-unit vector measurements.
comment: 10 pages, 4 figures
Demo: Testing AI-driven MAC Learning in Autonomic Networks
6G networks will be highly dynamic, re-configurable, and resilient. To enable and support such features, employing AI has been suggested. Integrating AI in networks will likely require distributed AI deployments with resilient connectivity, e.g., for communication between RL agents and environment. Such approaches need to be validated in realistic network environments. In this demo, we use ContainerNet to emulate AI-capable and autonomic networks that employ the routing protocol KIRA to provide resilient connectivity and service discovery. As an example AI application, we train and infer deep RL agents learning medium access control (MAC) policies for a wireless network environment in the emulated network.
comment: Accepted for presentation in the Demo Session at the IEEE International Conference on Network Protocols (ICNP), 2024
Optimizing Version Innovation Age for Monitoring Markovian Source in Energy-Harvesting Systems
We study the real-time remote tracking of a two-state Markov process by an energy harvesting source. The source decides whether to transmit over an unreliable channel based on the state. We formulate this scenario as a Markov decision process (MDP) to determine the optimal transmission policy that minimizes the average Version Innovation Age (VIA) as a performance metric. We demonstrate that the optimal transmission policy is threshold-based, determined by the battery level, source state, and VIA value. We numerically verify the analytical structure of the optimal policy and compare the performance of our proposed policy against two baseline policies across various system parameters, establishing the superior performance of our approach.
Survey on Neighbor Discovery and Beam Alignment in mmWave-Enabled UAV Swarm Networks
Millimeter wave (mmWave)-enabled unmanned aerial vehicle (UAV) swarm networks (UAVSNs) can utilize a large spectrum of resources to provide low latency and high data transmission rates. Additionally, owing to the short wavelength, UAVs equipped with large antenna arrays can form secure, narrow directive beams to establish communication with less interference. However, due to the high UAV mobility, limited beam coverage, beam misalignment, and high path loss, it is very challenging to adopt mmWave communication in UAVSNs. In this article, we present a comprehensive survey on neighbor discovery and beam alignment techniques for directional communication in mmWave-enabled UAVSNs. The existing techniques are reviewed and compared with each other. We also discuss key open issues and challenges with potential research directions.
Quantification of Non-stationary Power Quality Events: A New Index Based on $\ell_p$-norm of Energy
The present study proposes a new index to quantify the severity of non-stationary power quality (PQ) disturbance events. In particular, the severity of PQ events is estimated from their energy distribution in temporal-frequency space. The index essentially measures the $\ell_p$-norm between the energy distributions of an event and the nominal voltage signal. The efficacy of the new index is demonstrated considering a wide class of major non-stationary PQ events, including sag, swell, interruptions, oscillatory transients, and simultaneous events. The results of this investigation, with simulated, real and experimental data, convincingly demonstrate that the proposed index is generic, monotonic, easy to interpret, and can accurately quantify the severity of non-stationary events.
comment: 15 pages
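The core quantity in the abstract above, an $\ell_p$-norm between two energy distributions, can be sketched in a few lines. The example assumes both distributions have already been computed on the same time-frequency grid and flattened into equal-length lists. The transform that produces them, any normalization, and the choice of p are not specified in the abstract, so those parts are assumptions, as are the function name and the sample values.

```python
def lp_distance(event_energy, nominal_energy, p=2.0):
    """l_p norm of the difference between two energy distributions sampled
    on the same grid: (sum_i |e_i - n_i|^p)^(1/p), for p >= 1."""
    if len(event_energy) != len(nominal_energy):
        raise ValueError("energy distributions must be sampled on the same grid")
    if p < 1.0:
        raise ValueError("p must be at least 1 for a valid norm")
    total = sum(abs(e - n) ** p for e, n in zip(event_energy, nominal_energy))
    return total ** (1.0 / p)


if __name__ == "__main__":
    # Hypothetical flattened energy values, chosen only for illustration.
    nominal = [1.0, 0.0, 0.0, 0.0]
    deep_sag = [0.4, 0.0, 0.0, 0.0]
    mild_sag = [0.8, 0.0, 0.0, 0.0]
    print(lp_distance(deep_sag, nominal), lp_distance(mild_sag, nominal))
```

With these illustrative values the deeper sag lies farther from the nominal distribution than the mild one, which is the kind of monotonic behaviour the abstract claims for the proposed index.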
Hessian-Informed Flow Matching
Modeling complex systems that evolve toward equilibrium distributions is important in various physical applications, including molecular dynamics and robotic control. These systems often follow the stochastic gradient descent of an underlying energy function, converging to stationary distributions around energy minima. The local covariance of these distributions is shaped by the energy landscape's curvature, often resulting in anisotropic characteristics. While flow-based generative models have gained traction in generating samples from equilibrium distributions in such applications, they predominately employ isotropic conditional probability paths, limiting their ability to capture such covariance structures. In this paper, we introduce Hessian-Informed Flow Matching (HI-FM), a novel approach that integrates the Hessian of an energy function into conditional flows within the flow matching framework. This integration allows HI-FM to account for local curvature and anisotropic covariance structures. Our approach leverages the linearization theorem from dynamical systems and incorporates additional considerations such as time transformations and equivariance. Empirical evaluations on the MNIST and Lennard-Jones particles datasets demonstrate that HI-FM improves the likelihood of test samples.
comment: In submission
pycvxset: A Python package for convex set manipulation
This paper introduces pycvxset, a new Python package to manipulate and visualize convex sets. We support polytopes and ellipsoids, and provide user-friendly methods to perform a variety of set operations. For polytopes, pycvxset supports the standard halfspace/vertex representation as well as the constrained zonotope representation. The main advantage of constrained zonotope representations over standard halfspace/vertex representations is that constrained zonotopes admit closed-form expressions for several set operations. pycvxset uses CVXPY to solve various convex programs arising in set operations, and uses pycddlib to perform vertex-halfspace enumeration. We demonstrate the use of pycvxset in analyzing and controlling dynamical systems in Python. pycvxset is available at https://github.com/merlresearch/pycvxset under the AGPL-3.0-or-later license, along with documentation and examples.
comment: 8 pages, 10 figures
FBC-Enhanced ε-Effective Capacity Optimization for NOMA
The advent of massive ultra-reliable and low-latency communications (mURLLC) has introduced a critical class of time- and reliability-sensitive services within next-generation wireless networks. This shift has attracted significant research attention, driven by the need to meet stringent quality-of-service (QoS) requirements. In this context, non-orthogonal multiple access (NOMA) systems have emerged as a promising solution to enhance mURLLC performance by providing substantial enhancements in both spectral efficiency and massive connectivity, particularly through the development of finite blocklength coding (FBC) techniques. Nevertheless, owing to the dynamic nature of wireless network environments and the complex architecture of FBC-enhanced NOMA systems, the research on the efficient design of optimizing the system performance for maximizing system capacity while guaranteeing the tail distributions in terms of new statistical QoS constraints for delay and error-rate is still in its infancy. In an effort to address these challenges, we put forth the formulation and solution of ε-effective capacity problems tailored for uplink FBC-enhanced NOMA systems, specifically catering to ensure statistical delay and error-rate bounded QoS requirements. In particular, we establish uplink two-user FBC-enhanced NOMA system models by applying the hybrid successive interference cancellation (SIC). We also develop the concept of the ε-effective capacity and propose the optimal power allocation policies to maximize the ε-effective capacity and ε-effective energy efficiency while upper-bounding both delay and error-rate. We conduct a set of simulations to validate and evaluate our developed optimization schemes over FBC-enhanced NOMA systems.
Communication-Control Codesign for Large-Scale Wireless Networked Control Systems
Wireless Networked Control Systems (WNCSs) are essential to Industry 4.0, enabling flexible control in applications, such as drone swarms and autonomous robots. The interdependence between communication and control requires integrated design, but traditional methods treat them separately, leading to inefficiencies. Current codesign approaches often rely on simplified models, focusing on single-loop or independent multi-loop systems. However, large-scale WNCSs face unique challenges, including coupled control loops, time-correlated wireless channels, trade-offs between sensing and control transmissions, and significant computational complexity. To address these challenges, we propose a practical WNCS model that captures correlated dynamics among multiple control loops with spatially distributed sensors and actuators sharing limited wireless resources over multi-state Markov block-fading channels. We formulate the codesign problem as a sequential decision-making task that jointly optimizes scheduling and control inputs across estimation, control, and communication domains. To solve this problem, we develop a Deep Reinforcement Learning (DRL) algorithm that efficiently handles the hybrid action space, captures communication-control correlations, and ensures robust training despite sparse cross-domain variables and floating control inputs. Extensive simulations show that the proposed DRL approach outperforms benchmarks and solves the large-scale WNCS codesign problem, providing a scalable solution for industrial automation.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Strategic and Fair Aggregator Interactions in Energy Markets: Multi-agent Dynamics and Quasiconcave Games
The introduction of aggregator structures has proven effective in bringing fairness to energy resource allocation by negotiating for more resources and economic surplus on behalf of users. This paper extends the fair energy resource allocation problem to a multi-agent setting, focusing on interactions among multiple aggregators in an electricity market. We prove that the strategic optimization problems faced by the aggregators form a quasiconcave game, ensuring the existence of a Nash equilibrium. This resolves complexities related to market price dependencies on total purchases and balancing fairness and efficiency in energy allocation. In addition, we design simulations to characterize the equilibrium points of the induced game, demonstrating how aggregators stabilize market outcomes, ensure fair resource distribution, and optimize user surplus. Our findings offer a robust framework for understanding strategic interactions among aggregators, contributing to more efficient and equitable energy markets.
Multi-Objective-Optimization Multi-AUV Assisted Data Collection Framework for IoUT Based on Offline Reinforcement Learning
The Internet of Underwater Things (IoUT) offers significant potential for ocean exploration but encounters challenges due to dynamic underwater environments and severe signal attenuation. Current methods relying on Autonomous Underwater Vehicles (AUVs) based on online reinforcement learning (RL) lead to high computational costs and low data utilization. To address these issues and the constraints of turbulent ocean environments, we propose a multi-AUV assisted data collection framework for IoUT based on multi-agent offline RL. This framework maximizes data rate and the value of information (VoI), minimizes energy consumption, and ensures collision avoidance by utilizing environmental and equipment status data. We introduce a semi-communication decentralized training with decentralized execution (SC-DTDE) paradigm and a multi-agent independent conservative Q-learning algorithm (MAICQL) to effectively tackle the problem. Extensive simulations demonstrate the high applicability, robustness, and data collection efficiency of the proposed framework.
Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples ICML 2024
A driving force behind the diverse applicability of modern machine learning is the ability to extract meaningful features across many sources. However, many practical domains involve data that are non-identically distributed across sources, and statistically dependent within its source, violating vital assumptions in existing theoretical studies. Toward addressing these issues, we establish statistical guarantees for learning general $\textit{nonlinear}$ representations from multiple data sources that admit different input distributions and possibly dependent data. Specifically, we study the sample-complexity of learning $T+1$ functions $f_\star^{(t)} \circ g_\star$ from a function class $\mathcal F \times \mathcal G$, where $f_\star^{(t)}$ are task specific linear functions and $g_\star$ is a shared nonlinear representation. A representation $\hat g$ is estimated using $N$ samples from each of $T$ source tasks, and a fine-tuning function $\hat f^{(0)}$ is fit using $N'$ samples from a target task passed through $\hat g$. We show that when $N \gtrsim C_{\mathrm{dep}} (\mathrm{dim}(\mathcal F) + \mathrm{C}(\mathcal G)/T)$, the excess risk of $\hat f^{(0)} \circ \hat g$ on the target task decays as $\nu_{\mathrm{div}} \big(\frac{\mathrm{dim}(\mathcal F)}{N'} + \frac{\mathrm{C}(\mathcal G)}{N T} \big)$, where $C_{\mathrm{dep}}$ denotes the effect of data dependency, $\nu_{\mathrm{div}}$ denotes an (estimatable) measure of $\textit{task-diversity}$ between the source and target tasks, and $\mathrm C(\mathcal G)$ denotes the complexity of the representation class $\mathcal G$. In particular, our analysis reveals: as the number of tasks $T$ increases, both the sample requirement and risk bound converge to that of $r$-dimensional regression as if $g_\star$ had been given, and the effect of dependency only enters the sample requirement, leaving the risk bound matching the iid setting.
comment: Appeared at ICML 2024
EFILN: The Electric Field Inversion-Localization Network for High-Precision Underwater Positioning
Accurate underwater target localization is essential for underwater exploration. To improve accuracy and efficiency in complex underwater environments, we propose the Electric Field Inversion-Localization Network (EFILN), a deep feedforward neural network that reconstructs position coordinates from underwater electric field signals. By assessing whether the neural network's input-output values satisfy Coulomb's law, the error between the network's inversion solution and the equation's exact solution can be determined. The Adam optimizer was employed first, followed by the L-BFGS optimizer, to progressively improve the output precision of EFILN. A series of noise experiments demonstrated the robustness and practical utility of the proposed method, while small sample data experiments validated its strong small-sample learning (SSL) capabilities. To accelerate relevant research, we have made the code available as open source.
Reinforcement Learning Based Bidding Framework with High-dimensional Bids in Power Markets
Over the past decade, bidding in power markets has attracted widespread attention. Reinforcement Learning (RL) has been widely used for power market bidding as a powerful AI tool to make decisions under real-world uncertainties. However, current RL methods mostly employ low-dimensional bids, which significantly diverge from the N price-power pairs commonly used in the current power markets. The N-pair bidding format is denoted as High Dimensional Bids (HDBs), which has not been fully integrated into the existing RL-based bidding methods. The loss of flexibility in current RL bidding methods could greatly limit the bidding profits and make it difficult to tackle the rising uncertainties brought by renewable energy generation. In this paper, we propose a framework to fully utilize HDBs for RL-based bidding methods. First, we employ a special type of neural network called Neural Network Supply Functions (NNSFs) to generate HDBs in the form of N price-power pairs. Second, we embed the NNSF into a Markov Decision Process (MDP) to make it compatible with most existing RL methods. Finally, experiments on Energy Storage Systems (ESSs) in the PJM Real-Time (RT) power market show that the proposed bidding method with HDBs can significantly improve bidding flexibility, thereby improving the profit of the state-of-the-art RL bidding methods.
A Lyapunov-Based Switching Scheme for Selecting the Stable Closed-Loop Fixed Attitude-Error Quaternion During Flight
We present a switching scheme, which uses both the attitude-error quaternion (AEQ) and the angular-velocity error, for controlling the rotational degrees of freedom of an uncrewed aerial vehicle (UAV) during flight. In this approach, the proposed controller continually selects the stable closed-loop (CL) equilibrium AEQ corresponding to the smallest cost between those computed with two energy-based Lyapunov functions. To analyze and enforce the stability of the CL switching dynamics, we use basic nonlinear theory. This research problem is relevant because the selection of the stable CL equilibrium AEQ directly determines the power and energy requirements of the controlled UAV during flight. To test and demonstrate the implementation, suitability, functionality, and performance of the proposed approach, we present experimental results obtained using a 31-gram quadrotor, which was controlled to execute high-speed yaw maneuvers in flight. These flight tests show that the proposed switching controller can respectively reduce the control effort and rotational power by as much as 49.75 % and 28.14 %, on average, compared to those corresponding to an often-used benchmark controller.
comment: 8 pages, 5 figures, 2024 7th Iberian Robotics Conference (ROBOT)
System-Level Analysis of Module Uncertainty Quantification in the Autonomy Pipeline
We present a novel perspective on the design, use, and role of uncertainty measures for learned modules in an autonomous system. While in the current literature uncertainty measures are produced for standalone modules without considering the broader system context, in our work we explicitly consider the role of decision-making under uncertainty in illuminating how "good" an uncertainty measure is. Our insights are centered around quantifying the ways in which being uncertainty-aware makes a system more robust. Firstly, we use level set generation tools to produce a measure for system robustness and use this measure to compare system designs, thus placing uncertainty quantification in the context of system performance and evaluation metrics. Secondly, we use the concept of specification generation from systems theory to produce a formulation under which a designer can simultaneously constrain the properties of an uncertainty measure and analyze the efficacy of the decision-making-under-uncertainty algorithm used by the system. We apply our analyses to two real-world and complex autonomous systems, one for autonomous driving and another for aircraft runway incursion detection, helping to form a toolbox for an uncertainty-aware system designer to produce more effective and robust systems.
Parallel Batch Scheduling With Incompatible Job Families Via Constraint Programming
This paper addresses the incompatible case of parallel batch scheduling, where compatible jobs belong to the same family, and jobs from different families cannot be processed together in the same batch. Existing constraint programming (CP) models for this problem fail to synchronize the processing of the jobs within their batch, resulting in batch interruptions. In the context of the diffusion area in the semiconductor manufacturing process, these interrupted solutions would disrupt the thermal stability required for a uniform dopant distribution on the wafers. This paper proposes three new CP models that directly tackle these interruptions in the formulation, including two adaptations of existing models and a novel Redundant Synchronized (RS) model. These existing and novel models are compared on standard test cases, demonstrating the superiority of the RS model in finding optimal or near-optimal solutions quickly.
comment: 11 pages, 6 figures
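The feasibility conditions named in the abstract above (a single family per batch, and no interruption within a batch) can be illustrated with a small schedule checker. This is only a sketch under stated assumptions: the `Job` fields, the `capacity` limit, and the `(batch_id, start, end)` encoding are hypothetical and do not come from the paper. It validates a given schedule and is not one of the CP models; it also does not check that different batches avoid overlapping on the same machine.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    family: str
    size: int  # hypothetical: units of batch capacity the job occupies


def batch_violations(assignment, capacity):
    """Return a list of human-readable violations for a batch schedule.

    assignment maps each Job to a (batch_id, start, end) tuple.
    """
    batches = defaultdict(list)
    for job, (batch_id, start, end) in assignment.items():
        batches[batch_id].append((job, start, end))

    problems = []
    for batch_id, members in sorted(batches.items()):
        # Incompatible families: a batch may contain only one family.
        families = {job.family for job, _, _ in members}
        if len(families) > 1:
            problems.append(f"batch {batch_id}: mixed families {sorted(families)}")
        # Synchronization: every job in the batch shares one start and one end.
        intervals = {(start, end) for _, start, end in members}
        if len(intervals) > 1:
            problems.append(f"batch {batch_id}: jobs are not synchronized")
        # Capacity (assumed constraint): total job size must fit the machine.
        if sum(job.size for job, _, _ in members) > capacity:
            problems.append(f"batch {batch_id}: capacity exceeded")
    return problems


a, b, c = Job("a", "F1", 2), Job("b", "F1", 2), Job("c", "F2", 1)
good = {a: (0, 0, 5), b: (0, 0, 5), c: (1, 5, 8)}
bad = {a: (0, 0, 5), b: (0, 1, 6), c: (0, 0, 5)}
good_report = batch_violations(good, capacity=4)
bad_report = batch_violations(bad, capacity=4)
```

In the `bad` schedule, job `b` starts and ends later than its batch mates, which is the kind of interrupted batch the abstract says existing models can produce.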
Physical Informed-Inspired Deep Reinforcement Learning Based Bi-Level Programming for Microgrid Scheduling
To coordinate the interests of operator and users in a microgrid under complex and changeable operating conditions, this paper proposes a microgrid scheduling model considering the thermal flexibility of thermostatically controlled loads and demand response by leveraging physical informed-inspired deep reinforcement learning (DRL) based bi-level programming. To overcome the non-convex limitations of Karush-Kuhn-Tucker (KKT)-based methods, a novel optimization solution method based on DRL theory is proposed to handle the bi-level programming through alternate iterations between levels. Specifically, an improved A3C algorithm, called AutoML-PER-A3C, is designed to solve the upper-level problem by combining the asynchronous advantage actor-critic (A3C) DRL algorithm with an automated machine learning-prioritized experience replay (AutoML-PER) strategy that improves the generalization performance of A3C, while the DOCPLEX optimizer is adopted to address the lower-level problem. In this solution process, AutoML is used to automatically optimize hyperparameters and PER improves learning efficiency and quality by extracting the most valuable samples. The test results demonstrate that the presented approach manages to reconcile the interests between multiple stakeholders in the microgrid by fully exploiting various flexibility resources. Furthermore, in terms of economic viability and computational efficiency, the proposal vastly outperforms other advanced reinforcement learning methods.
comment: Accepted by IEEE Transactions on Industry Applications (Paper Id: 2023-KDSEM-1058)
Marine spatial planning techniques with a case study on wave-powered offshore aquaculture farms
As emerging marine technologies lead to the development of new infrastructure across the ocean, they enter an environment that existing ecosystems and industries already rely on. Although necessary to provide sustainable sources of energy and food, careful planning will be important to make informed decisions and avoid conflicts. This paper examines several techniques used for marine spatial planning, an approach for analyzing and planning the use of marine resources. Using open source software including QGIS and Python, the potential for developing wave-powered offshore aquaculture farms using the RM3 wave energy converter along the Northeast coast of the United States is assessed and several feasible sites are identified. The optimal site, located at 43.7°N, 68.9°W along the coast of Maine, has a total cost for a 5-pen farm of $56.8M, annual fish yield of 676 tonnes, and a levelized cost of fish of $9.23 per kilogram. Overall trends indicate that the cost greatly decreases with distance to shore due to the greater availability of wave energy and that conflicts and environmental constraints significantly limit the number of feasible sites in this region.
Mindalogue: LLM-Powered Nonlinear Interaction for Effective Learning and Task Exploration
Current generative AI models like ChatGPT, Claude, and Gemini are widely used for knowledge dissemination, task decomposition, and creative thinking. However, their linear interaction methods often force users to repeatedly compare and copy contextual information when handling complex tasks, increasing cognitive load and operational costs. Moreover, the ambiguity in model responses requires users to refine and simplify the information further. To address these issues, we developed "Mindalogue", a system using a non-linear interaction model based on "nodes + canvas" to enhance user efficiency and freedom while generating structured responses. A formative study with 11 users informed the design of Mindalogue, which was then evaluated through a study with 16 participants. The results showed that Mindalogue significantly reduced task steps and improved users' comprehension of complex information. This study highlights the potential of non-linear interaction in improving AI tool efficiency and user experience in the HCI field.
comment: 17 pages, 9 figures
Prompt a Robot to Walk with Large Language Models
Large language models (LLMs) pre-trained on vast internet-scale data have showcased remarkable capabilities across diverse domains. Recently, there has been escalating interest in deploying LLMs for robotics, aiming to harness the power of foundation models in real-world settings. However, this approach faces significant challenges, particularly in grounding these models in the physical world and in generating dynamic robot motions. To address these issues, we introduce a novel paradigm in which we use few-shot prompts collected from the physical environment, enabling the LLM to autoregressively generate low-level control commands for robots without task-specific fine-tuning. Experiments across various robots and environments validate that our method can effectively prompt a robot to walk. We thus illustrate how LLMs can proficiently function as low-level feedback controllers for dynamic motion control even in high-dimensional robotic systems. The project website and source code can be found at: https://prompt2walk.github.io/ .
comment: Conference on Decision and Control (CDC), 2024
Mobile Edge Generation-Enabled Digital Twin: Architecture Design and Research Opportunities
A novel paradigm of mobile edge generation (MEG)-enabled digital twin (DT) is proposed, which enables distributed on-device generation at mobile edge networks for real-time DT applications. First, an MEG-DT architecture is put forward to decentralize generative artificial intelligence (GAI) models onto edge servers (ESs) and user equipments (UEs), which has the advantages of low latency, privacy preservation, and individual-level customization. Then, various single-user and multi-user generation mechanisms are conceived for MEG-DT, which strike trade-offs between generation latency, hardware costs, and device coordination. Furthermore, to perform efficient distributed generation, two operating protocols are explored for transmitting interpretable and latent features between ESs and UEs, namely sketch-based generation and seed-based generation, respectively. Based on the proposed protocols, the convergence between MEG and DT is highlighted. Considering the seed-based image generation scenario, numerical case studies are provided to reveal the superiority of MEG-DT over centralized generation. Finally, promising applications and research opportunities are identified. Code is available at https://github.com/xiaoxiaxusummer/MEG_DT
comment: Accepted by IEEE Communications Magazine
The Reachability Problem for Neural-Network Control Systems
A control system consists of a plant component and a controller which periodically computes a control input for the plant. We consider systems where the controller is implemented by a feedforward neural network with ReLU activations. The reachability problem asks, given a set of initial states, whether a set of target states can be reached. We show that this problem is undecidable even for trivial plants and fixed-depth neural networks with three inputs and outputs. We also show that the problem becomes semi-decidable when the plant as well as the input and target sets are given by automata over infinite words.
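The closed-loop setting described in the abstract above can be sketched as a plant stepped by a feedforward ReLU controller. Everything concrete here is a hypothetical illustration, not taken from the paper: the one-dimensional integrator plant, the network weights, the initial state, and the target interval. Because the abstract states that the general problem is undecidable, the sketch only simulates a bounded number of steps from one initial state; it can confirm that a target is reached, but a `None` result proves nothing about unreachability.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def affine(W, b, v):
    # Computes W @ v + b for list-of-lists W and list vectors b, v.
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


# Hypothetical controller weights: u = 0.5 * relu(3 - x) - 0.5 * relu(x - 3).
W1, B1 = [[-1.0], [1.0]], [3.0, -3.0]
W2, B2 = [[0.5, -0.5]], [0.0]


def controller(x):
    hidden = relu(affine(W1, B1, x))
    return affine(W2, B2, hidden)


def plant(x, u):
    # Hypothetical trivial plant: a discrete-time integrator.
    return [xi + ui for xi, ui in zip(x, u)]


def bounded_reach(x0, in_target, max_steps):
    """Return the first step at which the state is in the target, else None."""
    x = list(x0)
    for step in range(max_steps + 1):
        if in_target(x):
            return step
        x = plant(x, controller(x))
    return None


def in_target(x):
    return 2.9 <= x[0] <= 3.1


hit_step = bounded_reach([0.0], in_target, max_steps=20)
too_short = bounded_reach([0.0], in_target, max_steps=3)
```

With these weights the state halves its distance to 3 at every step, so the trajectory from 0 runs 0, 1.5, 2.25, 2.625, 2.8125, 2.90625 and first enters the target interval at step 5.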
On Adaptive Frequency Sampling for Data-driven MOR Applied to Antenna Responses
Frequency domain sweeps of array antennas are well-known to be time-intensive, and different surrogate models have been used to improve the performance. Data-driven model order reduction algorithms, such as the Loewner framework and vector fitting, can be integrated with adaptive error estimates, in an iterative algorithm, to reduce the number of full-wave simulations required to accurately capture the requested frequency behavior of multiport array antennas. In this work, we propose two novel adaptive methods exploiting a block matrix function which is a key part of the Loewner framework generating system approach. The first algorithm leverages an inherent matrix parameter freedom in the block matrix function to identify frequency points with large errors, whereas the second utilizes the condition number of the block matrix function. Both methods effectively provide frequency domain error estimates, which are essential for improved performance. Numerical experiments on multiport array antenna S-parameters demonstrate the effectiveness of our proposed algorithms within the Loewner framework.
comment: 10 pages, 12 figures
MERIT: Multimodal Wearable Vital Sign Waveform Monitoring
Cardiovascular disease (CVD) is the leading cause of death and premature mortality worldwide, with occupational environments significantly influencing CVD risk, underscoring the need for effective cardiac monitoring and early warning systems. Existing methods of monitoring vital signs require subjects to remain stationary, which is impractical for daily monitoring as individuals are often in motion. To address this limitation, we propose MERIT, a multimodality-based wearable system designed for precise ECG waveform monitoring without movement restrictions. Daily activities, involving frequent arm movements, can significantly affect sensor data and complicate the reconstruction of accurate ECG signals. To mitigate motion impact and enhance ECG signal reconstruction, we introduce a deep independent component analysis (Deep-ICA) module and a multimodal fusion module. We conducted experiments with 15 subjects. Our results, compared with commercial wearable devices and existing methods, demonstrate that MERIT accurately reconstructs ECG waveforms during various office activities, offering a reliable solution for fine-grained cardiac monitoring in dynamic environments.
comment: 8 pages, 10 figures
Environmental management and restoration under unified risk and uncertainty using robustified dynamic Orlicz risk
Environmental management and restoration should be designed such that the risk and uncertainty owing to nonlinear stochastic systems can be successfully addressed. We apply the robustified dynamic Orlicz risk to the modeling and analysis of environmental management and restoration to consider both the risk and uncertainty within a unified theory. We focus on the control of a jump-driven hybrid stochastic system that represents macrophyte dynamics. The dynamic programming equation based on the Orlicz risk is first obtained heuristically, from which the associated Hamilton-Jacobi-Bellman (HJB) equation is derived. In the proposed Orlicz risk, the risk aversion of the decision-maker is represented by a power coefficient that resembles a certainty equivalence, whereas the uncertainty aversion is represented by the Kullback-Leibler divergence, in which the risk and uncertainty are handled consistently and separately. The HJB equation includes a new state-dependent discount factor that arises from the uncertainty aversion, which leads to a unique, nonlinear, and nonlocal term. The link between the proposed and classical stochastic control problems is discussed with a focus on control-dependent discount rates. We propose a finite difference method for computing the HJB equation. Finally, the proposed model is applied to an optimal harvesting problem for macrophytes in a brackish lake that contains both growing and drifting populations.
Deep Learning based Performance Testing for Analog Integrated Circuits
In this paper, we propose a deep learning based performance testing framework to minimize the number of required test modules while guaranteeing the accuracy requirement, where a test module corresponds to a combination of one circuit and one stimulus. First, we apply a deep neural network (DNN) to establish the mapping from the response of the circuit under test (CUT) in each module to all specifications to be tested. Then, the required test modules are selected by solving a 0-1 integer programming problem. Finally, the predictions from the selected test modules are combined by a DNN to form the specification estimations. The simulation results validate the proposed approach in terms of testing accuracy and cost.
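The selection step mentioned in the abstract above, choosing test modules through a 0-1 integer program, can be illustrated with a toy covering problem. The paper's actual objective and constraints are not given in the abstract, so this is a sketch under assumptions: each hypothetical module is taken to predict a known set of specifications accurately enough, and the goal is the smallest subset of modules that covers every specification. The module and specification names are invented, and exhaustive search stands in for an integer programming solver.

```python
from itertools import combinations


def select_modules(coverage, specs):
    """Return a smallest tuple of modules whose coverage includes all specs.

    coverage maps a module name to the set of specifications it is assumed
    to predict within the accuracy requirement. Returns None if the specs
    cannot all be covered.
    """
    required = set(specs)
    modules = sorted(coverage)
    # Try subset sizes in increasing order, so the first hit is minimal.
    for k in range(len(modules) + 1):
        for subset in combinations(modules, k):
            covered = set().union(*(coverage[m] for m in subset))
            if covered >= required:
                return subset
    return None


coverage = {
    "m1": {"gain", "bandwidth"},
    "m2": {"offset"},
    "m3": {"bandwidth", "offset"},
    "m4": {"gain"},
}
chosen = select_modules(coverage, ["gain", "bandwidth", "offset"])
```

Exhaustive search is exponential in the number of modules, so it only suits a handful of candidates; a real instance would be passed to an integer programming solver.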
A game-theoretic, market-based approach to extract flexibility from distributed energy resources
We propose a market designed using game theory to optimally utilize the flexibility of distributed energy resources (DERs) like solar, batteries, electric vehicles, and flexible loads. Market agents perform multiperiod optimization to determine their feasible flexibility limits for power injections while satisfying all constraints of their DERs. This is followed by a Stackelberg game between the market operator and agents. The market operator as the leader aims to regulate the aggregate power injection around a desired value by leveraging the flexibility of their agents, and computes optimal prices for both electricity and flexibility services. The agents follow by optimally bidding their desired flexible power injections in response to these prices. We show the existence and uniqueness of a Nash equilibrium among all the agents and a Stackelberg equilibrium between all agents and the operator. In addition to deriving analytical closed-form solutions, we provide simulation results for a small example system to illustrate our approach.
comment: Accepted to the 5th IFAC Workshop on Cyber-Physical Human Systems
Robotics
Generalizable Humanoid Manipulation with Improved 3D Diffusion Policies
Humanoid robots capable of autonomous operation in diverse environments have long been a goal for roboticists. However, autonomous manipulation by humanoid robots has largely been restricted to one specific scene, primarily due to the difficulty of acquiring generalizable skills. Recent advances in 3D visuomotor policies, such as the 3D Diffusion Policy (DP3), have shown promise in extending these capabilities to wilder environments. However, 3D visuomotor policies often rely on camera calibration and point-cloud segmentation, which present challenges for deployment on mobile robots like humanoids. In this work, we introduce the Improved 3D Diffusion Policy (iDP3), a novel 3D visuomotor policy that eliminates these constraints by leveraging egocentric 3D visual representations. We demonstrate that iDP3 enables a full-sized humanoid robot to autonomously perform skills in diverse real-world scenarios, using only data collected in the lab. Videos are available at: https://humanoid-manipulation.github.io
comment: Project website: https://humanoid-manipulation.github.io
Probabilistic Degeneracy Detection for Point-to-Plane Error Minimization
Degeneracies arising from uninformative geometry are known to deteriorate LiDAR-based localization and mapping. This work introduces a new probabilistic method to detect and mitigate the effect of degeneracies in point-to-plane error minimization. The noise on the Hessian of the point-to-plane optimization problem is characterized by the noise on points and surface normals used in its construction. We exploit this characterization to quantify the probability of a direction being degenerate. The degeneracy-detection procedure is used in a new real-time degeneracy-aware iterative closest point algorithm for LiDAR registration, in which we smoothly attenuate updates in degenerate directions. The method's parameters are selected based on the noise characteristics provided in the LiDAR's datasheet. We validate the approach in four real-world experiments, demonstrating that it outperforms state-of-the-art methods at detecting and mitigating the adverse effects of degeneracies. For the benefit of the community, we release the code for the method at: github.com/ntnu-arl/drpm.
comment: 8 pages, 5 figures, accepted by IEEE Robotics and Automation Letters (IEEE RAL). Supplementary video: https://www.youtube.com/watch?v=bKnHs_wwnXs. Code: https://github.com/ntnu-arl/drpm
Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation
Model-free reinforcement learning has emerged as a powerful method for developing robust robot control policies capable of navigating through complex and unstructured terrains. The effectiveness of these methods hinges on two essential elements: (1) the use of massively parallel physics simulations to expedite policy training, and (2) an environment generator tasked with crafting sufficiently challenging yet attainable terrains to facilitate continuous policy improvement. Existing methods of environment generation often rely on heuristics constrained by a set of parameters, limiting the diversity and realism. In this work, we introduce the Adaptive Diffusion Terrain Generator (ADTG), a novel method that leverages Denoising Diffusion Probabilistic Models to dynamically expand existing training environments by adding more diverse and complex terrains adaptive to the current policy. ADTG guides the diffusion model's generation process through initial noise optimization, blending noise-corrupted terrains from existing training environments weighted by the policy's performance in each corresponding environment. By manipulating the noise corruption level, ADTG seamlessly transitions between generating similar terrains for policy fine-tuning and novel ones to expand training diversity. Our experiments show that the policy trained by ADTG outperforms both procedurally generated and natural environments, along with popular navigation methods.
Harnessing with Twisting: Single-Arm Deformable Linear Object Manipulation for Industrial Harnessing Task IROS 24
Wire-harnessing tasks are highly challenging to automate with robots due to the complex dynamics and unpredictable behavior of the deformable wire. Traditional methods, often reliant on dual-robot arms or tactile sensing, face limitations in adaptability, cost, and scalability. This paper introduces a novel single-robot wire-harnessing pipeline that leverages a robot's twisting motion to generate necessary wire tension for precise insertion into clamps, using only one robot arm with an integrated force/torque (F/T) sensor. Benefiting from this design, the single robot arm can efficiently apply tension for wire routing and insertion into clamps in a narrow space. Our approach is structured around four principal components: a Model Predictive Control (MPC) based on the Koopman operator for tension tracking and wire following, a motion planner for sequencing harnessing waypoints, a suite of insertion primitives for clamp engagement, and a fix-point switching mechanism for wire constraint updating. Evaluated on an industrial-level wire harnessing task, our method demonstrated superior performance and reliability over conventional approaches, efficiently handling both single and multiple wire configurations with high success rates.
comment: Accepted by IROS 24
Active Learning of Robot Vision Using Adaptive Path Planning
Robots need robust and flexible vision systems to perceive and reason about their environments beyond geometry. Most of such systems build upon deep learning approaches. As autonomous robots are commonly deployed in initially unknown environments, pre-training on static datasets cannot always capture the variety of domains and limits the robot's vision performance during missions. Recently, self-supervised as well as fully supervised active learning methods emerged to improve robotic vision. These approaches rely on large in-domain pre-training datasets or require substantial human labelling effort. To address these issues, we present a recent adaptive planning framework for efficient training data collection to substantially reduce human labelling requirements in semantic terrain monitoring missions. To this end, we combine high-quality human labels with automatically generated pseudo labels. Experimental results show that the framework reaches segmentation performance close to fully supervised approaches with drastically reduced human labelling effort while outperforming purely self-supervised approaches. We discuss the advantages and limitations of current methods and outline valuable future research avenues towards more robust and flexible robotic vision systems in unknown environments.
comment: 5 pages, 3 figures
MLP-SLAM: Multilayer Perceptron-Based Simultaneous Localization and Mapping With a Dynamic and Static Object Discriminator
Visual Simultaneous Localization and Mapping (V-SLAM) systems have seen significant development in recent years, demonstrating high precision in environments with limited dynamic objects. However, their performance significantly deteriorates when deployed in settings with a higher presence of movable objects, such as environments with pedestrians, cars, and buses, which are common in outdoor scenes. To address this issue, we propose a Multilayer Perceptron (MLP)-based real-time stereo SLAM system that leverages complete geometry information to avoid information loss. Moreover, there is currently no publicly available dataset for directly evaluating the effectiveness of dynamic and static feature classification methods, and to bridge this gap, we have created a publicly available dataset containing over 50,000 feature points. Experimental results demonstrate that our MLP-based dynamic and static feature point discriminator has achieved superior performance compared to other methods on this dataset. Furthermore, the MLP-based real-time stereo SLAM system has shown the highest average precision and fastest speed on the outdoor KITTI tracking datasets compared to other dynamic SLAM systems. The open-source code and datasets are available at https://github.com/TaozheLi/MLP-SLAM.
comment: Dynamic SLAM
Navigation under uncertainty: Trajectory prediction and occlusion reasoning with switching dynamical systems
Predicting future trajectories of nearby objects, especially under occlusion, is a crucial task in autonomous driving and safe robot navigation. Prior works typically neglect to maintain uncertainty about occluded objects and only predict trajectories of observed objects using high-capacity models such as Transformers trained on large datasets. While these approaches are effective in standard scenarios, they can struggle to generalize to the long-tail, safety-critical scenarios. In this work, we explore a conceptual framework unifying trajectory prediction and occlusion reasoning under the same class of structured probabilistic generative model, namely, switching dynamical systems. We then present some initial experiments illustrating its capabilities using the Waymo open dataset.
DR-MPC: Deep Residual Model Predictive Control for Real-world Social Navigation
How can a robot safely navigate around people exhibiting complex motion patterns? Reinforcement Learning (RL) or Deep RL (DRL) in simulation holds some promise, although much prior work relies on simulators that fail to precisely capture the nuances of real human motion. To address this gap, we propose Deep Residual Model Predictive Control (DR-MPC), a method to enable robots to quickly and safely perform DRL from real-world crowd navigation data. By blending MPC with model-free DRL, DR-MPC overcomes the traditional DRL challenges of large data requirements and unsafe initial behavior. DR-MPC is initialized with MPC-based path tracking, and gradually learns to interact more effectively with humans. To further accelerate learning, a safety component estimates when the robot encounters out-of-distribution states and guides it away from likely collisions. In simulation, we show that DR-MPC substantially outperforms prior work, including traditional DRL and residual DRL models. Real-world experiments show our approach successfully enables a robot to navigate a variety of crowded situations with few errors using less than 4 hours of training data.
comment: 8 pages, 8 figures, under review for IEEE Robotics and Automation Letters (RA-L)
Traversability-Aware Legged Navigation by Learning from Real-World Visual Data
The enhanced mobility brought by legged locomotion empowers quadrupedal robots to navigate through complex and unstructured environments. However, optimizing agile locomotion while accounting for the varying energy costs of traversing different terrains remains an open challenge. Most previous work focuses on planning trajectories with traversability cost estimation based on human-labeled environmental features. However, this human-centric approach is insufficient because it does not account for the varying capabilities of the robot locomotion controllers over challenging terrains. To address this, we develop a novel traversability estimator in a robot-centric manner, based on the value function of the robot's locomotion controller. This estimator is integrated into a new learning-based RGBD navigation framework. The framework develops a planner that guides the robot in avoiding obstacles and hard-to-traverse terrains while reaching its goals. The training of the navigation planner is directly performed in the real world using a sample efficient reinforcement learning method. Through extensive benchmarking, we demonstrate that the proposed framework achieves the best performance in accurate traversability cost estimation and efficient learning from multi-modal data (the robot's color and depth vision, and proprioceptive feedback) for real-world training. Using the proposed method, a quadrupedal robot learns to perform traversability-aware navigation through trial and error in various real-world environments with challenging terrains that are difficult to classify using depth vision alone.
Fully Asynchronous Neuromorphic Perception for Mobile Robot Dodging with Loihi Chips
Sparse and asynchronous sensing and processing in natural organisms lead to ultra low-latency and energy-efficient perception. Event cameras, known as neuromorphic vision sensors, are designed to mimic these characteristics. However, fully utilizing the sparse and asynchronous event stream remains challenging. Influenced by the mature algorithms of standard cameras, most existing event-based algorithms still rely on the "group of events" processing paradigm (e.g., event frames, 3D voxels) when handling event streams. This paradigm encounters issues such as feature loss, event stacking, and high computational burden, which deviates from the intended purpose of event cameras. To address these issues, we propose a fully asynchronous neuromorphic paradigm that integrates event cameras, spiking networks, and neuromorphic processors (Intel Loihi). This paradigm can faithfully process each event asynchronously as it arrives, mimicking the spike-driven signal processing in biological brains. We compare the proposed paradigm with the existing "group of events" processing paradigm in detail on a real mobile robot dodging task. Experimental results show that our scheme exhibits better robustness than frame-based methods with different time windows and light conditions. Additionally, the energy consumption per inference of our scheme on the embedded Loihi processor is only 4.30% of that of the event spike tensor method on NVIDIA Jetson Orin NX with energy-saving mode, and 1.64% of that of the event frame method on the same neuromorphic processor. As far as we know, this is the first time that a fully asynchronous neuromorphic paradigm has been implemented for solving sequential tasks on a real mobile robot.
Ergodic Trajectory Optimization on Generalized Domains Using Maximum Mean Discrepancy ICRA 2025
We present a novel formulation of ergodic trajectory optimization that can be specified over general domains using kernel maximum mean discrepancy. Ergodic trajectory optimization is an effective approach that generates coverage paths for problems related to robotic inspection, information gathering problems, and search and rescue. These optimization schemes compel the robot to spend time in a region proportional to the expected utility of visiting that region. Current methods for ergodic trajectory optimization rely on domain-specific knowledge, e.g., a defined utility map, and well-defined spatial basis functions to produce ergodic trajectories. Here, we present a generalization of ergodic trajectory optimization based on maximum mean discrepancy that requires only samples from the search domain. We demonstrate the ability of our approach to produce coverage trajectories on a variety of problem domains including robotic inspection of objects with differential kinematics constraints and on Lie groups without having access to domain specific knowledge. Furthermore, we show favorable computational scaling compared to existing state-of-the-art methods for ergodic trajectory optimization with a trade-off between domain specific knowledge and computational scaling, thus extending the versatility of ergodic coverage on a wider application domain.
comment: 6 pages (excluding references), 1 table, 8 figures, submitted to ICRA 2025
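As a rough illustration of the quantity the abstract above builds on, the following sketch computes a biased estimate of the squared kernel maximum mean discrepancy between the points visited by a trajectory and samples drawn from the search domain. The Gaussian kernel, the `bandwidth` value, the function names, and the toy point sets are assumptions made for illustration; the abstract does not specify the kernel or the optimizer used in the paper.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(trajectory, samples, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two point sets."""
    return (mean_kernel(trajectory, trajectory, bandwidth)
            - 2.0 * mean_kernel(trajectory, samples, bandwidth)
            + mean_kernel(samples, samples, bandwidth))


# Hypothetical toy data: samples from the domain and two candidate trajectories.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
covering = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]

print(mmd_squared(covering, samples))
print(mmd_squared(clustered, samples))
```

In this toy setting the trajectory that visits every sample location scores (numerically) zero, while the clustered trajectory scores higher, which is the sense in which a lower discrepancy corresponds to better coverage of the sampled domain.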
Words to Wheels: Vision-Based Autonomous Driving Understanding Human Language Instructions Using Foundation Models
This paper introduces an innovative application of foundation models, enabling Unmanned Ground Vehicles (UGVs) equipped with an RGB-D camera to navigate to designated destinations based on human language instructions. Unlike learning-based methods, this approach does not require prior training but instead leverages existing foundation models, thus facilitating generalization to novel environments. Upon receiving human language instructions, the system uses a large language model (LLM) to transform them into a 'cognitive route description', a detailed navigation route expressed in human language. The vehicle then decomposes this description into landmarks and navigation maneuvers. The vehicle also determines elevation costs and identifies navigability levels of different regions through a terrain segmentation model, GANav, trained on open datasets. Semantic elevation costs, which take both elevation and navigability levels into account, are estimated and provided to the Model Predictive Path Integral (MPPI) planner, responsible for local path planning. Concurrently, the vehicle searches for target landmarks using foundation models, including YOLO-World and EfficientViT-SAM. Ultimately, the vehicle executes the navigation commands to reach the designated destination, the final landmark. Our experiments demonstrate that this application successfully guides UGVs to their destinations following human language instructions in novel environments, such as unfamiliar terrain or urban settings.
comment: 7 pages, 7 figures
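To make the role of the cost passed to the MPPI planner more concrete, here is a minimal sketch of the generic MPPI weighting step: sampled control perturbations are averaged with softmin weights, so rollouts with lower accumulated cost dominate the update. It is not the authors' planner. The scalar controls, the `temperature` value, and the example costs (standing in for accumulated semantic elevation costs) are all assumptions.

```python
import math


def mppi_weights(costs, temperature=1.0):
    """Softmin weights over sampled rollouts: lower cost gives higher weight."""
    best = min(costs)  # subtract the minimum for numerical stability
    unnormalized = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


def mppi_update(nominal, perturbations, costs, temperature=1.0):
    """Shift a nominal control sequence by the weighted mean perturbation."""
    weights = mppi_weights(costs, temperature)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t, u in enumerate(nominal)
    ]


# Hypothetical example: two sampled rollouts over a two-step horizon.
nominal = [0.0, 0.0]
perturbations = [[0.5, 0.5], [-0.5, -0.5]]
costs = [1.0, 10.0]  # stand-ins for accumulated rollout costs

weights = mppi_weights(costs)
updated = mppi_update(nominal, perturbations, costs)
print(weights, updated)
```

With these numbers the cheaper rollout receives almost all of the weight, so the updated controls move toward its perturbation.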
Reflexive Input-Output Causality Mechanisms
This paper explores the concept of reflexive actuation, examining how robots may leverage both internal and external stimuli to trigger changes in the motion, performance, or physical characteristics of the robot, such as its size, shape, or configuration. These changes themselves may in turn be sequentially re-used as input to drive further adaptations. Drawing inspiration from biological systems, where reflexes are an essential component of the response to environmental changes, reflexive actuation is critical to enable robots to adapt to diverse situations and perform complex tasks. The underlying principles of reflexive actuation are analyzed, with examples provided from existing implementations such as contact-sensitive reflexive arms, physical counters, and their applications. The paper also outlines future directions and challenges for advancing this research area, emphasizing its significance in the development of adaptive, responsive robotic systems.
comment: 9 pages, 5 figures
ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection
This paper introduces ROSAR, a novel framework enhancing the robustness of deep learning object detection models tailored for side-scan sonar (SSS) images, generated by autonomous underwater vehicles using sonar sensors. By extending our prior work on knowledge distillation (KD), this framework integrates KD with adversarial retraining to address the dual challenges of model efficiency and robustness against SSS noises. We introduce three novel, publicly available SSS datasets, capturing different sonar setups and noise conditions. We propose and formalize two SSS safety properties and utilize them to generate adversarial datasets for retraining. Through a comparative analysis of projected gradient descent (PGD) and patch-based adversarial attacks, ROSAR demonstrates significant improvements in model robustness and detection accuracy under SSS-specific conditions, enhancing the model's robustness by up to 1.85%. ROSAR is available at https://github.com/remaro-network/ROSAR-framework.
Exploiting Local Features and Range Images for Small Data Real-Time Point Cloud Semantic Segmentation IROS
Semantic segmentation of point clouds is an essential task for understanding the environment in autonomous driving and robotics. Recent range-based works achieve real-time efficiency, while point- and voxel-based methods produce better results but are affected by high computational complexity. Moreover, highly complex deep learning models are often not suited to efficiently learn from small datasets. Their generalization capabilities can easily be driven by the abundance of data rather than the architecture design. In this paper, we harness the information from the three-dimensional representation to proficiently capture local features, while introducing the range image representation to incorporate additional information and facilitate fast computation. A GPU-based KDTree allows for rapid building, querying, and enhancing projection with straightforward operations. Extensive experiments on SemanticKITTI and nuScenes datasets demonstrate the benefits of our modification in a "small data" setup, in which only one sequence of the dataset is used to train the models, but also in the conventional setup, where all sequences except one are used for training. We show that a reduced version of our model not only demonstrates strong competitiveness against full-scale state-of-the-art models but also operates in real-time, making it a viable choice for real-world case applications. The code of our method is available at https://github.com/Bender97/WaffleAndRange.
comment: This paper has been accepted for publication at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Reinforcement Learning For Quadrupedal Locomotion: Current Advancements And Future Perspectives
In recent years, reinforcement learning (RL) based quadrupedal locomotion control has emerged as an extensively researched field, driven by the potential advantages of autonomous learning and adaptation compared to traditional control methods. This paper provides a comprehensive study of the latest research in applying RL techniques to develop locomotion controllers for quadrupedal robots. We present a detailed overview of the core concepts, methodologies, and key advancements in RL-based locomotion controllers, including learning algorithms, training curricula, reward formulations, and simulation-to-real transfer techniques. The study covers both gait-bound and gait-free approaches, highlighting their respective strengths and limitations. Additionally, we discuss the integration of these controllers with robotic hardware and the role of sensor feedback in enabling adaptive behavior. The paper also outlines future research directions, such as incorporating exteroceptive sensing, combining model-based and model-free techniques, and developing online learning capabilities. Our study aims to provide researchers and practitioners with a comprehensive understanding of the state-of-the-art in RL-based locomotion controllers, enabling them to build upon existing work and explore novel solutions for enhancing the mobility and adaptability of quadrupedal robots in real-world environments.
comment: 12 pages, 3 figures
SMART-TRACK: A Novel Kalman Filter-Guided Sensor Fusion For Robust UAV Object Tracking in Dynamic Environments
In the field of sensor fusion and state estimation for object detection and localization, ensuring accurate tracking in dynamic environments poses significant challenges. Traditional methods like the Kalman Filter (KF) often fail when measurements are intermittent, leading to rapid divergence in state estimations. To address this, we introduce SMART (Sensor Measurement Augmentation and Reacquisition Tracker), a novel approach that leverages high-frequency state estimates from the KF to guide the search for new measurements, maintaining tracking continuity even when direct measurements falter. This is crucial for dynamic environments where traditional methods struggle. Our contributions include: 1) Versatile Measurement Augmentation Using KF Feedback: We implement a versatile measurement augmentation system that serves as a backup when primary object detectors fail intermittently. This system is adaptable to various sensors, demonstrated using depth cameras where KF's 3D predictions are projected into 2D depth image coordinates, integrating nonlinear covariance propagation techniques simplified to first-order approximations. 2) Open-source ROS2 Implementation: We provide an open-source ROS2 implementation of the SMART-TRACK framework, validated in a realistic simulation environment using Gazebo and ROS2, fostering broader adaptation and further research. Our results showcase significant enhancements in tracking stability, with estimation RMSE as low as 0.04 m during measurement disruptions, advancing the robustness of UAV tracking and expanding the potential for reliable autonomous UAV operations in complex scenarios. The implementation is available at https://github.com/mzahana/SMART-TRACK.
comment: 12 pages, 7 figures, 3 algorithms, 2 tables
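The abstract above mentions projecting the KF's 3D predictions into 2D depth image coordinates with covariance propagation simplified to first order. The sketch below shows one common way to do this under a pinhole camera model: project the point, then map the 3x3 position covariance through the projection Jacobian. The pinhole model, the intrinsics (`fx`, `fy`, `cx`, `cy`), the camera-frame convention (z pointing forward), and all function names are assumptions, not details taken from the SMART-TRACK implementation.

```python
def project_point(p_cam, fx, fy, cx, cy):
    """Pinhole projection of a 3D point in the camera frame (z > 0) to pixels."""
    x, y, z = p_cam
    return (fx * x / z + cx, fy * y / z + cy)


def projection_jacobian(p_cam, fx, fy):
    """2x3 Jacobian of the pinhole projection with respect to (x, y, z)."""
    x, y, z = p_cam
    return [
        [fx / z, 0.0, -fx * x / z ** 2],
        [0.0, fy / z, -fy * y / z ** 2],
    ]


def propagate_covariance(jac, cov):
    """First-order propagation J * cov * J^T, giving a 2x2 pixel covariance."""
    rows, n = len(jac), len(cov)
    j_cov = [[sum(jac[i][k] * cov[k][j] for k in range(n)) for j in range(n)]
             for i in range(rows)]
    return [[sum(j_cov[i][k] * jac[j][k] for k in range(n)) for j in range(rows)]
            for i in range(rows)]


# Hypothetical numbers: a predicted target 2 m straight ahead of the camera.
p_cam = (0.0, 0.0, 2.0)
cov_3d = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.04]]
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0

pixel = project_point(p_cam, fx, fy, cx, cy)
cov_px = propagate_covariance(projection_jacobian(p_cam, fx, fy), cov_3d)
print(pixel, cov_px)
```

The resulting 2x2 covariance could be used to size a search window around the projected pixel. How the window is actually chosen in the paper is not stated in the abstract.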
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation NeurIPS 2024
Language-guided robotic manipulation is a challenging task that requires an embodied agent to follow abstract user instructions to accomplish various complex manipulation tasks. Previous work often fits the data trivially without revealing the relation between instructions and low-level executable actions; such models are prone to memorizing superficial patterns in the data instead of acquiring transferable knowledge, and are thus fragile to dynamic environment changes. To address this issue, we propose a PrImitive-driVen waypOinT-aware world model for Robotic manipulation (PIVOT-R) that focuses solely on the prediction of task-relevant waypoints. Specifically, PIVOT-R consists of a Waypoint-aware World Model (WAWM) and a lightweight action prediction module. The former performs primitive action parsing and primitive-driven waypoint prediction, while the latter focuses on decoding low-level actions. We also design an asynchronous hierarchical executor (AHE), which can use different execution frequencies for different modules of the model, thereby reducing computational redundancy and improving execution efficiency. PIVOT-R outperforms state-of-the-art (SoTA) open-source models on the SeaWave benchmark, achieving an average relative improvement of 19.45% across four levels of instruction tasks. Moreover, compared to the synchronously executed PIVOT-R, the execution efficiency of PIVOT-R with AHE is increased 28-fold, with only a 2.9% drop in performance. These results provide compelling evidence that PIVOT-R can significantly improve both the performance and efficiency of robotic manipulation.
comment: Accepted to NeurIPS 2024
Efficiently Obtaining Reachset Conformance for the Formal Analysis of Robotic Contact Tasks IROS 2024
Formal verification of robotic tasks requires a simple yet conformant model of the used robot. We present the first work on generating reachset conformant models for robotic contact tasks considering hybrid (mixed continuous and discrete) dynamics. Reachset conformance requires that the set of reachable outputs of the abstract model encloses all previous measurements to transfer safety properties. Aiming for industrial applications, we describe the system using a simple hybrid automaton with linear dynamics. We inject non-determinism into the continuous dynamics and the discrete transitions, and we optimally identify all model parameters together with the non-determinism required to capture the recorded behaviors. Using two 3-DOF robots, we show that our approach can effectively generate models to capture uncertainties in system behavior and substantially reduce the required testing effort in industrial applications.
comment: Accepted at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
HumanFT: A Human-like Fingertip Multimodal Visuo-Tactile Sensor
Tactile sensors play a crucial role in enabling robots to interact effectively and safely with objects in everyday tasks. In particular, visuotactile sensors have seen increasing usage in two- and three-fingered grippers due to their high-quality feedback. However, a significant gap remains in the development of sensors suitable for humanoid robots, especially five-fingered dexterous hands. One reason is the difficulty of designing and manufacturing sensors that are compact in size. In this paper, we propose HumanFT, a multimodal visuotactile sensor that replicates the shape and functionality of a human fingertip. To bridge the gap between human and robotic tactile sensing, our sensor features real-time force measurements, high-frequency vibration detection, and overtemperature alerts. To achieve this, we developed a suite of fabrication techniques for a new type of elastomer optimized for force propagation and temperature sensing. In addition, our sensor integrates circuits capable of sensing pressure and vibration. These capabilities have been validated through experiments. The proposed design is simple and cost-effective to fabricate. We believe HumanFT can enhance humanoid robots' perception by capturing and interpreting multimodal tactile information.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Preliminary Evaluation of an Ultrasound-Guided Robotic System for Autonomous Percutaneous Intervention
Cancer cases have been rising globally, resulting in nearly 10 million deaths in 2023. Biopsy, crucial for diagnosis, is often performed under ultrasound (US) guidance, demanding precise hand coordination and cognitive decision-making. Robot-assisted interventions have shown improved accuracy in lesion targeting by addressing challenges such as noisy 2D images and maintaining consistent probe-to-surface contact. Recent research has focused on fully autonomous robotic US systems to enable standardized diagnostic procedures and reproducible US-guided therapy. This study presents a fully autonomous system for US-guided needle placement capable of performing an end-to-end clinical workflow. The system autonomously: 1) identifies the liver region on the patient's abdomen surface, 2) plans and executes the US scanning path using impedance control, 3) localizes lesions from the US images in real-time, and 4) targets the identified lesions, all without human intervention. This study evaluates both position- and impedance-controlled systems. Validation on agar phantoms demonstrated a targeting error of 5.74 ± 2.70 mm, highlighting the system's potential for accurately targeting tumors larger than 5 mm. These results show the potential of a fully autonomous system for US-guided biopsies.
comment: 7 pages and 6 figures
Trust or Bust: Ensuring Trustworthiness in Autonomous Weapon Systems
The integration of Autonomous Weapon Systems (AWS) into military operations presents both significant opportunities and challenges. This paper explores the multifaceted nature of trust in AWS, emphasising the necessity of establishing reliable and transparent systems to mitigate risks associated with bias, operational failures, and accountability. Despite advancements in Artificial Intelligence (AI), the trustworthiness of these systems, especially in high-stakes military applications, remains a critical issue. Through a systematic review of existing literature, this research identifies gaps in the understanding of trust dynamics during the development and deployment phases of AWS. It advocates for a collaborative approach that includes technologists, ethicists, and military strategists to address these ongoing challenges. The findings underscore the importance of Human-Machine teaming and enhancing system intelligibility to ensure accountability and adherence to International Humanitarian Law. Ultimately, this paper aims to contribute to the ongoing discourse on the ethical implications of AWS and the imperative for trustworthy AI in defense contexts.
comment: Accepted as a workshop paper at MILCOM 2024, 8 pages
Kinematic-ICP: Enhancing LiDAR Odometry with Kinematic Constraints for Wheeled Mobile Robots Moving on Planar Surfaces ICRA 2025
LiDAR odometry is essential for many robotics applications, including 3D mapping, navigation, and simultaneous localization and mapping. LiDAR odometry systems are usually based on some form of point cloud registration to compute the ego-motion of a mobile robot. Yet, few of today's LiDAR odometry systems consider the domain-specific knowledge and the kinematic model of the mobile platform during the point cloud alignment. In this paper, we present Kinematic-ICP, a LiDAR odometry system that focuses on wheeled mobile robots equipped with a 3D LiDAR and moving on a planar surface, which is a common assumption for warehouses, offices, hospitals, etc. Our approach introduces kinematic constraints within the optimization of a traditional point-to-point iterative closest point scheme. In this way, the resulting motion follows the kinematic constraints of the platform, effectively exploiting the robot's wheel odometry and the 3D LiDAR observations. We dynamically adjust the influence of LiDAR measurements and wheel odometry in our optimization scheme, allowing the system to handle degenerate scenarios such as feature-poor corridors. We evaluate our approach on robots operating in large-scale warehouse environments, but also outdoors. The experiments show that our approach achieves top performance and is more accurate than wheel odometry and common LiDAR odometry systems. Kinematic-ICP has recently been deployed in the Dexory fleet of robots operating in warehouses worldwide at their customers' sites, showing that our method can run in the real world alongside a complete navigation stack.
comment: Submitted to ICRA 2025
A Surface Adaptive First-Look Inspection Planner for Autonomous Remote Sensing of Open-Pit Mines
In this work, we present an autonomous inspection framework for remote sensing tasks in active open-pit mines. Specifically, the contributions focus on developing a methodology in which an initial, approximate, operator-defined inspection plan is exploited by an online view-planner to predict an inspection path that can adapt to changes in the current mine-face morphology caused by routine mining activities. The proposed inspection framework leverages instantaneous 3D LiDAR and localization measurements coupled with a modelled sensor footprint for view-planning that satisfies desired viewing and photogrammetric conditions. The efficacy of the proposed framework has been demonstrated through simulation in the Feiring-Bruk open-pit mine environment and hardware-based outdoor experimental trials. A video showcasing the performance of the proposed work can be found here: https://youtu.be/uWWbDfoBvFc
comment: Accepted for publication in IEEE ROBIO 2024
Signage-Aware Exploration in Open World using Venue Maps
Current exploration methods struggle to search for shops in unknown open-world environments due to a lack of prior knowledge and text recognition capabilities. Venue maps offer valuable information that can aid exploration planning by correlating scene signage with map data. However, the arbitrary shapes and styles of the text on signage, along with multi-view inconsistencies, pose significant challenges for accurate recognition by robots. Additionally, the discrepancies between real-world environments and venue maps hinder the incorporation of text information into planners. This paper introduces a novel signage-aware exploration system to address these challenges, enabling the robot to utilize venue maps effectively. We propose a signage understanding method that accurately detects and recognizes the text on signage using a diffusion-based text instance retrieval method combined with a 2D-to-3D semantic fusion strategy. Furthermore, we design a venue map-guided exploration-exploitation planner that balances exploration in unknown regions using a directional heuristic derived from venue maps with exploitation to get close and adjust orientation for better recognition. Experiments in large-scale shopping malls demonstrate our method's superior signage recognition accuracy and coverage efficiency, outperforming state-of-the-art scene text spotting methods and traditional exploration methods.
comment: 8 pages, 9 figures, 4 tables, under review
Innovative Deep Learning Techniques for Obstacle Recognition: A Comparative Study of Modern Detection Algorithms
This study explores a comprehensive approach to obstacle detection using advanced YOLO models, specifically YOLOv8, YOLOv7, YOLOv6, and YOLOv5. Leveraging deep learning techniques, the research focuses on the performance comparison of these models in real-time detection scenarios. The findings demonstrate that YOLOv8 achieves the highest accuracy with improved precision-recall metrics. Detailed training processes, algorithmic principles, and a range of experimental results are presented to validate the model's effectiveness.
The Ingredients for Robotic Diffusion Transformers
In recent years, roboticists have achieved remarkable progress in solving increasingly general tasks on dexterous robotic hardware by leveraging high-capacity Transformer network architectures and generative diffusion models. Unfortunately, combining these two orthogonal improvements has proven surprisingly difficult, since there is no clear and well-understood process for making important design choices. In this paper, we identify, study and improve key architectural design decisions for high-capacity diffusion transformer policies. The resulting models can efficiently solve diverse tasks on multiple robot embodiments, without the excruciating pain of per-setup hyper-parameter tuning. By combining the results of our investigation with our improved model components, we are able to present a novel architecture, named \method, that significantly outperforms the state of the art in solving long-horizon (1500+ time-steps) dexterous tasks on a bi-manual ALOHA robot. In addition, we find that our policies show improved scaling performance when trained on 10 hours of highly multi-modal, language-annotated ALOHA demonstration data. We hope this work will open the door for future robot learning techniques that leverage the efficiency of generative diffusion modeling with the scalability of large-scale transformer architectures. Code, robot dataset, and videos are available at: https://dit-policy.github.io
NeRF-enabled Analysis-Through-Synthesis for ISAR Imaging of Small Everyday Objects with Sparse and Noisy UWB Radar Data
Inverse Synthetic Aperture Radar (ISAR) imaging presents a formidable challenge when it comes to small everyday objects due to their limited Radar Cross-Section (RCS) and the inherent resolution constraints of radar systems. Existing ISAR reconstruction methods including backprojection (BP) often require complex setups and controlled environments, rendering them impractical for many real-world noisy scenarios. In this paper, we propose a novel Analysis-through-Synthesis (ATS) framework enabled by Neural Radiance Fields (NeRF) for high-resolution coherent ISAR imaging of small objects using sparse and noisy Ultra-Wideband (UWB) radar data with an inexpensive and portable setup. Our end-to-end framework integrates ultra-wideband radar wave propagation, reflection characteristics, and scene priors, enabling efficient 2D scene reconstruction without the need for costly anechoic chambers or complex measurement test beds. With qualitative and quantitative comparisons, we demonstrate that the proposed method outperforms traditional techniques and generates ISAR images of complex scenes with multiple targets and complex structures in Non-Line-of-Sight (NLOS) and noisy scenarios, particularly with limited number of views and sparse UWB radar scans. This work represents a significant step towards practical, cost-effective ISAR imaging of small everyday objects, with broad implications for robotics and mobile sensing applications.
Dreaming to Assist: Learning to Align with Human Objectives for Shared Control in High-Speed Racing
Tight coordination is required for effective human-robot teams in domains involving fast dynamics and tactical decisions, such as multi-car racing. In such settings, robot teammates must react to cues of a human teammate's tactical objective to assist in a way that is consistent with the objective (e.g., navigating left or right around an obstacle). To address this challenge, we present Dream2Assist, a framework that combines a rich world model able to infer human objectives and value functions, and an assistive agent that provides appropriate expert assistance to a given human teammate. Our approach builds on a recurrent state space model to explicitly infer human intents, enabling the assistive agent to select actions that align with the human and enabling a fluid teaming interaction. We demonstrate our approach in a high-speed racing domain with a population of synthetic human drivers pursuing mutually exclusive objectives, such as "stay-behind" and "overtake". We show that the combined human-robot team, when blending its actions with those of the human, outperforms the synthetic humans alone as well as several baseline assistance strategies, and that intent-conditioning enables adherence to human preferences during task execution, leading to improved performance while satisfying the human's objective.
comment: Accepted to CoRL 2024, Munich, Germany
Embodied Active Learning of Generative Sensor-Object Models
When a robot encounters a novel object, how should it respond, and what data should it collect, so that it can find the object in the future? In this work, we present a method for learning image features of an unknown number of novel objects. To do this, we use active coverage with respect to latent uncertainties of the novel descriptions. We apply ergodic stability and PAC-Bayes theory to extend statistical guarantees for VAEs to embodied agents. We demonstrate the method in hardware with a robotic arm; the pipeline is also implemented in a simulated environment. Algorithms and simulation are available open source; see http://sites.google.com/u.northwestern.edu/embodied-learning-hardware
comment: 16 pages, International Symposium of Robotics Research (ISRR) 2024
HoloSpot: Intuitive Object Manipulation via Mixed Reality Drag-and-Drop ICRA 2025
Human-robot interaction through mixed reality (MR) technologies enables novel, intuitive interfaces to control robots in remote operations. Such interfaces facilitate operations in hazardous environments, where human presence is risky, yet human oversight remains crucial. Potential environments include disaster response scenarios and areas with high radiation or toxic chemicals. In this paper we present an interface system projecting a 3D representation of a scanned room as a scaled-down 'dollhouse' hologram, allowing users to select and manipulate objects using a straightforward drag-and-drop interface. We then translate these drag-and-drop user commands into real-time robot actions based on the recent Spot-Compose framework. The Unity-based application provides an interactive tutorial and a user-friendly experience, ensuring ease of use. Through comprehensive end-to-end testing, we validate the system's capability in executing pick-and-place tasks and a complementary user study affirms the interface's intuitive controls. Our findings highlight the advantages of this interface in improving user experience and operational efficiency. This work lays the groundwork for a robust framework that advances the potential for seamless human-robot collaboration in diverse applications. Paper website: https://holospot.github.io/
comment: 6 pages, 8 figures, submitted to ICRA 2025
What Am I? Evaluating the Effect of Language Fluency and Task Competency on the Perception of a Social Robot
Recent advancements in robot capabilities have enabled them to interact with people in various human-social environments (HSEs). In many of these environments, the perception of the robot often depends on its capabilities, e.g., task competency, language fluency, etc. To enable fluent human-robot interaction (HRI) in HSEs, it is crucial to understand the impact of these capabilities on the perception of the robot. Although many works have investigated the effects of various robot capabilities on the robot's perception separately, in this paper, we present a large-scale HRI study (n = 60) to investigate the combined impact of both language fluency and task competency on the perception of a robot. The results suggest that while language fluency may play a more significant role than task competency in the perception of the verbal competency of a robot, both language fluency and task competency contribute to the perception of the intelligence and reliability of the robot. The results also indicate that task competency may play a more significant role than language fluency in the perception of meeting expectations and being a good teammate. The findings of this study highlight the relationship between language fluency and task competency in the context of social HRI and will enable the development of more intelligent robots in the future.
comment: Accepted at the IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2024
NAR-*ICP: Neural Execution of Classical ICP-based Pointcloud Registration Algorithms
This study explores the intersection of neural networks and classical robotics algorithms through the Neural Algorithmic Reasoning (NAR) framework, which allows neural networks to be trained to reason like classical robotics algorithms by learning to execute them. Algorithms are integral to robotics and safety-critical applications due to their predictable and consistent performance through logical and mathematical principles. In contrast, while neural networks are highly adaptable, handling complex, high-dimensional data and generalising across tasks, they often lack interpretability and transparency in their internal computations. We propose a Graph Neural Network (GNN)-based learning framework, NAR-*ICP, which learns the intermediate algorithmic steps of classical ICP-based pointcloud registration algorithms, and extend the CLRS Algorithmic Reasoning Benchmark with classical robotics perception algorithms. We evaluate our approach across diverse datasets, from real-world to synthetic, demonstrating its flexibility in handling complex and noisy inputs, along with its potential to be used as part of a larger learning system. Our results indicate that our method achieves superior performance across all benchmarks and datasets, consistently surpassing even the algorithms it has been trained on, further demonstrating its ability to generalise beyond the capabilities of traditional algorithms.
comment: 17 pages, 9 figures
GSRM: Building Roadmaps for Query-Efficient and Near-Optimal Path Planning Using a Reaction Diffusion System IROS 2024
Mobile robots frequently navigate on roadmaps, i.e., graphs where edges represent safe motions, in applications such as healthcare, hospitality, and warehouse automation. Often the environment is quasi-static, i.e., it is sufficient to construct a roadmap once and then use it for any future planning queries. Roadmaps are typically used with graph search algorithms to find feasible paths for the robots. Therefore, the roadmap should be well-connected, and graph searches should produce near-optimal solutions with short solution paths while also being computationally efficient, so that queries execute quickly. We propose a new method to construct roadmaps based on the Gray-Scott reaction diffusion system and Delaunay triangulation. Our approach, GSRM, produces roadmaps with evenly distributed vertices and edges that are well-connected even in environments with challenging narrow passages. Empirically, we compare to classical roadmaps generated by 8-connected grids, probabilistic roadmaps (PRM, SPARS2), and optimized roadmap graphs (ORM). Our results show that GSRM consistently produces superior roadmaps that are well-connected, have high query efficiency, and result in short solution paths.
comment: Presented at IROS 2024
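As background on the Gray-Scott reaction diffusion system mentioned in the abstract, here is a minimal sketch of one explicit Euler update of the standard two-species model on a periodic grid. The parameter values (`du`, `dv`, `feed`, `kill`, `dt`) are illustrative assumptions, not the settings used by GSRM, and the step that turns the resulting pattern into roadmap vertices is not shown.

```python
def laplacian(grid, i, j):
    """Five-point Laplacian with periodic (wrap-around) boundaries."""
    n, m = len(grid), len(grid[0])
    return (grid[(i - 1) % n][j] + grid[(i + 1) % n][j]
            + grid[i][(j - 1) % m] + grid[i][(j + 1) % m]
            - 4.0 * grid[i][j])

def gray_scott_step(u, v, du=0.16, dv=0.08, feed=0.035, kill=0.065, dt=1.0):
    """One Euler step of the Gray-Scott equations:
    du/dt = Du * lap(u) - u*v^2 + F*(1 - u)
    dv/dt = Dv * lap(v) + u*v^2 - (F + k)*v
    """
    n, m = len(u), len(u[0])
    new_u = [[0.0] * m for _ in range(n)]
    new_v = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            uvv = u[i][j] * v[i][j] ** 2  # reaction term u * v^2
            new_u[i][j] = u[i][j] + dt * (
                du * laplacian(u, i, j) - uvv + feed * (1.0 - u[i][j]))
            new_v[i][j] = v[i][j] + dt * (
                dv * laplacian(v, i, j) + uvv - (feed + kill) * v[i][j])
    return new_u, new_v
```

Starting from u = 1, v = 0 everywhere with a small seeded patch of v, iterating this update can produce evenly spaced spot-like patterns for suitable parameters, which is the property that makes such systems attractive for distributing points evenly.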
Safety-critical Motion Planning for Collaborative Legged Loco-Manipulation over Discrete Terrain
As legged robots are deployed in industrial and autonomous construction tasks requiring collaborative manipulation, they must handle object manipulation while maintaining stable locomotion. The challenge intensifies in real-world environments, where they should traverse discrete terrain, avoid obstacles, and coordinate with other robots for safe loco-manipulation. This work addresses safe motion planning for collaborative manipulation of an unknown payload on discrete terrain while avoiding obstacles. Our approach uses two sets of model predictive controllers (MPCs) as motion planners: a global MPC generates a safe trajectory for the team with obstacle avoidance, while decentralized MPCs for each robot ensure safe footholds on discrete terrain as they follow the global trajectory. A model reference adaptive whole-body controller (MRA-WBC) then tracks the desired path, compensating for model uncertainties from the unknown payload. We validated our method in simulation and hardware on a team of Unitree robots. The results demonstrate that our approach successfully guides the team through obstacle courses that require planar positioning and height adjustments, all on discrete terrain such as stepping stones.
Intramuscular High-Density Micro-Electrode Arrays Enable High-Precision Decoding and Mapping of Spinal Motor Neurons to Reveal Hand Control
Decoding nervous system activity is a key challenge in neuroscience and neural interfacing. In this study, we propose a novel neural decoding system that enables unprecedented large-scale sampling of muscle activity. Using micro-electrode arrays with more than 100 channels embedded within the forearm muscles, we recorded high-density signals that captured multi-unit motor neuron activity. This extensive sampling was complemented by advanced methods for neural decomposition, analysis, and classification, allowing us to accurately detect and interpret the spiking activity of spinal motor neurons that innervate hand muscles. We evaluated this system in two healthy participants, each implanted with three electromyogram (EMG) micro-electrode arrays (comprising 40 electrodes each) in the forearm. These arrays recorded muscle activity during both single- and multi-digit isometric contractions. For the first time under controlled conditions, we demonstrate that multi-digit tasks elicit unique patterns of motor neuron recruitment specific to each task, rather than employing combinations of recruitment patterns from single-digit tasks. This observation led us to hypothesize that hand tasks could be classified with high precision based on the decoded neural activity. We achieved perfect classification accuracy (100%) across 12 distinct single- and multi-digit tasks, and consistently high accuracy (>96%) across all conditions and subjects, for up to 16 task classes. These results significantly outperformed conventional EMG classification methods. The exceptional performance of this system paves the way for developing advanced neural interfaces based on invasive high-density EMG technology. This innovation could greatly enhance human-computer interaction and lead to substantial improvements in assistive technologies, offering new possibilities for restoring motor function in clinical applications.
Incorporating Task Progress Knowledge for Subgoal Generation in Robotic Manipulation through Image Edits
Understanding the progress of a task allows humans to not only track what has been done but also to better plan for future goals. We demonstrate TaKSIE, a novel framework that incorporates task progress knowledge into visual subgoal generation for robotic manipulation tasks. We jointly train a recurrent network with a latent diffusion model to generate the next visual subgoal based on the robot's current observation and the input language command. At execution time, the robot leverages a visual progress representation to monitor the task progress and adaptively samples the next visual subgoal from the model to guide the manipulation policy. We train and validate our model in simulated and real-world robotic tasks, achieving state-of-the-art performance on the CALVIN manipulation benchmark. We find that the inclusion of task progress knowledge can improve the robustness of the trained policy to different initial robot poses and to various movement speeds during demonstrations. The project website can be found at https://live-robotics-uva.github.io/TaKSIE/ .
comment: 11 pages, 9 figures
V2I-Calib++: A Multi-terminal Spatial Calibration Approach in Urban Intersections for Collaborative Perception
Urban intersections, dense with pedestrian and vehicular traffic and compounded by GPS signal obstructions from high-rise buildings, are among the most challenging areas in urban traffic systems. Traditional single-vehicle intelligence systems often perform poorly in such environments due to a lack of global traffic flow information and the ability to respond to unexpected events. Vehicle-to-Everything (V2X) technology, through real-time communication between vehicles (V2V) and vehicles to infrastructure (V2I), offers a robust solution. However, practical applications still face numerous challenges. Calibration among heterogeneous vehicle and infrastructure endpoints in multi-end LiDAR systems is crucial for ensuring the accuracy and consistency of perception system data. Most existing multi-end calibration methods rely on initial calibration values provided by positioning systems, but the instability of GPS signals due to high buildings in urban canyons poses severe challenges to these methods. To address this issue, this paper proposes a novel multi-end LiDAR system calibration method that does not require positioning priors to determine initial external parameters and meets real-time requirements. Our method introduces an innovative multi-end perception object association technique, utilizing a new Overall Distance metric (oDist) to measure the spatial association between perception objects, and effectively combines global consistency search algorithms with optimal transport theory. By this means, we can extract co-observed targets from object association results for further external parameter computation and optimization. Extensive comparative and ablation experiments conducted on the simulated dataset V2X-Sim and the real dataset DAIR-V2X confirm the effectiveness and efficiency of our method. The code for this method can be accessed at: https://github.com/MassimoQu/v2i-calib.
Learning Quadruped Locomotion Using Differentiable Simulation
This work explores the potential of using differentiable simulation for learning quadruped locomotion. Differentiable simulation promises fast convergence and stable training by computing low-variance first-order gradients using robot dynamics. However, its usage for legged robots is still limited to simulation. The main challenge lies in the complex optimization landscape of robotic tasks due to discontinuous dynamics. This work proposes a new differentiable simulation framework to overcome these challenges. Our approach combines a high-fidelity, non-differentiable simulator for forward dynamics with a simplified surrogate model for gradient backpropagation. This approach maintains simulation accuracy by aligning the robot states from the surrogate model with those of the precise, non-differentiable simulator. Our framework enables learning quadruped walking in simulation in minutes without parallelization. When augmented with GPU parallelization, our approach allows the quadruped robot to master diverse locomotion skills on challenging terrains in minutes. We demonstrate that differentiable simulation outperforms a reinforcement learning algorithm (PPO) by achieving significantly better sample efficiency while maintaining its effectiveness in handling large-scale environments. Our method represents one of the first successful applications of differentiable simulation to real-world quadruped locomotion, offering a compelling alternative to traditional RL methods.
comment: 8th Annual Conference on Robot Learning (CoRL)
Constrained Trajectory Optimization on Matrix Lie Groups via Lie-Algebraic Differential Dynamic Programming
Matrix Lie groups are an important class of manifolds commonly used in control and robotics, and optimizing control policies on these manifolds is a fundamental problem. In this work, we propose a novel computationally efficient approach for trajectory optimization on matrix Lie groups using an augmented Lagrangian-based constrained discrete Differential Dynamic Programming (DDP). The method involves lifting the optimization problem to the Lie algebra during the backward pass and retracting back to the manifold during the forward pass. Unlike previous approaches that addressed constraint handling only for specific classes of matrix Lie groups, the proposed method provides a general solution for nonlinear constraint handling across generic matrix Lie groups. We evaluate the effectiveness of the proposed DDP method in handling constraints within a mechanical system characterized by rigid body dynamics in SE(3), assessing its computational efficiency compared to existing direct optimization solvers. Additionally, the method demonstrates robustness under external disturbances when applied as a Lie-algebraic feedback control policy on SE(3), and in optimizing a quadrotor's trajectory in a challenging realistic scenario. Experiments show that the proposed approach effectively manages general constraints defined on configuration, velocity, and inputs during optimization, while also maintaining stability under external disturbances when executing the resultant control policy in closed-loop.
comment: 12 pages, 6 figures
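To illustrate what "lifting to the Lie algebra" and "retracting back to the manifold" mean in the simplest setting, the sketch below implements the exponential map on SO(3) via Rodrigues' formula and a retraction that applies a Lie-algebra increment to a rotation. This is a generic illustration, not the paper's DDP algorithm: it uses SO(3) rather than SE(3) for brevity, and the helper names and the right-multiplication convention are assumptions.

```python
import math

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix (an element of so(3))."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_so3(w):
    """Rodrigues' formula: exp(hat(w)) = I + a*K + b*K^2."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-8:
        a, b = 1.0, 0.5  # small-angle limits of sin(t)/t and (1 - cos(t))/t^2
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def retract(R, xi):
    """Apply a Lie-algebra increment xi to rotation R (assumed convention: R * exp(xi))."""
    return matmul(R, exp_so3(xi))
```

An optimizer built this way can compute its update `xi` as an ordinary vector in R^3 and then call `retract`, so the iterate stays a valid rotation without any re-normalization step.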
Towards Generalist Robot Learning from Internet Video: A Survey
Scaling deep learning to huge internet-scraped datasets has yielded remarkably general capabilities in natural language processing and visual understanding and generation. In contrast, data is scarce and expensive to collect in robotics. This has seen robot learning struggle to match the generality of capabilities observed in other domains. Learning from Videos (LfV) methods seek to address this data bottleneck by augmenting traditional robot data with large internet-scraped video datasets. Such video data may provide the model with foundational information regarding physical behaviours and the physics of the world. This holds great promise for improving the generality of our robots. In this survey, we present an overview of the emerging field of LfV. We outline fundamental concepts, including the benefits and challenges of LfV. We provide a comprehensive review of current methods for: extracting knowledge from large-scale internet video; tackling key LfV challenges; and boosting downstream reinforcement and robot learning via the use of video data. LfV datasets and benchmarks are also reviewed. The survey closes with a critical discussion of challenges and opportunities. Here, we advocate for scalable foundation model approaches that can leverage the full range of available internet video to aid the learning of robot policies and dynamics models. We hope this survey can inform and catalyse further LfV research, facilitating progress towards the development of general-purpose robots.
comment: Refactored paper structure, significantly reduced paper length, rewritten abstract and introduction. Other minor improvements
Scaling Manipulation Learning with Visual Kinematic Chain Prediction
Learning general-purpose models from diverse datasets has achieved great success in machine learning. In robotics, however, existing methods in multi-task learning are typically constrained to a single robot and workspace, while recent work such as RT-X requires a non-trivial action normalization procedure to manually bridge the gap between different action spaces in diverse environments. In this paper, we propose the visual kinematic chain as a precise and universal representation of quasi-static actions for robot learning over diverse environments, which requires no manual adjustment since the visual kinematic chains can be automatically obtained from the robot's model and camera parameters. We propose the Visual Kinematics Transformer (VKT), a convolution-free architecture that supports an arbitrary number of camera viewpoints, and that is trained with a single objective of forecasting kinematic structures through optimal point-set matching. We demonstrate the superior performance of VKT over BC transformers as a general agent on Calvin, RLBench, Open-X, and real robot manipulation tasks. Video demonstrations can be found at https://mlzxy.github.io/visual-kinetic-chain.
comment: CoRL 2024
SegGrasp: Zero-Shot Task-Oriented Grasping via Semantic and Geometric Guided Segmentation
Task-oriented grasping, which involves grasping specific parts of objects based on their functions, is crucial for developing advanced robotic systems capable of performing complex tasks in dynamic environments. In this paper, we propose a training-free framework that incorporates both semantic and geometric priors for zero-shot task-oriented grasp generation. The proposed framework, SegGrasp, first leverages vision-language models such as GLIP for coarse segmentation. It then uses detailed geometric information from convex decomposition to improve segmentation quality through a fusion policy named GeoFusion. With the improved segmentation, a grasping network can then generate an effective grasp pose. We conducted experiments on both a segmentation benchmark and real-world robot grasping. The experimental results show that SegGrasp surpasses the baseline by more than 15% in grasp and segmentation performance.
comment: 7 pages, 6 figures
Diffusion-based learning of contact plans for agile locomotion
Legged robots have become capable of performing highly dynamic maneuvers in the past few years. However, agile locomotion in highly constrained environments such as stepping stones is still a challenge. In this paper, we propose a combination of model-based control, search, and learning to design efficient control policies for agile locomotion on stepping stones. In our framework, we use nonlinear model predictive control (NMPC) to generate whole-body motions for a given contact plan. To efficiently search for an optimal contact plan, we propose to use Monte Carlo tree search (MCTS). While the combination of MCTS and NMPC can quickly find a feasible plan for a given environment (within a few seconds), it is not yet suitable to be used as a reactive policy. Hence, we generate a dataset for an optimal goal-conditioned policy for a given scene and learn it through supervised learning. In particular, we leverage the power of diffusion models in handling multi-modality in the dataset. We test our proposed framework on a scenario where our quadruped robot Solo12 successfully jumps to different goals in a highly constrained environment.
DCNet: A Data-Driven Framework for DVL Calibration
Autonomous underwater vehicles (AUVs) are underwater robotic platforms used in a variety of applications. An AUV's navigation solution relies heavily on the fusion of inertial sensors and Doppler velocity logs (DVL), where the latter delivers accurate velocity updates. To ensure accurate navigation, a DVL calibration is undertaken before the mission begins to estimate its error terms. During calibration, the AUV follows a complex trajectory and employs nonlinear estimation filters to estimate error terms. In this paper, we introduce DCNet, a data-driven framework that utilizes a two-dimensional convolution kernel in an innovative way. Using DCNet and our proposed DVL error model, we offer a rapid calibration procedure. This can be applied to a trajectory with a nearly constant velocity. To train and test our proposed approach, a 276-minute dataset of real recorded DVL measurements was used. We demonstrated an average improvement of 70% in accuracy and 80% improvement in calibration time, compared to the baseline approach, with a low-performance DVL. As a result of those improvements, an AUV employing a low-cost DVL can achieve higher accuracy, shorter calibration time, and apply a simple nearly constant velocity calibration trajectory. Our results also open up new applications for marine robotics utilizing low-cost, high-accuracy DVLs.
comment: 10 Pages, 9 Figures, 5 Tables
Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective
Reinforcement Learning (RL) has recently achieved remarkable success in robotic control. However, most works in RL operate in simulated environments where privileged knowledge (e.g., dynamics, surroundings, terrains) is readily available. Conversely, in real-world scenarios, robot agents usually rely solely on local states (e.g., proprioceptive feedback of robot joints) to select actions, leading to a significant sim-to-real gap. Existing methods address this gap by either gradually reducing the reliance on privileged knowledge or performing a two-stage policy imitation. However, we argue that these methods are limited in their ability to fully leverage the available privileged knowledge, resulting in suboptimal performance. In this paper, we formulate the sim-to-real gap as an information bottleneck problem and therefore propose a novel privileged knowledge distillation method called the Historical Information Bottleneck (HIB). In particular, HIB learns a privileged knowledge representation from historical trajectories by capturing the underlying changeable dynamic information. Theoretical analysis shows that the learned privileged knowledge representation helps reduce the value discrepancy between the oracle and learned policies. Empirical experiments on both simulated and real-world tasks demonstrate that HIB yields improved generalizability compared to previous methods. Videos of real-world experiments are available at https://sites.google.com/view/history-ib .
comment: Accepted by CoRL 2024
TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction
Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots. The key challenge of this approach is to address simulation-to-reality (sim-to-real) gaps. Previous methods often require domain-specific knowledge a priori. We argue that a straightforward way to obtain such knowledge is by asking humans to observe and assist robot policy execution in the real world. The robots can then learn from humans to close various sim-to-real gaps. We propose TRANSIC, a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework. TRANSIC allows humans to augment simulation policies to overcome various unmodeled sim-to-real gaps holistically through intervention and online correction. Residual policies can be learned from human corrections and integrated with simulation policies for autonomous execution. We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly. Through synergistic integration of policies learned in simulation and from humans, TRANSIC is effective as a holistic approach to addressing various, often coexisting sim-to-real gaps. It displays attractive properties such as scaling with human effort. Videos and code are available at https://transic-robot.github.io/
comment: 8th Conference on Robot Learning (CoRL 2024), Munich, Germany. Project website: https://transic-robot.github.io/
Twisting Lids Off with Two Hands
Manipulating objects with two multi-fingered hands has been a long-standing challenge in robotics, due to the contact-rich nature of many manipulation tasks and the complexity inherent in coordinating a high-dimensional bimanual system. In this work, we share novel insights into physical modeling, real-time perception, and reward design that enable policies trained in simulation using deep reinforcement learning (RL) to be effectively and efficiently transferred to the real world. Specifically, we consider the problem of twisting lids of various bottle-like objects with two hands, demonstrating policies with generalization capabilities across a diverse set of unseen objects as well as dynamic and dexterous behaviors. To the best of our knowledge, this is the first sim-to-real RL system that enables such capabilities on bimanual multi-fingered hands.
comment: Project page can be found at https://toruowo.github.io/bimanual-twist
Fusion-Driven Tree Reconstruction and Fruit Localization: Advancing Precision in Agriculture IROS
Fruit distribution is pivotal in shaping the future of both agriculture and agricultural robotics, paving the way for a streamlined supply chain. This study introduces an innovative methodology that harnesses the synergy of RGB imagery, LiDAR, and IMU data, to achieve intricate tree reconstructions and the pinpoint localization of fruits. Such integration not only offers insights into the fruit distribution, which enhances the precision of guidance for agricultural robotics and automation systems, but also sets the stage for simulating synthetic fruit patterns across varied tree architectures. To validate this approach, experiments have been carried out in both a controlled environment and an actual peach orchard. The results underscore the robustness and efficacy of this fusion-driven methodology, highlighting its potential as a transformative tool for future agricultural robotics and precision farming.
comment: This work was presented at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) Workshop
Learning Granular Media Avalanche Behavior for Indirectly Manipulating Obstacles on a Granular Slope
Legged robot locomotion on sand slopes is challenging due to the complex dynamics of granular media and how the lack of solid surfaces can hinder locomotion. A promising strategy, inspired by ghost crabs and other organisms in nature, is to strategically interact with rocks, debris, and other obstacles to facilitate movement. To provide legged robots with this ability, we present a novel approach that leverages avalanche dynamics to indirectly manipulate objects on a granular slope. We use a Vision Transformer (ViT) to process image representations of granular dynamics and robot excavation actions. The ViT predicts object movement, which we use to determine which leg excavation action to execute. We collect training data from 100 real physical trials and, at test time, deploy our trained model in novel settings. Experimental results suggest that our model can accurately predict object movements and achieve a success rate of at least 80% in a variety of manipulation tasks with up to four obstacles, and can also generalize to objects with different physical properties. To our knowledge, this is the first paper to leverage granular media avalanche dynamics to indirectly manipulate objects on granular slopes. Supplementary material is available at https://sites.google.com/view/grain-corl2024/home.
comment: Accepted to CoRL 2024
Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation
Vision-and-language navigation (VLN) enables an agent to navigate to a remote location in 3D environments by following a natural language instruction. In this field, agents are usually trained and evaluated in navigation simulators, and effective approaches for sim-to-real transfer are lacking. VLN agents with only a monocular camera exhibit extremely limited performance, while mainstream VLN models trained with panoramic observations perform better but are difficult to deploy on most monocular robots. To address this, we propose a sim-to-real transfer approach that endows monocular robots with panoramic traversability perception and panoramic semantic understanding, thus smoothly transferring high-performance panoramic VLN models to common monocular robots. In this work, the semantic traversable map is proposed to predict agent-centric navigable waypoints, and the novel view representations of these navigable waypoints are predicted through the 3D feature fields. These methods broaden the limited field of view of the monocular robots and significantly improve navigation performance in the real world. Our VLN system outperforms previous SOTA monocular VLN methods in R2R-CE and RxR-CE benchmarks within the simulation environments and is also validated in real-world environments, providing a practical and high-performance solution for real-world VLN.
comment: Accepted by CoRL 2024. The code is available at https://github.com/MrZihan/Sim2Real-VLN-3DFF
On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability
Recent advancements in Large Language Models (LLMs) have showcased their ability to perform complex reasoning tasks, but their effectiveness in planning remains underexplored. In this study, we evaluate the planning capabilities of OpenAI's o1 models across a variety of benchmark tasks, focusing on three key aspects: feasibility, optimality, and generalizability. Through empirical evaluations on constraint-heavy tasks (e.g., Barman, Tyreworld) and spatially complex environments (e.g., Termes, Floortile), we highlight o1-preview's strengths in self-evaluation and constraint-following, while also identifying bottlenecks in decision-making and memory management, particularly in tasks requiring robust spatial reasoning. Our results reveal that o1-preview outperforms GPT-4 in adhering to task constraints and managing state transitions in structured environments. However, the model often generates suboptimal solutions with redundant actions and struggles to generalize effectively in spatially complex tasks. This pilot study provides foundational insights into the planning limitations of LLMs, offering key directions for future research on improving memory management, decision-making, and generalization in LLM-based planning. Code available at https://github.com/VITA-Group/o1-planning.
comment: Code available at https://github.com/VITA-Group/o1-planning
E2H: A Two-Stage Non-Invasive Neural Signal Driven Humanoid Robotic Whole-Body Control Framework
Recent advancements in humanoid robotics, including the integration of hierarchical reinforcement learning-based control and the utilization of LLM planning, have significantly enhanced the ability of robots to perform complex tasks. In contrast to the highly developed humanoid robots, the human factors involved remain relatively unexplored. Directly controlling humanoid robots with the brain has already appeared in many works of science fiction, such as Pacific Rim and Gundam. In this work, we present E2H (EEG-to-Humanoid), an innovative framework that pioneers the control of humanoid robots using high-frequency non-invasive neural signals. As the quality of non-invasive signals remains too low for decoding precise spatial trajectories, we decompose the E2H framework into two stages: 1) decoding neural signals (EEG) into semantic motion keywords, 2) utilizing LLM-facilitated motion generation with a precise motion imitation control policy to realize humanoid robotics control. The method of directly driving robots with brainwave commands offers a novel approach to human-machine collaboration, especially in situations where verbal commands are impractical, such as in cases of speech impairments, space exploration, or underwater exploration, unlocking significant potential. E2H offers an exciting glimpse into the future, holding immense potential for human-computer interaction.
Object Importance Estimation using Counterfactual Reasoning for Intelligent Driving
The ability to identify important objects in a complex and dynamic driving environment is essential for autonomous driving agents to make safe and efficient driving decisions. It also helps assistive driving systems decide when to alert drivers. We tackle object importance estimation in a data-driven fashion and introduce HOIST - Human-annotated Object Importance in Simulated Traffic. HOIST contains driving scenarios with human-annotated importance labels for vehicles and pedestrians. We additionally propose a novel approach that relies on counterfactual reasoning to estimate an object's importance. We generate counterfactual scenarios by modifying the motion of objects and ascribe importance based on how the modifications affect the ego vehicle's driving. Our approach outperforms strong baselines for the task of object importance estimation on HOIST. We also perform ablation studies to justify our design choices and show the significance of the different components of our proposed approach.
CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning
Evaluating autonomous vehicle stacks (AVs) in simulation typically involves replaying driving logs from real-world recorded traffic. However, agents replayed from offline data are not reactive and are hard to control intuitively. Existing approaches address these challenges by proposing methods that rely on heuristics or generative models of real-world data, but these approaches either lack realism or necessitate costly iterative sampling procedures to control the generated behaviours. In this work, we take an alternative approach and propose CtRL-Sim, a method that leverages return-conditioned offline reinforcement learning (RL) to efficiently generate reactive and controllable traffic agents. Specifically, we process real-world driving data through a physics-enhanced Nocturne simulator to generate a diverse offline RL dataset, annotated with various rewards. With this dataset, we train a return-conditioned multi-agent behaviour model that allows for fine-grained manipulation of agent behaviours by modifying the desired returns for the various reward components. This capability enables the generation of a wide range of driving behaviours beyond the scope of the initial dataset, including adversarial behaviours. We show that CtRL-Sim can generate realistic safety-critical scenarios while providing fine-grained control over agent behaviours.
comment: CoRL 2024
ShieldNN: A Provably Safe NN Filter for Unsafe NN Controllers
In this paper, we develop a novel closed-form Control Barrier Function (CBF) and associated controller shield for the Kinematic Bicycle Model (KBM) with respect to obstacle avoidance. The proposed CBF and shield -- designed by an algorithm we call ShieldNN -- provide two crucial advantages over existing methodologies. First, ShieldNN considers steering and velocity constraints directly with the non-affine KBM dynamics; this is in contrast to more general methods, which typically consider only affine dynamics and do not guarantee invariance properties under control constraints. Second, ShieldNN provides a closed-form set of safe controls for each state, unlike more general methods, which typically rely on optimization algorithms to generate a single instantaneous control for each state. Together, these advantages make ShieldNN uniquely suited as an efficient Multi-Obstacle Safe Actions (i.e. multiple-barrier-function shielding) during training time of a Reinforcement Learning (RL) enabled NN controller. We show via experiments that ShieldNN dramatically increases the completion rate of RL training episodes in the presence of multiple obstacles, thus establishing the value of ShieldNN in training RL-based controllers.
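For context on the Kinematic Bicycle Model referenced in the ShieldNN abstract, the following is a minimal sketch of one Euler integration step of the commonly used form of the model. It is not ShieldNN or its barrier function; the function name `kbm_step`, the axle distances `lf` and `lr`, and the time step `dt` are illustrative assumptions.

```python
import math

def kbm_step(state, accel, steer, lf=1.2, lr=1.4, dt=0.05):
    """Advance (x, y, psi, v) by one Euler step of the kinematic bicycle model.

    lf, lr: assumed distances from the centre of mass to the front and rear axles.
    """
    x, y, psi, v = state
    # Slip angle at the centre of mass; it depends on the steering input
    # through atan(tan(.)), which is why the model is non-affine in control.
    beta = math.atan(lr / (lf + lr) * math.tan(steer))
    x += dt * v * math.cos(psi + beta)
    y += dt * v * math.sin(psi + beta)
    psi += dt * v / lr * math.sin(beta)
    v += dt * accel
    return (x, y, psi, v)
```

A shield of the kind described would sit between the learned controller and a model like this one, replacing any proposed `(accel, steer)` pair that falls outside the safe set for the current state.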
Multiagent Systems
STACKFEED: Structured Textual Actor-Critic Knowledge Base Editing with FeedBack
Large Language Models (LLMs) often generate incorrect or outdated information, especially in low-resource settings or when dealing with private data. To address this, Retrieval-Augmented Generation (RAG) uses external knowledge bases (KBs), but these can also suffer from inaccuracies. We introduce STACKFEED, a novel Structured Textual Actor-Critic Knowledge base editing with FEEDback approach that iteratively refines the KB based on expert feedback using a multi-actor, centralized critic reinforcement learning framework. Each document is assigned to an actor, modeled as a ReACT agent, which performs structured edits based on document-specific targeted instructions from a centralized critic. Experimental results show that STACKFEED significantly improves KB quality and RAG system performance, enhancing accuracy by up to 8% over baselines.
Content Caching-Assisted Vehicular Edge Computing Using Multi-Agent Graph Attention Reinforcement Learning
In order to avoid repeated task offloading and realize the reuse of popular task computing results, we construct a novel content caching-assisted vehicular edge computing (VEC) framework. In the face of irregular network topology and unknown environmental dynamics, we further propose a multi-agent graph attention reinforcement learning (MGARL) based edge caching scheme, which utilizes the graph attention convolution kernel to integrate the neighboring nodes' features of each agent and further enhance the cooperation among agents. Our simulation results show that our proposed scheme is capable of improving the utilization of caching resources while reducing the long-term task computing latency compared to the baselines.
comment: 6 pages, 5 figures
Tax Credits and Household Behavior: The Roles of Myopic Decision-Making and Liquidity in a Simulated Economy
There has been a growing interest in multi-agent simulators in the domain of economic modeling. However, contemporary research often involves developing reinforcement learning (RL) based models that focus solely on a single type of agent, such as households, firms, or the government. Such an approach overlooks the adaptation of interacting agents, thereby failing to capture the complexity of real-world economic systems. In this work, we consider a multi-agent simulator comprising RL agents of numerous types, including heterogeneous households, a firm, a central bank, and a government. In particular, we focus on the crucial role of the government in distributing tax credits to households. We conduct two broad categories of comprehensive experiments dealing with the impact of tax credits on 1) households with varied degrees of myopia (short-sightedness in spending and saving decisions), and 2) households with diverse liquidity profiles. The first category of experiments examines the impact of the frequency of tax credits (e.g. annual vs quarterly) on consumption patterns of myopic households. The second category of experiments focuses on the impact of varying tax credit distribution strategies on households with differing liquidities. We validate our simulation model by reproducing trends observed in real households upon receipt of unforeseen, uniform tax credits, as documented in a JPMorgan Chase report. Based on the results of the latter, we propose an innovative tax credit distribution strategy for the government to reduce inequality among households. We demonstrate the efficacy of this strategy in improving social welfare in our simulation results.
Compressed Federated Reinforcement Learning with a Generative Model ECML-PKDD 2024
Reinforcement learning has recently gained unprecedented popularity, yet it still grapples with sample inefficiency. Addressing this challenge, federated reinforcement learning (FedRL) has emerged, wherein agents collaboratively learn a single policy by aggregating local estimations. However, this aggregation step incurs significant communication costs. In this paper, we propose CompFedRL, a communication-efficient FedRL approach incorporating both periodic aggregation and (direct/error-feedback) compression mechanisms. Specifically, we consider compressed federated $Q$-learning with a generative model setup, where a central server learns an optimal $Q$-function by periodically aggregating compressed $Q$-estimates from local agents. For the first time, we characterize the impact of these two mechanisms (which have remained elusive) by providing a finite-time analysis of our algorithm, demonstrating strong convergence behaviors when utilizing either direct or error-feedback compression. Our bounds indicate improved solution accuracy with respect to the number of agents and other federated hyperparameters while simultaneously reducing communication costs. To corroborate our theory, we also conduct in-depth numerical experiments to verify our findings, considering Top-$K$ and Sparsified-$K$ sparsification operators.
comment: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024)
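As an illustration of the compression step mentioned in the abstract above, the following is a minimal sketch of Top-$K$ sparsification combined with error feedback, in which the part of a local estimate that was dropped by the compressor is carried over to the next round. It is not the authors' CompFedRL implementation; the function names `top_k` and `compress_with_error_feedback` are hypothetical, and estimates are represented as plain Python lists for simplicity.

```python
def top_k(vector, k):
    """Keep the k largest-magnitude entries of vector and zero out the rest."""
    if k <= 0:
        return [0.0] * len(vector)
    # Indices sorted by decreasing absolute value; ties keep their original order.
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]


def compress_with_error_feedback(update, residual, k):
    """Compress (update + residual) with Top-K.

    Returns the sparse message to transmit and the new residual, i.e. the
    part of the corrected update that was not transmitted this round.
    (Hypothetical helper for illustration only.)
    """
    corrected = [u + r for u, r in zip(update, residual)]
    message = top_k(corrected, k)
    new_residual = [c - m for c, m in zip(corrected, message)]
    return message, new_residual
```

With direct compression the residual would simply be discarded; with error feedback it is added back before the next compression, so information dropped in one round can still be sent later.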
Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study AAMAS-2024
Norms are an important component of the social fabric of society by prescribing expected behaviour. In Multi-Agent Systems (MAS), agents interacting within a society are equipped to possess social capabilities such as reasoning about norms and trust. Norms have long been of interest within the Normative Multi-Agent Systems community with researchers studying topics such as norm emergence, norm violation detection and sanctioning. However, these studies have some limitations: they are often limited to simple domains, norms have been represented using a variety of representations with no standard approach emerging, and the symbolic reasoning mechanisms generally used may suffer from a lack of extensibility and robustness. In contrast, Large Language Models (LLMs) offer opportunities to discover and reason about norms across a large range of social situations. This paper evaluates the capability of LLMs to detect norm violations. Based on simulated data from 80 stories in a household context, with varying complexities, we investigated whether 10 norms are violated. For our evaluations we first obtained the ground truth from three human evaluators for each story. Then, the majority result was compared against the results from three well-known LLM models (Llama 2 7B, Mixtral 7B and ChatGPT-4). Our results show the promise of ChatGPT-4 for detecting norm violations, with Mixtral some distance behind. Also, we identify areas where these models perform poorly and discuss implications for future work.
comment: To appear in COINE@AAMAS-2024 Springer LNCS post-proceedings
Systems and Control (CS)
A System Parameterization for Direct Data-Driven Estimator Synthesis
This paper introduces a novel parameterization to characterize unknown linear time-invariant systems using noisy data. The presented parameterization describes exactly the set of all systems consistent with the available data. We then derive verifiable conditions under which the consistency constraint reduces the set to the true system and under which it has no impact. Furthermore, we demonstrate how to use this parameterization to perform a direct data-driven estimator synthesis with guarantees on the $H_{\infty}$-norm. Lastly, we conduct numerical experiments to compare our approach to existing methods.
comment: This work has been submitted to the American Control Conference 2025
Mindalogue: LLM-Powered Nonlinear Interaction for Effective Learning and Task Exploration
Current generative AI models like ChatGPT, Claude, and Gemini are widely used for knowledge dissemination, task decomposition, and creative thinking. However, their linear interaction methods often force users to repeatedly compare and copy contextual information when handling complex tasks, increasing cognitive load and operational costs. Moreover, the ambiguity in model responses requires users to refine and simplify the information further. To address these issues, we developed "Mindalogue", a system using a non-linear interaction model based on "nodes + canvas" to enhance user efficiency and freedom while generating structured responses. A formative study with 11 users informed the design of Mindalogue, which was then evaluated through a study with 16 participants. The results showed that Mindalogue significantly reduced task steps and improved users' comprehension of complex information. This study highlights the potential of non-linear interaction in improving AI tool efficiency and user experience in the HCI field.
comment: 17 pages, 9 figures. Submitted to CHI 2025
Reflexive Input-Output Causality Mechanisms
This paper explores the concept of reflexive actuation, examining how robots may leverage both internal and external stimuli to trigger changes in the motion, performance, or physical characteristics of the robot, such as its size, shape, or configuration. These changes themselves may in turn be sequentially re-used as input to drive further adaptations. Drawing inspiration from biological systems, where reflexes are an essential component of the response to environmental changes, reflexive actuation is critical to enable robots to adapt to diverse situations and perform complex tasks. The underlying principles of reflexive actuation are analyzed, with examples provided from existing implementations such as contact-sensitive reflexive arms, physical counters, and their applications. The paper also outlines future directions and challenges for advancing this research area, emphasizing its significance in the development of adaptive, responsive robotic systems.
comment: 9 pages, 5 figures
Consensus in Multiagent Systems with lack of connection
We consider multi-agent systems with cooperative interactions and study the convergence to consensus in the case of time-dependent lack of interaction. We prove a new condition ensuring consensus: we define a graph in which directed arrows correspond to connection functions that converge (in the weak sense) to some function with a positive integral on all intervals of the form $[t,+\infty)$. If the graph has a vertex reachable from all other indices, then the system converges to consensus. We show that this requirement generalizes some known sufficient conditions for convergence, such as the Persistent Excitation one. We also give a second new condition, transversal to the known ones: total connectedness of the undirected graph formed by the non-vanishing of limiting functions.
Robust co-design framework for buildings operated by predictive control
Cost-effective decarbonisation of the built environment is a stepping stone to achieving net-zero carbon emissions since buildings are responsible for more than a quarter of global energy-related CO$_2$ emissions. Improving energy utilization and decreasing costs naturally requires considering multiple domain-specific performance criteria. The resulting problem is often computationally infeasible. The paper proposes an approach based on decomposition and selection of significant operating conditions to achieve a formulation with reduced computational complexity. We present a robust framework to optimise the physical design, the controller, and the operation of residential buildings in an integrated fashion, considering external weather conditions and time-varying electricity prices. The framework explicitly includes operational constraints and increases the utilization of the energy generated by intermittent resources. A case study illustrates the potential of co-design in enhancing the reliability, flexibility and self-sufficiency of a system operating under different conditions. Specifically, numerical results demonstrate cost reductions of up to $30$% compared to a deterministic formulation. Furthermore, the proposed approach reduces the computational time by a factor of at least $10$ compared to the original problem, with a deterioration in performance of only 0.6%.
Towards more realistic co-simulation of cyber-physical energy distribution systems
The increased integration of information and communications technology at the distribution grid level offers broader opportunities for active operational management concepts. At the same time, requirements for resilience against internal and external threats to the power supply, such as outages or cyberattacks, are increasing. The emerging threat landscape needs to be investigated to ensure the security of supply of future distribution grids. This extended abstract presents a co-simulation environment to study communication infrastructures for the resilient operation of distribution grids. For this purpose, a communication network emulation and a power grid simulation are combined in a common modular environment. This will provide the basis for cybersecurity investigations and testing of new active operation management concepts for smart grids. Exemplary laboratory tests and attack replications will be used to demonstrate the diverse use cases of our co-simulation approach.
comment: Published at: IFAC Conference on Networked Systems 2022
Cooperative nonlinear distributed model predictive control with dissimilar control horizons
In this paper, we introduce a nonlinear distributed model predictive control (DMPC) algorithm, which allows for dissimilar and time-varying control horizons among agents, thereby addressing a common limitation in current DMPC schemes. We consider cooperative agents with varying computational capabilities and operational objectives, each willing to manage varying numbers of optimization variables at each time step. Recursive feasibility and a non-increasing evolution of the optimal cost are proven for the proposed algorithm. Through numerical simulations on systems with three agents, we show that our approach effectively approximates the performance of traditional DMPC, while reducing the number of variables to be optimized. This advancement paves the way for a more decentralized yet coordinated control strategy in various applications, including power systems and traffic management.
comment: 6 pages
Coupled autoregressive active inference agents for control of multi-joint dynamical systems
We propose an active inference agent to identify and control a mechanical system with multiple bodies connected by joints. This agent is constructed from multiple scalar autoregressive model-based agents, coupled together by virtue of sharing memories. Each subagent infers parameters through Bayesian filtering and controls by minimizing expected free energy over a finite time horizon. We demonstrate that a coupled agent of this kind is able to learn the dynamics of a double mass-spring-damper system, and drive it to a desired position through a balance of explorative and exploitative actions. It outperforms the uncoupled subagents in terms of surprise and goal alignment.
comment: 14 pages, 3 figures, accepted to the International Workshop on Active Inference 2024
Efficiently Obtaining Reachset Conformance for the Formal Analysis of Robotic Contact Tasks IROS 2024
Formal verification of robotic tasks requires a simple yet conformant model of the used robot. We present the first work on generating reachset conformant models for robotic contact tasks considering hybrid (mixed continuous and discrete) dynamics. Reachset conformance requires that the set of reachable outputs of the abstract model encloses all previous measurements to transfer safety properties. Aiming for industrial applications, we describe the system using a simple hybrid automaton with linear dynamics. We inject non-determinism into the continuous dynamics and the discrete transitions, and we optimally identify all model parameters together with the non-determinism required to capture the recorded behaviors. Using two 3-DOF robots, we show that our approach can effectively generate models to capture uncertainties in system behavior and substantially reduce the required testing effort in industrial applications.
comment: Accepted at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Robust Tracking Control with Neural Network Dynamic Models under Input Perturbations
The robust control problem has significant practical implications, since external disturbances can substantially degrade the performance of a control method. Existing robust control methods excel on control-affine systems but fail on neural network dynamic models. Developing robust control methods for such systems remains a complex challenge. In this paper, we focus on a robust tracking method for neural network dynamic models. We first propose a reachability analysis tool designed for this class of systems and then show how to reformulate the robust tracking problem using the reachable sets. In addition, we prove the existence of a feedback policy that bounds the growth of the reachable set over an infinite horizon. The effectiveness of the proposed approach is validated through numerical tracking task simulations, where we compare it with a standard tube MPC method.
comment: 8 pages, 8 figures, conference
Sequential drone routing for data assimilation on a 2D airborne contaminant dispersion problem
The combined use of data from different sources can be critical in emergencies, where accurate models are needed to make real-time decisions, but high-fidelity representations and detailed information are simply unavailable. This study presents a data assimilation framework based on an ensemble Kalman filter that sequentially exploits and improves an advection-diffusion model in a case study concerning an airborne contaminant dispersion problem over a complex two-dimensional domain. An autonomous aerial drone is used to sequentially observe the actual contaminant concentration in a small fraction of the domain, orders of magnitude smaller than the total domain area. Such observations are synchronized with the data assimilation framework, iteratively adjusting the simulation. The path of the drone is sequentially optimized by balancing exploration and exploitation according to the available knowledge at each decision time. Starting from an erroneous initial model based on approximated assumptions that represent the limited initial knowledge available during emergency scenarios, results show how the proposed framework sequentially improves its belief about the dispersion dynamics, thus providing a reliable contaminant concentration map.
Detection of High-Impedance Low-Current Arc Faults at Electrical Substations
Arcing faults in low-voltage (LV) distribution systems, which are associated with arc-flash risk and potentially significant equipment damage, are notoriously difficult to detect under some conditions, especially when detection relies on sensing at the line (high-voltage) side of a substation transformer. This paper presents an analytics-based physics-aware approach to detect high-impedance, low-current arcing faults from the primary side of the substation transformer at current thresholds, below normal operating events, along with transformer inrush currents. The proposed methodology leverages the Hankel Alternative View Of Koopman Operator approach to differentiate arcing faults from standard operations, while the Series2Graph method is employed to identify the time of fault occurrence and duration. Unlike prior studies that detect such faults at the device or secondary transformer side, this work demonstrates successful fault detection at the primary side of the distribution substation transformer for faults occurring on the secondary side. The approach addresses the practical challenges of differentiating primary side expected and acceptable transients from similar magnitude LV arcing fault currents that may occur on the secondary side. The results demonstrate the efficacy of the proposed method in accurately identifying fault occurrence and duration, minimizing the risk of false positives during similar characteristic events, thus improving the reliability and operational efficiency of power distribution systems. This approach can benefit both traditional and smart power grids that employ similar transformer configurations.
Analysis of Wind Power Integration in Electricity Markets LMP Pricing
Wind energy has emerged as one of the most vital and economically viable forms of renewable energy. The integration of wind energy sources into power grids across the globe has been increasing substantially, largely due to the higher levels of uncertainty associated with wind energy compared to other renewable energy sources. This study focuses on analyzing the Locational Marginal Pricing (LMP) market model, with particular emphasis on the integration of wind power plants into substations. Furthermore, it examines a two-stage stochastic model for electricity markets employing LMP pricing, utilizing the Optimal Power Flow (OPF) method for the analysis.
Recursively Feasible Stochastic Model Predictive Control for Time-Varying Linear Systems Subject to Unbounded Disturbances
Model predictive control solves a constrained optimization problem online in order to compute an implicit closed-loop control policy. Recursive feasibility -- guaranteeing that the optimal control problem will have a solution at every time step -- is an important property to guarantee the success of any model predictive control approach. However, recursive feasibility is difficult to establish in a stochastic setting and, in particular, in the presence of disturbances having unbounded support (e.g., Gaussian noise). The problem is further exacerbated for time-varying systems, in which case recursive feasibility must be established also in a robust sense, over all possible future time-varying parameter values, as well as in a stochastic sense, over all potential disturbance realizations. This work presents a method for ensuring the recursive feasibility of a convex, affine-feedback stochastic model predictive control problem formulation for systems with time-varying system matrices and unbounded disturbances using ideas from covariance steering stochastic model predictive control. It is additionally shown that the proposed approach ensures the closed-loop operation of the system will satisfy the desired chance constraints in practice, and that the stochastic model predictive control problem may be formulated as a convex program so that it may be efficiently solved in real-time.
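To make the notion of a chance constraint under unbounded (Gaussian) disturbances concrete, here is a minimal sketch of the standard deterministic reformulation of a single linear chance constraint. It assumes a Gaussian state $x \sim \mathcal{N}(\mu, \Sigma)$, for which $\Pr(a^\top x \le b) \ge 1 - \epsilon$ is equivalent to $a^\top \mu + \Phi^{-1}(1-\epsilon)\sqrt{a^\top \Sigma a} \le b$. The abstract does not state this formula, so it is an illustrative assumption rather than the paper's formulation, and the function name `chance_constraint_holds` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chance_constraint_holds(a, mean, cov, b, eps):
    """Check Pr(a^T x <= b) >= 1 - eps for x ~ N(mean, cov), with 0 < eps < 1.

    a and mean are lists of floats; cov is a list of rows (a symmetric,
    positive semidefinite matrix). Illustrative helper, not from the paper.
    """
    n = len(a)
    # Mean and variance of the scalar random variable a^T x.
    proj_mean = sum(a[i] * mean[i] for i in range(n))
    proj_var = sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))
    # Standard normal quantile at level 1 - eps.
    quantile = NormalDist().inv_cdf(1.0 - eps)
    return proj_mean + quantile * sqrt(proj_var) <= b
```

Because the left-hand side is a mean term plus a norm-like term in the covariance factor, constraints of this form are the kind that can typically be embedded in a convex program.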
Multi-Objective Multidisciplinary Optimization of Wave Energy Converter Array Layout and Controls
This study utilizes multidisciplinary design optimization (MDO) to design an array of heaving wave energy converters (WECs) for grid-scale energy production with decision variables and parameters chosen from the coupled disciplines of geometry, hydrodynamics, layout, motor-actuated reactive controls (with a force maximum constraint) and economics. We vary a WEC's dimensions, array layout, and control gain to minimize two objectives: the levelized cost of energy (LCOE) and the maximum separation distance. This multi-objective optimization approach results in a set of optimal design configurations that stakeholders can choose from for their specific application and needs. The framework yields a range of optimal (minimum) LCOE values from 0.21 to 0.23 \$/kWh and a separation distance ranging from 97 to 62 meters. A WEC radius of 4 m is found to be optimal, and the q-factors of the optimal designs are greater than 1, reaching up to 1.06 for a rhombus-like layout. Additionally, a post-optimality global sensitivity analysis of a design shows that wave heading, wave frequency, WEC lifetime, amplitude and interest rate account for most of the variance. Different designs in the Pareto set may be appealing for different decision makers based on their trade-off analysis. To that end, a regression model is developed for design heuristics.
A Structural Analysis of the User Behavior Dynamics for Environmentally Sustainable ICT
The information and communication technology (ICT) sector can contribute to the fulfillment of the Paris Agreement and the sustainable development goals (SDGs) through the introduction of sustainability strategies. For environmental sustainability, such strategies should contain efficiency, sufficiency, and consistency measures. To propose such measures, a structural analysis of ICT is undertaken in this manuscript. Thereby, key mechanisms and dynamics behind the usage of ICT and the corresponding energy and resource use are analyzed by describing ICT as a complex system. The system contains data centers, communication networks, smartphone hardware, apps, and the behavior of the users as sub-systems, between which various Morinian interactions are present. Energy and non-energy resources can be seen as inputs of the system, while e-waste is an output. Based on the system description, we propose multiple measures for efficiency, sufficiency and consistency to reduce greenhouse gas emissions and other environmental impacts.
Automated Discovery of Continuous Dynamics from Videos
Dynamical systems form the foundation of scientific discovery, traditionally modeled with predefined state variables such as the angle and angular velocity, and differential equations such as the equation of motion for a single pendulum. We propose an approach to discover a set of state variables that preserve the smoothness of the system dynamics and to construct a vector field representing the system's dynamics equation, automatically from video streams without prior physical knowledge. The prominence and effectiveness of the proposed approach are demonstrated through both quantitative and qualitative analyses of various dynamical systems, including the prediction of characteristic frequencies and the identification of chaotic and limit cycle behaviors. This shows the potential of our approach to assist human scientists in scientific discovery.
AI-Driven Autonomous Control of Proton-Boron Fusion Reactors Using Backpropagation Neural Networks
Proton-boron (p-11B) fusion presents a promising path towards sustainable, neutron-free energy generation. However, its implementation is hindered by extreme operational conditions, such as plasma temperatures exceeding billions of degrees and the complexity of controlling high-energy particles. Traditional control systems face significant challenges in managing the highly dynamic and non-linear behavior of the plasma. In this paper, we propose a novel approach utilizing backpropagation-based neural networks to autonomously control key parameters in a proton-boron fusion reactor. Our method leverages real-time feedback and learning from physical data to adapt to changing plasma conditions, offering a potential breakthrough in stable and efficient p-11B fusion. Furthermore, we expand on the scalability and generalization of our approach to other fusion systems and future AI technologies.
Constrained Trajectory Optimization on Matrix Lie Groups via Lie-Algebraic Differential Dynamic Programming
Matrix Lie groups are an important class of manifolds commonly used in control and robotics, and optimizing control policies on these manifolds is a fundamental problem. In this work, we propose a novel computationally efficient approach for trajectory optimization on matrix Lie groups using an augmented Lagrangian-based constrained discrete Differential Dynamic Programming (DDP). The method involves lifting the optimization problem to the Lie algebra during the backward pass and retracting back to the manifold during the forward pass. Unlike previous approaches that addressed constraint handling only for specific classes of matrix Lie groups, the proposed method provides a general solution for nonlinear constraint handling across generic matrix Lie groups. We evaluate the effectiveness of the proposed DDP method in handling constraints within a mechanical system characterized by rigid body dynamics in SE(3), assessing its computational efficiency compared to existing direct optimization solvers. Additionally, the method demonstrates robustness under external disturbances when applied as a Lie-algebraic feedback control policy on SE(3), and in optimizing a quadrotor's trajectory in a challenging realistic scenario. Experiments show that the proposed approach effectively manages general constraints defined on configuration, velocity, and inputs during optimization, while also maintaining stability under external disturbances when executing the resultant control policy in closed-loop.
comment: 12 pages, 6 figures
Transformer Temperature Management and Voltage Control in Electric Distribution Systems with High Solar PV Penetration
The increasing penetration of photovoltaic (PV) systems in distribution grids can lead to overvoltage and transformer overloading issues. While voltage regulation has been extensively studied and some research has addressed transformer temperature control, there is limited work on simultaneously managing both challenges. This paper addresses this gap by proposing an optimization-based strategy that efficiently manages voltage regulation and transformer temperature while minimizing the curtailment of PV generation. In order to make this problem convex, a relaxation is applied to the transformer temperature dynamics constraint. We also provide analysis to determine under which conditions this relaxation remains tight. The proposed approach is validated through simulations, demonstrating its effectiveness in achieving the desired control objectives.
Quantum feedback control of a two-atom network closed by a semi-infinite waveguide
The purpose of this paper is to study the delay-dependent coherent feedback dynamics by focusing on one typical realization, i.e., a two-atom quantum network whose feedback loop is closed by a semi-infinite waveguide. In this set-up, an initially excited two-level atom can emit a photon into the waveguide, where the propagating photon can be reflected by the terminal mirror of the waveguide or absorbed by the other atom, thus constructing various coherent feedback loops. We show that there can be two-photon, one-photon or zero-photon states in the waveguide, which can be controlled by the feedback loop length and the coupling strengths between the atoms and waveguide. The photonic states in the waveguide are analyzed in both the frequency domain and the spatial domain, and the transient process of photon emissions is better understood based on a comprehensive analysis using both domains. Interestingly, we clarify that this quantum coherent feedback network can be mathematically modeled as a linear control system with multiple delays, which are determined by the distances between atoms and the terminal mirror of the semi-infinite waveguide. Therefore, based on time-delayed linear control system theory, the influence of delays on the stability of the quantum state evolution and the steady-state atomic and photonic states is investigated, for both small and large delays.
The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies
We study mathematical models of binary direct collinear collisions of convex viscoplastic bodies based on two incremental collision laws that employ the Bouc-Wen differential model of hysteresis to represent the elastoplastic behavior of the materials of the colliding bodies. These collision laws are the Bouc-Wen-Simon-Hunt-Crossley collision law (BWSHCCL) and the Bouc-Wen-Maxwell collision law (BWMCL). The BWSHCCL consists of the Bouc-Wen model amended with the nonlinear Hertzian elastic spring element and connected in parallel to a nonlinear displacement-dependent and rate-dependent energy dissipation element. The BWMCL consists of the Bouc-Wen model amended with the nonlinear Hertzian elastic spring element and connected in series to a linear rate-dependent energy dissipation element. The mathematical models of the collision process are presented in the form of finite-dimensional initial value problems. We show that the models possess favorable analytical properties (e.g., global existence, uniqueness and boundedness of the solutions) under suitable restrictions on the ranges of their parameters. Furthermore, we show that excellent agreement can be achieved between the experimental data and the data from the numerical simulation of the mathematical models across a wide range of initial relative velocities and material properties of the colliding bodies while using parameterizations that are independent of the initial relative velocity.
comment: 15 pages; 4 figures; the associated code/data are available from https://gitlab.com/user9716869/BWBCL; added references and corrected typos
Decompositions of Nonlinear Input-Output Systems to Zero the Output
Consider an input-output system where the output is the tracking error given some desired reference signal. It is natural to consider under what conditions the problem has an exact solution, that is, the tracking error is exactly the zero function. If the system has a well-defined relative degree and the zero function is in the range of the input-output map, then it is well known that the system is locally left invertible, and thus, the problem has a unique exact solution. A system will fail to have relative degree when more than one exact solution exists. The general goal of this paper is to describe a decomposition of an input-output system having a Chen-Fliess series representation into a parallel product of subsystems in order to identify possible solutions to the problem of zeroing the output. For computational purposes, the focus is on systems whose generating series are polynomials. It is shown that the shuffle algebra on the set of generating polynomials is a unique factorization domain so that any polynomial can be uniquely factored modulo a permutation into its irreducible elements for the purpose of identifying the subsystems in a parallel product decomposition. This is achieved using the fact that this shuffle algebra is isomorphic to the symmetric algebra over the vector space spanned by Lyndon words. A specific algorithm for factoring generating polynomials into their irreducible factors is presented based on the Chen-Fox-Lyndon factorization of words.
comment: Final version with revised title and abstract
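The Chen-Fox-Lyndon factorization mentioned in the abstract above writes any word uniquely as a non-increasing concatenation of Lyndon words. A minimal sketch follows, using Duval's standard linear-time algorithm; this is not the paper's polynomial factoring algorithm, only the word-level building block it refers to, and the function name `chen_fox_lyndon` is an illustrative choice.

```python
def chen_fox_lyndon(word):
    """Return the Chen-Fox-Lyndon factorization of `word` (Duval's algorithm).

    The result is a list of Lyndon words w1 >= w2 >= ... >= wk
    (lexicographic order) whose concatenation equals `word`.
    """
    factors = []
    n = len(word)
    i = 0
    while i < n:
        # j scans ahead; k tracks the position being compared against word[j].
        j, k = i + 1, i
        while j < n and word[k] <= word[j]:
            if word[k] < word[j]:
                k = i          # the current prefix is still a single Lyndon word
            else:
                k += 1         # the prefix is periodic; keep matching the period
            j += 1
        # The scanned prefix is a power of a Lyndon word of length j - k.
        while i <= k:
            factors.append(word[i:i + j - k])
            i += j - k
    return factors


print(chen_fox_lyndon("banana"))  # ['b', 'an', 'an', 'a']
```

Each factor is strictly smaller than all of its proper rotations, and the factors never increase from left to right, which is what makes the decomposition unique.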
Intent Demonstration in General-Sum Dynamic Games via Iterative Linear-Quadratic Approximations
Autonomous agents should be able to coordinate with other agents without knowing their intents ahead of time. While prior work has studied how agents can gather information about the intent of others, in this work, we study the inverse problem: how agents can demonstrate their intent to others, within the framework of general-sum dynamic games. We first present a model of this intent demonstration problem and then propose an algorithm that enables an agent to trade off their task performance and intent demonstration to improve the overall system's performance. To scale to continuous states and action spaces as well as to nonlinear dynamics and costs, our algorithm leverages linear-quadratic approximations with an efficient intent teaching guarantee. Our empirical results show that intent demonstration accelerates other agents' learning and enables the demonstrating agent to balance task performance with intent expression.
Machine Learning Driven Global Optimisation Framework for Analog Circuit Design
We propose a machine learning-driven optimisation framework for analog circuit design in this paper. The primary objective is to determine the device sizes for the optimal performance of analog circuits for a given set of specifications. Our methodology entails employing machine learning models and SPICE simulations to direct the optimisation algorithm towards achieving the optimal design for analog circuits. Machine learning-based global offline surrogate models, with the circuit design parameters as the input, are built in the design space for the analog circuits under study and are used to guide the optimisation algorithm, resulting in faster convergence and a reduced number of SPICE simulations. Multi-layer perceptron and random forest regressors are employed to predict the required design specifications of the analog circuit. Since the saturation condition of transistors is vital in the proper working of analog circuits, multi-layer perceptron classifiers are used to predict the saturation condition of each transistor in the circuit. The feasibility of the candidate solutions is verified using machine learning models before invoking SPICE simulations. We validate the proposed framework using three circuit topologies: a bandgap reference, a folded cascode operational amplifier, and a two-stage operational amplifier. The simulation results show better optimum values and lower standard deviations for fitness functions after convergence. Incorporating the proposed machine learning-based predictions into the optimisation method reduces SPICE calls by 56%, 59%, and 83% compared with standard approaches in the three test cases considered in the study.
Solving Offline Reinforcement Learning with Decision Tree Regression
This study presents a novel approach to addressing offline reinforcement learning (RL) problems by reframing them as regression tasks that can be effectively solved using Decision Trees. Mainly, we introduce two distinct frameworks: return-conditioned and return-weighted decision tree policies (RCDTP and RWDTP), both of which achieve notable speed in agent training as well as inference, with training typically lasting less than a few minutes. Despite the simplification inherent in this reformulated approach to offline RL, our agents demonstrate performance that is at least on par with the established methods. We evaluate our methods on D4RL datasets for locomotion and manipulation, as well as other robotic tasks involving wheeled and flying robots. Additionally, we assess performance in delayed/sparse reward scenarios and highlight the explainability of these policies through action distribution and feature importance.
ShieldNN: A Provably Safe NN Filter for Unsafe NN Controllers
In this paper, we develop a novel closed-form Control Barrier Function (CBF) and associated controller shield for the Kinematic Bicycle Model (KBM) with respect to obstacle avoidance. The proposed CBF and shield -- designed by an algorithm we call ShieldNN -- provide two crucial advantages over existing methodologies. First, ShieldNN considers steering and velocity constraints directly with the non-affine KBM dynamics; this is in contrast to more general methods, which typically consider only affine dynamics and do not guarantee invariance properties under control constraints. Second, ShieldNN provides a closed-form set of safe controls for each state, unlike more general methods, which typically rely on optimization algorithms to generate a single instantaneous control for each state. Together, these advantages make ShieldNN uniquely suited to efficient Multi-Obstacle Safe Actions (i.e., multiple-barrier-function shielding) during training time of a Reinforcement Learning (RL) enabled NN controller. We show via experiments that ShieldNN dramatically increases the completion rate of RL training episodes in the presence of multiple obstacles, thus establishing the value of ShieldNN in training RL-based controllers.
Systems and Control (EESS)
A System Parameterization for Direct Data-Driven Estimator Synthesis
This paper introduces a novel parameterization to characterize unknown linear time-invariant systems using noisy data. The presented parameterization describes exactly the set of all systems consistent with the available data. We then derive verifiable conditions under which the consistency constraint reduces the set to the true system and under which it has no impact. Furthermore, we demonstrate how to use this parameterization to perform a direct data-driven estimator synthesis with guarantees on the H_{\infty}-norm. Lastly, we conduct numerical experiments to compare our approach to existing methods.
comment: This work has been submitted to the American Control Conference 2025
Mindalogue: LLM-Powered Nonlinear Interaction for Effective Learning and Task Exploration
Current generative AI models like ChatGPT, Claude, and Gemini are widely used for knowledge dissemination, task decomposition, and creative thinking. However, their linear interaction methods often force users to repeatedly compare and copy contextual information when handling complex tasks, increasing cognitive load and operational costs. Moreover, the ambiguity in model responses requires users to refine and simplify the information further. To address these issues, we developed "Mindalogue", a system using a non-linear interaction model based on "nodes + canvas" to enhance user efficiency and freedom while generating structured responses. A formative study with 11 users informed the design of Mindalogue, which was then evaluated through a study with 16 participants. The results showed that Mindalogue significantly reduced task steps and improved users' comprehension of complex information. This study highlights the potential of non-linear interaction in improving AI tool efficiency and user experience in the HCI field.
comment: 17 pages, 9 figures. Submitted to CHI 2025
Reflexive Input-Output Causality Mechanisms
This paper explores the concept of reflexive actuation, examining how robots may leverage both internal and external stimuli to trigger changes in the motion, performance, or physical characteristics of the robot, such as its size, shape, or configuration. These changes themselves may in turn be sequentially re-used as input to drive further adaptations. Drawing inspiration from biological systems, where reflexes are an essential component of the response to environmental changes, reflexive actuation is critical to enable robots to adapt to diverse situations and perform complex tasks. The underlying principles of reflexive actuation are analyzed, with examples provided from existing implementations such as contact-sensitive reflexive arms, physical counters, and their applications. The paper also outlines future directions and challenges for advancing this research area, emphasizing its significance in the development of adaptive, responsive robotic systems.
comment: 9 pages, 5 figures
Consensus in Multiagent Systems with lack of connection
We consider multi-agent systems with cooperative interactions and study the convergence to consensus in the case of time-dependent lack of interaction. We prove a new condition ensuring consensus: we define a graph in which directed arrows correspond to connection functions that converge (in the weak sense) to some function with a positive integral on all intervals of the form $[t,+\infty)$. If the graph has a vertex reachable from all other indices, then the system converges to consensus. We show that this requirement generalizes some known sufficient conditions for convergence, such as the Persistent Excitation one. We also give a second new condition, transversal to the known ones: total connectedness of the undirected graph formed by the non-vanishing of limiting functions.
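As a rough numerical illustration of consensus under intermittent interaction, the sketch below integrates the standard first-order consensus model $\dot{x}_i = \sum_j a_{ij}(t)(x_j - x_i)$ with forward Euler. The abstract does not state its equations, so this model, the on/off weights, the arrow convention (agent $i$ points to agent $j$ when it listens to $j$), and all names (`simulate_consensus`, `intermittent_chain`) are assumptions made for illustration only.

```python
def simulate_consensus(x0, weight, t_end=40.0, dt=0.01):
    """Forward-Euler integration of dx_i/dt = sum_j weight(i, j, t) * (x_j - x_i)."""
    x = list(x0)
    n = len(x)
    steps = int(round(t_end / dt))
    for s in range(steps):
        t = s * dt
        dx = [
            sum(weight(i, j, t) * (x[j] - x[i]) for j in range(n) if j != i)
            for i in range(n)
        ]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


def intermittent_chain(i, j, t):
    """Hypothetical weights: agent i listens only to agent i - 1.

    Every link is switched on during [2k, 2k + 1) and off during [2k + 1, 2k + 2),
    so the interaction is missing half of the time but never vanishes for good.
    """
    if j == i - 1 and int(t) % 2 == 0:
        return 1.0
    return 0.0


final = simulate_consensus([0.0, 5.0, -3.0], intermittent_chain)
spread = max(final) - min(final)
print(final, spread)
```

In this toy setup agent 0 listens to nobody and can be reached from agents 1 and 2 along the chain, so the spread between agents shrinks toward zero even though every link is inactive for half of each period.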
Robust co-design framework for buildings operated by predictive control
Cost-effective decarbonisation of the built environment is a stepping stone to achieving net-zero carbon emissions since buildings are responsible for more than a quarter of global energy-related CO$_2$ emissions. Improving energy utilization and decreasing costs naturally requires considering multiple domain-specific performance criteria. The resulting problem is often computationally infeasible. The paper proposes an approach based on decomposition and selection of significant operating conditions to achieve a formulation with reduced computational complexity. We present a robust framework to optimise the physical design, the controller, and the operation of residential buildings in an integrated fashion, considering external weather conditions and time-varying electricity prices. The framework explicitly includes operational constraints and increases the utilization of the energy generated by intermittent resources. A case study illustrates the potential of co-design in enhancing the reliability, flexibility and self-sufficiency of a system operating under different conditions. Specifically, numerical results demonstrate cost reductions of up to $30$% compared to a deterministic formulation. Furthermore, the proposed approach reduces computation time by a factor of at least $10$ compared to the original problem, with a deterioration in performance of only 0.6%.
Towards more realistic co-simulation of cyber-physical energy distribution systems
The increased integration of information and communications technology at the distribution grid level offers broader opportunities for active operational management concepts. At the same time, requirements for resilience against internal and external threats to the power supply, such as outages or cyberattacks, are increasing. The emerging threat landscape needs to be investigated to ensure the security of supply of future distribution grids. This extended abstract presents a co-simulation environment to study communication infrastructures for the resilient operation of distribution grids. For this purpose, a communication network emulation and a power grid simulation are combined in a common modular environment. This will provide the basis for cybersecurity investigations and testing of new active operation management concepts for smart grids. Exemplary laboratory tests and attack replications will be used to demonstrate the diverse use cases of our co-simulation approach.
comment: Published at: IFAC Conference on Networked Systems 2022
Cooperative nonlinear distributed model predictive control with dissimilar control horizons
In this paper, we introduce a nonlinear distributed model predictive control (DMPC) algorithm, which allows for dissimilar and time-varying control horizons among agents, thereby addressing a common limitation in current DMPC schemes. We consider cooperative agents with varying computational capabilities and operational objectives, each willing to manage varying numbers of optimization variables at each time step. Recursive feasibility and a non-increasing evolution of the optimal cost are proven for the proposed algorithm. Through numerical simulations on systems with three agents, we show that our approach effectively approximates the performance of traditional DMPC, while reducing the number of variables to be optimized. This advancement paves the way for a more decentralized yet coordinated control strategy in various applications, including power systems and traffic management.
comment: 6 pages
Coupled autoregressive active inference agents for control of multi-joint dynamical systems
We propose an active inference agent to identify and control a mechanical system with multiple bodies connected by joints. This agent is constructed from multiple scalar autoregressive model-based agents, coupled together by virtue of sharing memories. Each subagent infers parameters through Bayesian filtering and controls by minimizing expected free energy over a finite time horizon. We demonstrate that a coupled agent of this kind is able to learn the dynamics of a double mass-spring-damper system, and drive it to a desired position through a balance of explorative and exploitative actions. It outperforms the uncoupled subagents in terms of surprise and goal alignment.
comment: 14 pages, 3 figures, accepted to the International Workshop on Active Inference 2024
Efficiently Obtaining Reachset Conformance for the Formal Analysis of Robotic Contact Tasks IROS 2024
Formal verification of robotic tasks requires a simple yet conformant model of the used robot. We present the first work on generating reachset conformant models for robotic contact tasks considering hybrid (mixed continuous and discrete) dynamics. Reachset conformance requires that the set of reachable outputs of the abstract model encloses all previous measurements to transfer safety properties. Aiming for industrial applications, we describe the system using a simple hybrid automaton with linear dynamics. We inject non-determinism into the continuous dynamics and the discrete transitions, and we optimally identify all model parameters together with the non-determinism required to capture the recorded behaviors. Using two 3-DOF robots, we show that our approach can effectively generate models to capture uncertainties in system behavior and substantially reduce the required testing effort in industrial applications.
comment: Accepted at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Robust Tracking Control with Neural Network Dynamic Models under Input Perturbations
Robust control has significant practical implications, since external disturbances can significantly degrade the performance of a control method. Existing robust control methods excel on control-affine systems but fail on neural network dynamic models. Developing robust control methods for such systems remains a complex challenge. In this paper, we focus on a robust tracking method for neural network dynamic models. We first propose a reachability analysis tool designed for such systems and then show how to reformulate the robust tracking problem in terms of reachable sets. In addition, we prove the existence of a feedback policy that bounds the growth of the reachable set over an infinite horizon. The effectiveness of the proposed approach is validated through numerical tracking task simulations, where we compare it with a standard tube MPC method.
comment: 8 pages, 8 figures, conference
Sequential drone routing for data assimilation on a 2D airborne contaminant dispersion problem
The combined use of data from different sources can be critical in emergencies, where accurate models are needed to make real-time decisions, but high-fidelity representations and detailed information are simply unavailable. This study presents a data assimilation framework based on an ensemble Kalman filter that sequentially exploits and improves an advection-diffusion model in a case study concerning an airborne contaminant dispersion problem over a complex two-dimensional domain. An autonomous aerial drone is used to sequentially observe the actual contaminant concentration in a small fraction of the domain, orders of magnitude smaller than the total domain area. Such observations are synchronized with the data assimilation framework, iteratively adjusting the simulation. The path of the drone is sequentially optimized by balancing exploration and exploitation according to the available knowledge at each decision time. Starting from an erroneous initial model based on approximated assumptions that represent the limited initial knowledge available during emergency scenarios, results show how the proposed framework sequentially improves its belief about the dispersion dynamics, thus providing a reliable contaminant concentration map.
Detection of High-Impedance Low-Current Arc Faults at Electrical Substations
Arcing faults in low voltage (LV) distribution systems, which are associated with arc-flash risk and potentially significant equipment damage, are notoriously difficult to detect under some conditions, especially when detection relies on sensing at the line (high-voltage) side of a substation transformer. This paper presents an analytics-based, physics-aware approach to detect high-impedance, low-current arcing faults from the primary side of the substation transformer at current thresholds below those of normal operating events and transformer inrush currents. The proposed methodology leverages the Hankel Alternative View Of Koopman Operator approach to differentiate arcing faults from standard operations, while the Series2Graph method is employed to identify the time of fault occurrence and duration. Unlike prior studies that detect such faults at the device or secondary transformer side, this work demonstrates successful fault detection at the primary side of the distribution substation transformer for faults occurring on the secondary side. The approach addresses the practical challenges of differentiating expected and acceptable primary-side transients from LV arcing fault currents of similar magnitude that may occur on the secondary side. The results demonstrate the efficacy of the proposed method in accurately identifying fault occurrence and duration while minimizing the risk of false positives during events with similar characteristics, thus improving the reliability and operational efficiency of power distribution systems. This approach can benefit both traditional and smart power grids that employ similar transformer configurations.
Analysis of Wind Power Integration in Electricity Markets LMP Pricing
Wind energy has emerged as one of the most vital and economically viable forms of renewable energy. The integration of wind energy sources into power grids across the globe has been increasing substantially, largely due to the higher levels of uncertainty associated with wind energy compared to other renewable energy sources. This study focuses on analyzing the Locational Marginal Pricing (LMP) market model, with particular emphasis on the integration of wind power plants into substations. Furthermore, it examines a two-stage stochastic model for electricity markets employing LMP pricing, utilizing the Optimal Power Flow (OPF) method for the analysis.
Recursively Feasible Stochastic Model Predictive Control for Time-Varying Linear Systems Subject to Unbounded Disturbances
Model predictive control solves a constrained optimization problem online in order to compute an implicit closed-loop control policy. Recursive feasibility -- guaranteeing that the optimal control problem will have a solution at every time step -- is an important property to guarantee the success of any model predictive control approach. However, recursive feasibility is difficult to establish in a stochastic setting and, in particular, in the presence of disturbances having unbounded support (e.g., Gaussian noise). The problem is further exacerbated for time-varying systems, in which case recursive feasibility must be established also in a robust sense, over all possible future time-varying parameter values, as well as in a stochastic sense, over all potential disturbance realizations. This work presents a method for ensuring the recursive feasibility of a convex, affine-feedback stochastic model predictive control problem formulation for systems with time-varying system matrices and unbounded disturbances using ideas from covariance steering stochastic model predictive control. It is additionally shown that the proposed approach ensures the closed-loop operation of the system will satisfy the desired chance constraints in practice, and that the stochastic model predictive control problem may be formulated as a convex program so that it may be efficiently solved in real-time.
Multi-Objective Multidisciplinary Optimization of Wave Energy Converter Array Layout and Controls
This study utilizes multidisciplinary design optimization (MDO) to design an array of heaving wave energy converters (WECs) for grid-scale energy production with decision variables and parameters chosen from the coupled disciplines of geometry, hydrodynamics, layout, motor-actuated reactive controls (with a force maximum constraint) and economics. We vary a WEC's dimensions, array layout, and control gain to minimize two objectives: the levelized cost of energy (LCOE) and the maximum separation distance. This multi-objective optimization approach results in a set of optimal design configurations that stakeholders can choose from for their specific application and needs. The framework yields a range of optimal (minimum) LCOE values from 0.21 to 0.23 \$/kWh and a separation distance ranging from 97 to 62 meters. A WEC radius of 4 m is found to be optimal, and the q-factor for optimal designs is greater than 1, reaching up to 1.06 for a rhombus-like layout. Additionally, a post-optimality global sensitivity analysis of a design shows that wave heading, wave frequency, WEC lifetime, amplitude and interest rate account for most of the variance. Different designs in the Pareto set may be appealing for different decision makers based on their trade-off analysis. To that end, a regression model is developed for design heuristics.
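The abstract above reports LCOE and q-factor values without defining them. For orientation, the commonly used textbook definitions are written out below; the symbols ($C_t$, $E_t$, $r$, $T$, $N$, $P_{\text{array}}$, $P_{\text{isolated}}$) are notation assumed here, and the paper's own cost model may differ in detail.

```latex
% Levelized cost of energy over a lifetime of T years with discount rate r:
% C_t = capital plus operating cost in year t, E_t = energy delivered in year t.
\mathrm{LCOE} = \frac{\sum_{t=0}^{T} \dfrac{C_t}{(1+r)^t}}{\sum_{t=0}^{T} \dfrac{E_t}{(1+r)^t}}

% Array interaction (q) factor for N devices:
% P_array = power absorbed by the whole array, P_isolated = power of one device alone.
q = \frac{P_{\text{array}}}{N \, P_{\text{isolated}}}
```

Under these definitions, $q > 1$ means that hydrodynamic interaction within the array is constructive, so the reported value of up to 1.06 would correspond to roughly 6% more power than the same number of isolated devices.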
A Structural Analysis of the User Behavior Dynamics for Environmentally Sustainable ICT
The sector of information and communication technology (ICT) can contribute to the fulfillment of the Paris Agreement and the sustainable development goals (SDGs) through the introduction of sustainability strategies. For environmental sustainability, such strategies should contain efficiency, sufficiency, and consistency measures. To propose such measures, a structural analysis of ICT is undertaken in this manuscript. Key mechanisms and dynamics behind the usage of ICT and the corresponding energy and resource use are analyzed by describing ICT as a complex system. The system contains data centers, communication networks, smartphone hardware, apps, and the behavior of the users as sub-systems, between which various Morinian interactions are present. Energy and non-energy resources can be seen as inputs of the system, while e-waste is an output. Based on the system description, we propose multiple measures for efficiency, sufficiency and consistency to reduce greenhouse gas emissions and other environmental impacts.
Automated Discovery of Continuous Dynamics from Videos
Dynamical systems form the foundation of scientific discovery, traditionally modeled with predefined state variables such as the angle and angular velocity, and differential equations such as the equation of motion for a single pendulum. We propose an approach to discover a set of state variables that preserve the smoothness of the system dynamics and to construct a vector field representing the system's dynamics equation, automatically from video streams without prior physical knowledge. The prominence and effectiveness of the proposed approach are demonstrated through both quantitative and qualitative analyses of various dynamical systems, including the prediction of characteristic frequencies and the identification of chaotic and limit cycle behaviors. This shows the potential of our approach to assist human scientists in scientific discovery.
AI-Driven Autonomous Control of Proton-Boron Fusion Reactors Using Backpropagation Neural Networks
Proton-boron (p-11B) fusion presents a promising path towards sustainable, neutron-free energy generation. However, its implementation is hindered by extreme operational conditions, such as plasma temperatures exceeding billions of degrees and the complexity of controlling high-energy particles. Traditional control systems face significant challenges in managing the highly dynamic and non-linear behavior of the plasma. In this paper, we propose a novel approach utilizing backpropagation-based neural networks to autonomously control key parameters in a proton-boron fusion reactor. Our method leverages real-time feedback and learning from physical data to adapt to changing plasma conditions, offering a potential breakthrough in stable and efficient p-11B fusion. Furthermore, we expand on the scalability and generalization of our approach to other fusion systems and future AI technologies.
Constrained Trajectory Optimization on Matrix Lie Groups via Lie-Algebraic Differential Dynamic Programming
Matrix Lie groups are an important class of manifolds commonly used in control and robotics, and optimizing control policies on these manifolds is a fundamental problem. In this work, we propose a novel computationally efficient approach for trajectory optimization on matrix Lie groups using an augmented Lagrangian-based constrained discrete Differential Dynamic Programming (DDP). The method involves lifting the optimization problem to the Lie algebra during the backward pass and retracting back to the manifold during the forward pass. Unlike previous approaches that addressed constraint handling only for specific classes of matrix Lie groups, the proposed method provides a general solution for nonlinear constraint handling across generic matrix Lie groups. We evaluate the effectiveness of the proposed DDP method in handling constraints within a mechanical system characterized by rigid body dynamics in SE(3), assessing its computational efficiency compared to existing direct optimization solvers. Additionally, the method demonstrates robustness under external disturbances when applied as a Lie-algebraic feedback control policy on SE(3), and in optimizing a quadrotor's trajectory in a challenging realistic scenario. Experiments show that the proposed approach effectively manages general constraints defined on configuration, velocity, and inputs during optimization, while also maintaining stability under external disturbances when executing the resultant control policy in closed-loop.
comment: 12 pages, 6 figures
Transformer Temperature Management and Voltage Control in Electric Distribution Systems with High Solar PV Penetration
The increasing penetration of photovoltaic (PV) systems in distribution grids can lead to overvoltage and transformer overloading issues. While voltage regulation has been extensively studied and some research has addressed transformer temperature control, there is limited work on simultaneously managing both challenges. This paper addresses this gap by proposing an optimization-based strategy that efficiently manages voltage regulation and transformer temperature while minimizing the curtailment of PV generation. In order to make this problem convex, a relaxation is applied to the transformer temperature dynamics constraint. We also provide analysis to determine under which conditions this relaxation remains tight. The proposed approach is validated through simulations, demonstrating its effectiveness in achieving the desired control objectives.
Quantum feedback control of a two-atom network closed by a semi-infinite waveguide
The purpose of this paper is to study the delay-dependent coherent feedback dynamics by focusing on one typical realization, i.e., a two-atom quantum network whose feedback loop is closed by a semi-infinite waveguide. In this set-up, an initially excited two-level atom can emit a photon into the waveguide, where the propagating photon can be reflected by the terminal mirror of the waveguide or absorbed by the other atom, thus constructing various coherent feedback loops. We show that there can be two-photon, one-photon or zero-photon states in the waveguide, which can be controlled by the feedback loop length and the coupling strengths between the atoms and waveguide. The photonic states in the waveguide are analyzed in both the frequency domain and the spatial domain, and the transient process of photon emissions is better understood based on a comprehensive analysis using both domains. Interestingly, we clarify that this quantum coherent feedback network can be mathematically modeled as a linear control system with multiple delays, which are determined by the distances between atoms and the terminal mirror of the semi-infinite waveguide. Therefore, based on time-delayed linear control system theory, the influence of delays on the stability of the quantum state evolution and the steady-state atomic and photonic states is investigated, for both small and large delays.
The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies
We study mathematical models of binary direct collinear collisions of convex viscoplastic bodies based on two incremental collision laws that employ the Bouc-Wen differential model of hysteresis to represent the elastoplastic behavior of the materials of the colliding bodies. These collision laws are the Bouc-Wen-Simon-Hunt-Crossley collision law (BWSHCCL) and the Bouc-Wen-Maxwell collision law (BWMCL). The BWSHCCL comprises the Bouc-Wen model amended with the nonlinear Hertzian elastic spring element and connected in parallel to a nonlinear displacement-dependent and rate-dependent energy dissipation element. The BWMCL comprises the Bouc-Wen model amended with the nonlinear Hertzian elastic spring element and connected in series to a linear rate-dependent energy dissipation element. The mathematical models of the collision process are presented in the form of finite-dimensional initial value problems. We show that the models possess favorable analytical properties (e.g., global existence, uniqueness and boundedness of the solutions) under suitable restrictions on the ranges of their parameters. Furthermore, we show that excellent agreement can be achieved between the experimental data and the data from the numerical simulation of the mathematical models across a wide range of initial relative velocities and material properties of the colliding bodies while using parameterizations that are independent of the initial relative velocity.
comment: 15 pages; 4 figures; the associated code/data are available from https://gitlab.com/user9716869/BWBCL; added references and corrected typos
Decompositions of Nonlinear Input-Output Systems to Zero the Output
Consider an input-output system where the output is the tracking error given some desired reference signal. It is natural to consider under what conditions the problem has an exact solution, that is, the tracking error is exactly the zero function. If the system has a well defined relative degree and the zero function is in the range of the input-output map, then it is well known that the system is locally left invertible, and thus, the problem has a unique exact solution. A system will fail to have relative degree when more than one exact solution exists. The general goal of this paper is to describe a decomposition of an input-output system having a Chen-Fliess series representation into a parallel product of subsystems in order to identify possible solutions to the problem of zeroing the output. For computational purposes, the focus is on systems whose generating series are polynomials. It is shown that the shuffle algebra on the set of generating polynomials is a unique factorization domain so that any polynomial can be uniquely factored modulo a permutation into its irreducible elements for the purpose of identifying the subsystems in a parallel product decomposition. This is achieved using the fact that this shuffle algebra is isomorphic to the symmetric algebra over the vector space spanned by Lyndon words. A specific algorithm for factoring generating polynomials into their irreducible factors is presented based on the Chen-Fox-Lyndon factorization of words.
comment: Final version with revised title and abstract
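The Chen-Fox-Lyndon factorization named in the abstract above writes any word uniquely as a concatenation of Lyndon words in non-increasing lexicographic order. The following is a minimal sketch using Duval's standard linear-time algorithm on plain strings; it illustrates only the word-level factorization and is not the paper's polynomial factoring algorithm. The function name `chen_fox_lyndon` is a hypothetical choice.

```python
def chen_fox_lyndon(word):
    """Return the Chen-Fox-Lyndon factors of `word` (Duval's algorithm).

    The factors are Lyndon words, appear in non-increasing lexicographic
    order, and concatenate back to `word`.
    """
    factors = []
    n = len(word)
    i = 0
    while i < n:
        # Extend the longest prefix of word[i:] that is a power of a
        # Lyndon word followed by a proper prefix of that Lyndon word.
        j, k = i + 1, i
        while j < n and word[k] <= word[j]:
            if word[k] < word[j]:
                k = i          # a strictly larger letter restarts the comparison
            else:
                k += 1         # equal letters: keep matching the current period
            j += 1
        period = j - k
        # Emit every complete copy of the Lyndon word of length `period`.
        while i <= k:
            factors.append(word[i:i + period])
            i += period
    return factors
```

For example, `chen_fox_lyndon("banana")` yields the factors `b`, `an`, `an`, `a`.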
Intent Demonstration in General-Sum Dynamic Games via Iterative Linear-Quadratic Approximations
Autonomous agents should be able to coordinate with other agents without knowing their intents ahead of time. While prior work has studied how agents can gather information about the intent of others, in this work, we study the inverse problem: how agents can demonstrate their intent to others, within the framework of general-sum dynamic games. We first present a model of this intent demonstration problem and then propose an algorithm that enables an agent to trade off their task performance and intent demonstration to improve the overall system's performance. To scale to continuous states and action spaces as well as to nonlinear dynamics and costs, our algorithm leverages linear-quadratic approximations with an efficient intent teaching guarantee. Our empirical results show that intent demonstration accelerates other agents' learning and enables the demonstrating agent to balance task performance with intent expression.
Machine Learning Driven Global Optimisation Framework for Analog Circuit Design
We propose a machine learning-driven optimisation framework for analog circuit design in this paper. The primary objective is to determine the device sizes for the optimal performance of analog circuits for a given set of specifications. Our methodology entails employing machine learning models and SPICE simulations to direct the optimisation algorithm towards achieving the optimal design for analog circuits. Machine learning based global offline surrogate models, with the circuit design parameters as the input, are built in the design space for the analog circuits under study and are used to guide the optimisation algorithm, resulting in faster convergence and a reduced number of SPICE simulations. Multi-layer perceptron and random forest regressors are employed to predict the required design specifications of the analog circuit. Since the saturation condition of transistors is vital in the proper working of analog circuits, multi-layer perceptron classifiers are used to predict the saturation condition of each transistor in the circuit. The feasibility of the candidate solutions is verified using machine learning models before invoking SPICE simulations. We validate the proposed framework using three circuit topologies--a bandgap reference, a folded cascode operational amplifier, and a two-stage operational amplifier. The simulation results show better optimum values and lower standard deviations for fitness functions after convergence. Incorporating the machine learning-based predictions proposed in the optimisation method has resulted in the reduction of SPICE calls by 56%, 59%, and 83% when compared with standard approaches in the three test cases considered in the study.
Solving Offline Reinforcement Learning with Decision Tree Regression
This study presents a novel approach to addressing offline reinforcement learning (RL) problems by reframing them as regression tasks that can be effectively solved using Decision Trees. Mainly, we introduce two distinct frameworks: return-conditioned and return-weighted decision tree policies (RCDTP and RWDTP), both of which achieve notable speed in agent training as well as inference, with training typically lasting less than a few minutes. Despite the simplification inherent in this reformulated approach to offline RL, our agents demonstrate performance that is at least on par with the established methods. We evaluate our methods on D4RL datasets for locomotion and manipulation, as well as other robotic tasks involving wheeled and flying robots. Additionally, we assess performance in delayed/sparse reward scenarios and highlight the explainability of these policies through action distribution and feature importance.
ShieldNN: A Provably Safe NN Filter for Unsafe NN Controllers
In this paper, we develop a novel closed-form Control Barrier Function (CBF) and associated controller shield for the Kinematic Bicycle Model (KBM) with respect to obstacle avoidance. The proposed CBF and shield -- designed by an algorithm we call ShieldNN -- provide two crucial advantages over existing methodologies. First, ShieldNN considers steering and velocity constraints directly with the non-affine KBM dynamics; this is in contrast to more general methods, which typically consider only affine dynamics and do not guarantee invariance properties under control constraints. Second, ShieldNN provides a closed-form set of safe controls for each state, unlike more general methods, which typically rely on optimization algorithms to generate a single instantaneous control for each state. Together, these advantages make ShieldNN uniquely suited as an efficient Multi-Obstacle Safe Actions (i.e. multiple-barrier-function shielding) during training time of a Reinforcement Learning (RL) enabled NN controller. We show via experiments that ShieldNN dramatically increases the completion rate of RL training episodes in the presence of multiple obstacles, thus establishing the value of ShieldNN in training RL-based controllers.
Robotics
VQ-CNMP: Neuro-Symbolic Skill Learning for Bi-Level Planning
This paper proposes a novel neural network model capable of discovering high-level skill representations from unlabeled demonstration data. We also propose a bi-level planning pipeline that utilizes our model using a gradient-based planning approach. While extracting high-level representations, our model also preserves the low-level information, which can be used for low-level action planning. In the experiments, we tested the skill discovery performance of our model under different conditions, tested whether Multi-Modal LLMs can be utilized to label the learned high-level skill representations, and finally tested the high-level and low-level planning performance of our pipeline.
comment: 12 pages, 6 figures, Submitted to Conference on Robot Learning LEAP Workshop 2024
REPeat: A Real2Sim2Real Approach for Pre-acquisition of Soft Food Items in Robot-assisted Feeding
The paper presents REPeat, a Real2Sim2Real framework designed to enhance bite acquisition in robot-assisted feeding for soft foods. It uses 'pre-acquisition actions' such as pushing, cutting, and flipping to improve the success rate of bite acquisition actions such as skewering, scooping, and twirling. If the data-driven model predicts low success for direct bite acquisition, the system initiates a Real2Sim phase, reconstructing the food's geometry in a simulation. The robot explores various pre-acquisition actions in the simulation, then a Sim2Real step renders a photorealistic image to reassess success rates. If the success improves, the robot applies the action in reality. We evaluate the system on 15 diverse plates with 10 types of food items for a soft food diet, showing improvement in bite acquisition success rates by 27% on average across all plates. See our project website at https://emprise.cs.cornell.edu/repeat.
Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions
Recent advancements in Model-Based Reinforcement Learning (MBRL) have made it a powerful tool for visual control tasks. Despite improved data efficiency, it remains challenging to train MBRL agents with generalizable perception. Training in the presence of visual distractions is particularly difficult due to the high variation they introduce to representation learning. Building on DREAMER, a popular MBRL method, we propose a simple yet effective auxiliary task to facilitate representation learning in distracting environments. Under the assumption that task-relevant components of image observations are straightforward to identify with prior knowledge in a given task, we use a segmentation mask on image observations to only reconstruct task-relevant components. In doing so, we greatly reduce the complexity of representation learning by removing the need to encode task-irrelevant objects in the latent representation. Our method, Segmentation Dreamer (SD), can be used either with ground-truth masks easily accessible in simulation or by leveraging potentially imperfect segmentation foundation models. The latter is further improved by selectively applying the reconstruction loss to avoid providing misleading learning signals due to mask prediction errors. In modified DeepMind Control suite (DMC) and Meta-World tasks with added visual distractions, SD achieves significantly better sample efficiency and greater final performance than prior work. We find that SD is especially helpful in sparse reward tasks otherwise unsolvable by prior work, enabling the training of visually robust agents without the need for extensive reward engineering.
Conformalized Reachable Sets for Obstacle Avoidance With Spheres
Safe motion planning algorithms are necessary for deploying autonomous robots in unstructured environments. Motion plans must be safe to ensure that the robot does not harm humans or damage any nearby objects. Generating these motion plans in real-time is also important to ensure that the robot can adapt to sudden changes in its environment. Many trajectory optimization methods introduce heuristics that balance safety and real-time performance, potentially increasing the risk of the robot colliding with its environment. This paper addresses this challenge by proposing Conformalized Reachable Sets for Obstacle Avoidance With Spheres (CROWS). CROWS is a novel real-time, receding-horizon trajectory planner that generates probabilistically safe motion plans. Offline, CROWS learns a novel neural network-based representation of a sphere-based reachable set that overapproximates the swept volume of the robot's motion. CROWS then uses conformal prediction to compute a confidence bound that provides a probabilistic safety guarantee on the learned reachable set. At runtime, CROWS performs trajectory optimization to select a trajectory that is probabilistically guaranteed to be collision-free. We demonstrate that CROWS outperforms a variety of state-of-the-art methods in solving challenging motion planning tasks in cluttered environments while remaining collision-free. Code, data, and video demonstrations can be found at https://roahmlab.github.io/crows/
comment: https://roahmlab.github.io/crows/
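The conformal prediction step mentioned in the CROWS abstract can be illustrated with the standard split conformal quantile. Given n calibration scores (for instance, errors of a learned model on held-out data), the ceil((n + 1)(1 - alpha))-th smallest score bounds a new exchangeable score with probability at least 1 - alpha. This is a generic sketch, not the CROWS implementation; the abstract does not say how CROWS defines its scores, and the name `conformal_quantile` is hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the split-conformal bound at miscoverage level `alpha`.

    `scores` are nonconformity scores from a held-out calibration set.
    Returns math.inf when there are too few scores for the requested level.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Finite-sample corrected rank: the (n + 1) accounts for the test point.
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]
```

Inflating a learned reachable set by such a bound is one common way to turn a point prediction into a set with a probabilistic coverage guarantee.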
Markerless Aerial-Terrestrial Co-Registration of Forest Point Clouds using a Deformable Pose Graph
For biodiversity and forestry applications, end-users desire maps of forests that are fully detailed, from the forest floor to the canopy. Terrestrial laser scanning and aerial laser scanning are accurate and increasingly mature methods for scanning the forest. However, individually they are not able to estimate attributes such as tree height, trunk diameter and canopy density due to the inherent differences in their field-of-view and mapping processes. In this work, we present a pipeline that can automatically generate a single joint terrestrial and aerial forest reconstruction. The novelty of the approach is a marker-free registration pipeline, which estimates a set of relative transformation constraints between the aerial cloud and terrestrial sub-clouds without requiring any co-registration reflective markers to be physically placed in the scene. Our method then uses these constraints in a pose graph formulation, which enables us to finely align the respective clouds while respecting spatial constraints introduced by the terrestrial SLAM scanning process. We demonstrate that our approach can produce a fine-grained and complete reconstruction of large-scale natural environments, enabling multi-platform data capture for forestry applications without requiring external infrastructure.
Physics-informed Neural Mapping and Motion Planning in Unknown Environments
Mapping and motion planning are two essential elements of robot intelligence that are interdependent in generating environment maps and navigating around obstacles. The existing mapping methods create maps that require computationally expensive motion planning tools to find a path solution. In this paper, we propose a new mapping feature called arrival time fields, which is a solution to the Eikonal equation. The arrival time fields can directly guide the robot in navigating the given environments. Therefore, this paper introduces a new approach called Active Neural Time Fields (Active NTFields), which is a physics-informed neural framework that actively explores the unknown environment and maps its arrival time field on the fly for robot motion planning. Our method does not require any expert data for learning and uses neural networks to directly solve the Eikonal equation for arrival time field mapping and motion planning. We benchmark our approach against state-of-the-art mapping and motion planning methods and demonstrate its superior performance in both simulated and real-world environments with a differential drive robot and a 6 degrees-of-freedom (DOF) robot manipulator. The supplementary videos can be found at https://youtu.be/qTPL5a6pRKk, and the implementation code repository is available at https://github.com/Rtlyc/antfields-demo.
ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
Visual navigation is an essential skill for home-assistance robots, providing the object-searching ability to accomplish long-horizon daily tasks. Many recent approaches use Large Language Models (LLMs) for commonsense inference to improve exploration efficiency. However, the planning process of LLMs is limited within texts and it is difficult to represent the spatial occupancy and geometry layout only by texts. Both are important for making rational navigation decisions. In this work, we seek to unleash the spatial perception and planning ability of Vision-Language Models (VLMs), and explore whether the VLM, with only on-board camera captured RGB/RGB-D stream inputs, can efficiently finish the visual navigation tasks in a mapless manner. We achieve this by developing the imagination-powered navigation framework ImagineNav, which imagines the future observation images at valuable robot views and translates the complex navigation planning process into a rather simple best-view image selection problem for VLM. To generate appropriate candidate robot views for imagination, we introduce the Where2Imagine module, which is distilled to align with human navigation habits. Finally, to reach the VLM preferred views, an off-the-shelf point-goal navigation policy is utilized. Empirical experiments on the challenging open-vocabulary object navigation benchmarks demonstrate the superiority of our proposed system.
comment: 17 pages, 9 figures
Generating Driving Simulations via Conversation
Cyber-physical systems like autonomous vehicles are tested in simulation before deployment, using domain-specific programs for scenario specification. To aid the testing of autonomous vehicles in simulation, we design a natural language interface, using an instruction-following large language model, to assist a non-coding domain expert in synthesising the desired scenarios and vehicle behaviours. We show that using it to convert utterances to the symbolic program is feasible, despite the very small training dataset. Human experiments show that dialogue is critical to successful simulation generation, leading to a 4.5 times higher success rate than a generation without engaging in extended conversation.
comment: 6 pages, 6 figures, 2 tables
Socially Aware Motion Planning for Service Robots Using LiDAR and RGB-D Camera
Service robots that work alongside humans in a shared environment need a navigation system that takes into account not only physical safety but also social norms for mutual cooperation. In this paper, we introduce a motion planning system that includes human states such as positions and velocities and their personal space for social-aware navigation. The system first extracts human positions from the LiDAR and the RGB-D camera. It then uses the Kalman filter to fuse that information for human state estimation. An asymmetric Gaussian function is then employed to model human personal space based on their states. This model is used as the input to the dynamic window approach algorithm to generate trajectories for the robot. Experiments show that the robot is able to navigate alongside humans in a dynamic environment while respecting their physical and psychological comfort.
comment: In Proceedings of 2024, the 7th International Conference on Control, Robotics and Informatics (ICCRI 2024)
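The asymmetric Gaussian personal-space model described in the abstract above can be sketched as a cost that decays more slowly in front of a person than behind them. The functional form and the spread values below are illustrative assumptions, not the parameters used in the paper; `personal_space_cost` is a hypothetical name.

```python
import math


def personal_space_cost(x, y, px, py, heading,
                        sigma_front=1.2, sigma_side=0.6, sigma_rear=0.4):
    """Asymmetric Gaussian cost in (0, 1] at (x, y) for a person at (px, py).

    `heading` is the person's direction of motion in radians.
    The sigma values (metres) are assumed for illustration only.
    """
    dx, dy = x - px, y - py
    # Rotate the offset into the person's frame: +lx points along the heading.
    lx = math.cos(heading) * dx + math.sin(heading) * dy
    ly = -math.sin(heading) * dx + math.cos(heading) * dy
    # Wider spread ahead of the person than behind them.
    sigma_x = sigma_front if lx >= 0.0 else sigma_rear
    exponent = lx ** 2 / (2.0 * sigma_x ** 2) + ly ** 2 / (2.0 * sigma_side ** 2)
    return math.exp(-exponent)
```

A planner such as the dynamic window approach could add this cost to each candidate trajectory's score so that paths cutting in front of a walking person are penalised more than paths passing behind.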
Model Predictive Control for Optimal Motion Planning of Unmanned Aerial Vehicles
Motion planning is an essential process for the navigation of unmanned aerial vehicles (UAVs) where they need to adapt to obstacles and different structures of their operating environment to reach the goal. This paper presents an optimal motion planner for UAVs operating in unknown complex environments. The motion planner receives point cloud data from a local range sensor and then converts it into a voxel grid representing the surrounding environment. A local trajectory guiding the UAV to the goal is then generated based on the voxel grid. This trajectory is further optimized using model predictive control (MPC) to enhance the safety, speed, and smoothness of UAV operation. The optimization is carried out via the definition of several cost functions and constraints, taking into account the UAV's dynamics and requirements. A number of simulations and comparisons with a state-of-the-art method have been conducted in a complex environment with many obstacles to evaluate the performance of our method. The results show that our method provides not only shorter and smoother trajectories but also faster and more stable speed profiles. It is also energy efficient making it suitable for various UAV applications.
comment: In proceedings of 2024, the 7th International Conference on Control, Robotics and Informatics (ICCRI 2024)
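The first step of the planner described above, converting a point cloud into a voxel grid, can be sketched as follows. This is a minimal sparse-grid illustration under assumed conventions (axis-aligned cubic voxels, occupancy stored as a set of integer indices); the paper's actual data structure is not given in the abstract, and the function names are hypothetical.

```python
import math


def voxel_index(point, voxel_size):
    """Return the integer (i, j, k) index of the voxel containing `point`."""
    return tuple(math.floor(c / voxel_size) for c in point)


def voxelize(points, voxel_size):
    """Map an iterable of (x, y, z) points to the set of occupied voxel indices."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    return {voxel_index(p, voxel_size) for p in points}


def is_occupied(grid, point, voxel_size):
    """Check whether `point` falls inside a voxel marked occupied in `grid`."""
    return voxel_index(point, voxel_size) in grid
```

A local trajectory generator can then query `is_occupied` along candidate paths, and the occupied voxels can serve as obstacle constraints in the subsequent optimization.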
t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving
Given the wide adoption of multimodal sensors (e.g., camera, lidar, radar) by autonomous vehicles (AVs), deep analytics to fuse their outputs for a robust perception become imperative. However, existing fusion methods often make two assumptions rarely holding in practice: i) similar data distributions for all inputs and ii) constant availability for all sensors. Because, for example, lidars have various resolutions and failures of radars may occur, such variability often results in significant performance degradation in fusion. To this end, we present t-READi, an adaptive inference system that accommodates the variability of multimodal sensory data and thus enables robust and efficient perception. t-READi identifies variation-sensitive yet structure-specific model parameters; it then adapts only these parameters while keeping the rest intact. t-READi also leverages a cross-modality contrastive learning method to compensate for the loss from missing modalities. Both functions are implemented to maintain compatibility with existing multimodal deep fusion methods. The extensive experiments evidently demonstrate that compared with the status quo approaches, t-READi not only improves the average inference accuracy by more than 6% but also reduces the inference latency by almost 15x with the cost of only 5% extra memory overhead in the worst case under realistic data and modal variations.
comment: 15 pages, 16 figures
Gaussian Splatting Visual MPC for Granular Media Manipulation
Recent advancements in learned 3D representations have enabled significant progress in solving complex robotic manipulation tasks, particularly for rigid-body objects. However, manipulating granular materials such as beans, nuts, and rice, remains challenging due to the intricate physics of particle interactions, high-dimensional and partially observable state, inability to visually track individual particles in a pile, and the computational demands of accurate dynamics prediction. Current deep latent dynamics models often struggle to generalize in granular material manipulation due to a lack of inductive biases. In this work, we propose a novel approach that learns a visual dynamics model over Gaussian splatting representations of scenes and leverages this model for manipulating granular media via Model-Predictive Control. Our method enables efficient optimization for complex manipulation tasks on piles of granular media. We evaluate our approach in both simulated and real-world settings, demonstrating its ability to solve unseen planning tasks and generalize to new environments in a zero-shot transfer. We also show significant prediction and manipulation performance improvements compared to existing granular media manipulation methods.
comment: project website https://weichengtseng.github.io/gs-granular-mani/
Flying Quadrotors in Tight Formations using Learning-based Model Predictive Control
Flying quadrotors in tight formations is a challenging problem. It is known that in the near-field airflow of a quadrotor, the aerodynamic effects induced by the propellers are complex and difficult to characterize. Although machine learning tools can potentially be used to derive models that capture these effects, these data-driven approaches can be sample inefficient and the resulting models often do not generalize as well as their first-principles counterparts. In this work, we propose a framework that combines the benefits of first-principles modeling and data-driven approaches to construct an accurate and sample efficient representation of the complex aerodynamic effects resulting from quadrotors flying in formation. The data-driven component within our model is lightweight, making it amenable for optimization-based control design. Through simulations and physical experiments, we show that incorporating the model into a novel learning-based nonlinear model predictive control (MPC) framework results in substantial performance improvements in terms of trajectory tracking and disturbance rejection. In particular, our framework significantly outperforms nominal MPC in physical experiments, achieving a 40.1% improvement in the average trajectory tracking errors and a 57.5% reduction in the maximum vertical separation errors. Our framework also achieves exceptional sample efficiency, using only a total of 46 seconds of flight data for training across both simulations and physical experiments. Furthermore, with our proposed framework, the quadrotors achieve an exceptionally tight formation, flying with an average separation of less than 1.5 body lengths throughout the flight. A video illustrating our framework and physical experiments is given here: https://youtu.be/Hv-0JiVoJGo
comment: 7 pages, 5 figures
Technical Design Review of Duke Robotics Club's Oogway: An AUV for RoboSub 2024
The Duke Robotics Club is proud to present our robot for the 2024 RoboSub Competition: Oogway. Now in its second year, Oogway has been dramatically upgraded in both its capabilities and reliability. Oogway was built on the principle of independent, well-integrated, and reliable subsystems. Individual components and subsystems were tested and designed separately. Oogway's most advanced capabilities are a result of the tight integration between these subsystems. Such examples include a re-envisioned controls system, an entirely new electrical stack, advanced sonar integration, additional cameras and system monitoring, a new marker dropper, and a watertight capsule mechanism. These additions enabled Oogway to prequalify for RoboSub 2024.
LoRD: Adapting Differentiable Driving Policies to Distribution Shifts
Distribution shifts between operational domains can severely affect the performance of learned models in self-driving vehicles (SDVs). While this is a well-established problem, prior work has mostly explored naive solutions such as fine-tuning, focusing on the motion prediction task. In this work, we explore novel adaptation strategies for differentiable autonomy stacks consisting of prediction, planning, and control, perform evaluation in closed-loop, and investigate the often-overlooked issue of catastrophic forgetting. Specifically, we introduce two simple yet effective techniques: a low-rank residual decoder (LoRD) and multi-task fine-tuning. Through experiments across three models conducted on two real-world autonomous driving datasets (nuPlan, exiD), we demonstrate the effectiveness of our methods and highlight a significant performance gap between open-loop and closed-loop evaluation in prior approaches. Our approach improves forgetting by up to 23.33% and the closed-loop OOD driving score by 8.83% in comparison to standard fine-tuning.
comment: Under Review
Stability and Transparency in Mixed Reality Bilateral Human Teleoperation
Recent work introduced the concept of human teleoperation (HT), where the remote robot typically considered in conventional bilateral teleoperation is replaced by a novice person wearing a mixed reality head mounted display and tracking the motion of a virtual tool controlled by an expert. HT has advantages in cost, complexity, and patient acceptance for telemedicine in low-resource communities or remote locations. However, the stability, transparency, and performance of bilateral HT are unexplored. In this paper, we therefore develop a mathematical model and simulation of the HT system using test data. We then analyze various control architectures with this model and implement them with the HT system to find the achievable performance, investigate stability, and determine the most promising teleoperation scheme in the presence of time delays. We show that instability in HT, while not destructive or dangerous, makes the system impossible to use. However, stable and transparent teleoperation is possible with small time delays (<200 ms) through 3-channel teleoperation, or with large time delays through model-mediated teleoperation with local pose and force feedback for the novice.
Oogway: Designing, Implementing, and Testing an AUV for RoboSub 2023
The Duke Robotics Club is proud to present our robot for the 2023 RoboSub Competition: Oogway. Oogway marks one of the largest design overhauls in club history. Beyond a revamped form factor, some of Oogway's notable features include all-new computer vision software, advanced sonar integration, novel acoustics hardware processing, and upgraded stereoscopic cameras. Oogway was built on the principle of independent, well-integrated, and reliable subsystems. Individual components and subsystems were tested and designed separately. Oogway's most advanced capabilities are a result of the tight integration between these subsystems. Such examples include sonar-assisted computer vision algorithms and robot-agnostic controls configured in part through the robot's 3D model. The success of constructing and testing Oogway in under two years' time can be attributed to 20+ contributing club members, supporters within Duke's Pratt School of Engineering, and outside sponsors.
comment: arXiv admin note: text overlap with arXiv:2410.09684
Input-to-State Stable Coupled Oscillator Networks for Closed-form Model-based Control in Latent Space NeurIPS 2024
Even though a variety of methods have been proposed in the literature, efficient and effective latent-space control (i.e., control in a learned low-dimensional space) of physical systems remains an open challenge. We argue that a promising avenue is to leverage powerful and well-understood closed-form strategies from control theory literature in combination with learned dynamics, such as potential-energy shaping. We identify three fundamental shortcomings in existing latent-space models that have so far prevented this powerful combination: (i) they lack the mathematical structure of a physical system, (ii) they do not inherently conserve the stability properties of the real systems, (iii) these methods do not have an invertible mapping between input and latent-space forcing. This work proposes a novel Coupled Oscillator Network (CON) model that simultaneously tackles all these issues. More specifically, (i) we show analytically that CON is a Lagrangian system - i.e., it possesses well-defined potential and kinetic energy terms. Then, (ii) we provide formal proof of global Input-to-State stability using Lyapunov arguments. Moving to the experimental side, we demonstrate that CON reaches SoA performance when learning complex nonlinear dynamics of mechanical systems directly from images. An additional methodological innovation contributing to achieving this third goal is an approximated closed-form solution for efficient integration of network dynamics, which eases efficient training. We tackle (iii) by approximating the forcing-to-input mapping with a decoder that is trained to reconstruct the input based on the encoded latent space force. Finally, we show how these properties enable latent-space control. We use an integral-saturated PID with potential force compensation and demonstrate high-quality performance on a soft robot using raw pixels as the only feedback information.
comment: 38th Conference on Neural Information Processing Systems (NeurIPS 2024) spotlight, 49 pages
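The abstract above mentions an integral-saturated PID with potential force compensation but does not give its form. As a minimal sketch only, assuming a textbook PID whose integral term is clamped to a fixed bound, the idea could look as follows. The class name, the gains, and the `compensation` argument (a stand-in for the potential force term) are hypothetical and not taken from the paper.

```python
class SaturatedIntegralPID:
    """Textbook PID whose integral state is clamped (illustrative sketch only)."""

    def __init__(self, kp, ki, kd, integral_limit):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.integral_limit = integral_limit  # bound on the accumulated error
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt, compensation=0.0):
        # Accumulate the error, then clamp it so the integral term cannot wind up.
        self.integral += error * dt
        self.integral = max(-self.integral_limit,
                            min(self.integral_limit, self.integral))
        # Finite-difference derivative; zero on the first call.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        # `compensation` is a hypothetical feedforward term added to the PID output.
        return (self.kp * error + self.ki * self.integral
                + self.kd * derivative + compensation)
```

Clamping the integral state is one common anti-windup choice; the paper may saturate the integral differently.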
MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations
Shared dynamics models are important for capturing the complexity and variability inherent in Human-Robot Interaction (HRI). Therefore, learning such shared dynamics models can enhance coordination and adaptability to enable successful reactive interactions with a human partner. In this work, we propose a novel approach for learning a shared latent space representation for HRIs from demonstrations in a Mixture of Experts fashion for reactively generating robot actions from human observations. We train a Variational Autoencoder (VAE) to learn robot motions regularized using an informative latent space prior that captures the multimodality of the human observations via a Mixture Density Network (MDN). We show how our formulation derives from a Gaussian Mixture Regression formulation that is typically used in approaches for learning HRI from demonstrations, such as using an HMM/GMM for learning a joint distribution over the actions of the human and the robot. We further incorporate an additional regularization to prevent "mode collapse", a common phenomenon when using latent space mixture models with VAEs. We find that our approach of using an informative MDN prior from human observations for a VAE generates more accurate robot motions compared to previous HMM-based or recurrent approaches of learning shared latent representations, which we validate on various HRI datasets involving interactions such as handshakes, fistbumps, waving, and handovers. Further experiments in a real-world human-to-robot handover scenario show the efficacy of our approach for generating successful interactions with four different human interaction partners.
comment: Preprint version of paper accepted at IEEE RAL. Project URL: https://bit.ly/MoVEInt
CarbonFish -- A Bistable Underactuated Compliant Fish Robot capable of High Frequency Undulation
The Hair Clip Mechanism (HCM) is an innovative in-plane prestressed bistable mechanism, described in our preceding studies, devised to augment the functional capabilities of soft robotics. Compared with conventional soft and compliant robotic systems, HCMs exhibit pronounced rigidity, augmented mobility, repeatability, and an effective design and fabrication paradigm. In this research, we investigate the feasibility of using carbon fiber reinforced plastic (CFRP) as the base material for an HCM-based fish robot, referred to as CarbonFish. Our objective is to realize high-frequency undulatory motion, thereby laying the groundwork for faster aquatic locomotion in subsequent models. We present a comprehensive design and fabrication scheme underpinned by mathematical principles. Preliminary evaluations of our single-actuated CarbonFish show an undulation frequency approaching 10 Hz, suggesting its potential to outperform other biologically inspired aquatic robots as well as real fish.
A Parameter Privacy-Preserving Strategy for Mixed-Autonomy Platoon Control
It has been demonstrated that leading cruise control (LCC) can improve the operation of mixed-autonomy platoons by allowing connected and automated vehicles (CAVs) to make longitudinal control decisions based on the information provided by surrounding vehicles. However, LCC generally requires surrounding human-driven vehicles (HDVs) to share their real-time states, which can be used by adversaries to infer drivers' car-following behavior, potentially leading to financial losses or safety concerns. This paper aims to address such privacy concerns and protect the behavioral characteristics of HDVs by devising a parameter privacy-preserving approach for mixed-autonomy platoon control. First, we integrate a parameter privacy filter into LCC to protect sensitive car-following parameters. The privacy filter allows each vehicle to generate seemingly realistic pseudo states by distorting the true parameters to pseudo parameters, which can protect drivers' privacy in behavioral parameters without significantly influencing the control performance. Second, to enhance the reliability and practicality of the privacy filter within LCC, we first introduce an individual-level parameter privacy preservation constraint to the privacy filter, focusing on the privacy level of each individual parameter pair. Subsequently, we extend the current approach to accommodate continuous parameter spaces through a neural network estimator. Third, analysis of head-to-tail string stability reveals the potential impact of privacy filters in degrading mixed traffic flow performance. Simulation shows that this approach can effectively trade off privacy and control performance in LCC. We further demonstrate the benefit of such an approach in networked systems, i.e., by applying the privacy filter to a preceding vehicle, one can also achieve a certain level of privacy for the following vehicle.
Towards Open-World Grasping with Large Vision-Language Models
The ability to grasp objects in-the-wild from open-ended language instructions constitutes a fundamental challenge in robotics. An open-world grasping system should be able to combine high-level contextual with low-level physical-geometric reasoning in order to be applicable in arbitrary scenarios. Recent works exploit the web-scale knowledge inherent in large language models (LLMs) to plan and reason in robotic context, but rely on external vision and action models to ground such knowledge into the environment and parameterize actuation. This setup suffers from two major bottlenecks: a) the LLM's reasoning capacity is constrained by the quality of visual grounding, and b) LLMs do not contain low-level spatial understanding of the world, which is essential for grasping in contact-rich scenarios. In this work we demonstrate that modern vision-language models (VLMs) are capable of tackling such limitations, as they are implicitly grounded and can jointly reason about semantics and geometry. We propose OWG, an open-world grasping pipeline that combines VLMs with segmentation and grasp synthesis models to unlock grounded world understanding in three stages: open-ended referring segmentation, grounded grasp planning and grasp ranking via contact reasoning, all of which can be applied zero-shot via suitable visual prompting mechanisms. We conduct extensive evaluation in cluttered indoor scene datasets to showcase OWG's robustness in grounding from open-ended language, as well as open-world robotic grasping experiments in both simulation and hardware that demonstrate superior performance compared to previous supervised and zero-shot LLM-based methods. Project material is available at https://gtziafas.github.io/OWG_project/ .
comment: 8th Conference on Robot Learning (CoRL 2024), Munich, Germany
A Scalable and Parallelizable Digital Twin Framework for Sustainable Sim2Real Transition of Multi-Agent Reinforcement Learning Systems
Multi-agent reinforcement learning (MARL) systems usually require significantly long training times due to their inherent complexity. Furthermore, deploying them in the real world demands a feature-rich environment along with multiple embodied agents, which may not be feasible due to budget or space limitations, not to mention energy consumption and safety issues. This work tries to address these pain points by presenting a sustainable digital twin framework capable of accelerating MARL training by selectively scaling parallelized workloads on-demand, and transferring the trained policies from simulation to reality using minimal hardware resources. The applicability of the proposed digital twin framework is highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of agent and environment parallelization on training time and that of systematic domain randomization on zero-shot sim2real transfer across both the case studies. Results indicate up to 76.3% reduction in training time with the proposed parallelization scheme and as low as 2.9% sim2real gap using the suggested deployment method.
Learning-on-the-Drive: Self-supervised Adaptation of Visual Offroad Traversability Models IROS 2024
Autonomous offroad driving is essential for applications like emergency rescue, military operations, and agriculture. Despite progress, systems struggle with high-speed vehicles exceeding 10m/s due to the need for accurate long-range (> 50m) perception for safe navigation. Current approaches are limited by sensor constraints; LiDAR-based methods offer precise short-range data but are noisy beyond 30m, while visual models provide dense long-range measurements but falter with unseen scenarios. To overcome these issues, we introduce ALTER, a learning-on-the-drive perception framework that leverages both sensor types. ALTER uses a self-supervised visual model to learn and adapt from near-range LiDAR measurements, improving long-range prediction in new environments without manual labeling. It also includes a model selection module for better sensor failure response and adaptability to known environments. Testing in two real-world settings showed on average 43.4% better traversability prediction than LiDAR-only and 164% over non-adaptive state-of-the-art (SOTA) visual semantic methods after 45 seconds of online learning.
comment: 8 pages, IROS 2024
S.T.A.R.-Track: Latent Motion Models for End-to-End 3D Object Tracking with Adaptive Spatio-Temporal Appearance Representations
Following the tracking-by-attention paradigm, this paper introduces an object-centric, transformer-based framework for tracking in 3D. Traditional model-based tracking approaches incorporate the geometric effect of object- and ego motion between frames with a geometric motion model. Inspired by this, we propose S.T.A.R.-Track, which uses a novel latent motion model (LMM) to additionally adjust object queries to account for changes in viewing direction and lighting conditions directly in the latent space, while still modeling the geometric motion explicitly. Combined with a novel learnable track embedding that aids in modeling the existence probability of tracks, this results in a generic tracking framework that can be integrated with any query-based detector. Extensive experiments on the nuScenes benchmark demonstrate the benefits of our approach, showing state-of-the-art performance for DETR3D-based trackers while drastically reducing the number of identity switches of tracks at the same time.
comment: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
SwarmPRM: Probabilistic Roadmap Motion Planning for Large-Scale Swarm Robotic Systems IROS 2024
Large-scale swarm robotic systems consisting of numerous cooperative agents show considerable promise for performing autonomous tasks across various sectors. Nonetheless, traditional motion planning approaches often face a trade-off between scalability and solution quality due to the exponential growth of the joint state space of robots. In response, this work proposes SwarmPRM, a hierarchical, scalable, computationally efficient, and risk-aware sampling-based motion planning approach for large-scale swarm robots. SwarmPRM utilizes a Gaussian Mixture Model (GMM) to represent the swarm's macroscopic state and constructs a Probabilistic Roadmap in Gaussian space, referred to as the Gaussian roadmap, to generate a transport trajectory of GMM. This trajectory is then followed by each robot at the microscopic stage. To enhance trajectory safety, SwarmPRM incorporates the conditional value-at-risk (CVaR) in the collision checking process to impart the property of risk awareness to the constructed Gaussian roadmap. SwarmPRM then crafts a linear programming formulation to compute the optimal GMM transport trajectory within this roadmap. Extensive simulations demonstrate that SwarmPRM outperforms state-of-the-art methods in computational efficiency, scalability, and trajectory quality while offering the capability to adjust the risk tolerance of generated trajectories.
comment: Accepted by IROS 2024
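The SwarmPRM abstract above relies on the conditional value-at-risk (CVaR) for risk-aware collision checking. As a rough illustration of the quantity itself, the sketch below estimates CVaR from sampled losses as the mean of the worst (1 - alpha) fraction. The sample-based estimator, the function names, and the idea of thresholding a generic "collision loss" are assumptions made here for illustration; the paper's own formulation over Gaussian distributions is not reproduced.

```python
import math


def empirical_cvar(losses, alpha):
    """Estimate CVaR at level alpha as the mean of the worst (1 - alpha) share of losses."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not losses:
        raise ValueError("at least one loss sample is required")
    ordered = sorted(losses, reverse=True)  # largest (worst) losses first
    tail_size = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    tail = ordered[:tail_size]
    return sum(tail) / len(tail)


def is_risk_acceptable(losses, alpha, threshold):
    """Hypothetical check: accept a roadmap element if its CVaR stays below a threshold."""
    return empirical_cvar(losses, alpha) <= threshold
```

Raising `alpha` focuses the estimate on a smaller, more extreme tail, which is one way a risk tolerance could be tuned.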
An Earth Rover dataset recorded at the ICRA@40 party
The ICRA conference is celebrating its $40^{th}$ anniversary in Rotterdam in September 2024, with the Happy Birthday ICRA Party at the iconic Holland America Line Cruise Terminal as a highlight. One month later the IROS conference will take place, which will include the Earth Rover Challenge. In this challenge, open-world autonomous navigation models are studied in truly open-world settings. As part of the Earth Rover Challenge, several real-world navigation datasets were recorded in several cities worldwide, like Auckland, Australia and Wuhan, China. The only dataset recorded in the Netherlands is from the small village of Oudewater. The proposal is to record a dataset with the robot used in the Earth Rover Challenge in Rotterdam, in front of the Holland America Line Cruise Terminal, before the festivities of the Happy Birthday ICRA Party start. See: https://github.com/SlamMate/vSLAM-on-FrodoBots-2K
comment: 4 pages, presented as Late-Breaking extended abstract to IEEE Conference on Robotics and Automation @40
Map-based Modular Approach for Zero-shot Embodied Question Answering IROS 2024
Embodied Question Answering (EQA) serves as a benchmark task to evaluate the capability of robots to navigate within novel environments and identify objects in response to human queries. However, existing EQA methods often rely on simulated environments and operate with limited vocabularies. This paper presents a map-based modular approach to EQA, enabling real-world robots to explore and map unknown environments. By leveraging foundation models, our method facilitates answering a diverse range of questions using natural language. We conducted extensive experiments in both virtual and real-world settings, demonstrating the robustness of our approach in navigating and comprehending queries within unknown environments.
comment: IROS 2024
A fixed-parameter tractable algorithm for combinatorial filter reduction
What is the minimal information that a robot must retain to achieve its task? To design economical robots, the literature dealing with reduction of combinatorial filters approaches this problem algorithmically. As lossless state compression is NP-hard, prior work has examined, along with minimization algorithms, a variety of special cases in which specific properties enable efficient solution. Complementing those findings, this paper refines the present understanding from the perspective of parameterized complexity. We give a fixed-parameter tractable algorithm for the general reduction problem by exploiting a transformation into clique covering. The transformation introduces new constraints that arise from sequential dependencies encoded within the input filter -- some of these constraints can be repaired, others are treated through enumeration. Through this approach, we identify parameters affecting filter reduction that are based upon inter-constraint couplings (expressed as a notion of their height and width), which add to the structural parameters present in the unconstrained problem of minimal clique covering. Compared with existing work, we precisely identify and quantitatively characterize those features that contribute to the problem's hardness: given a problem instance, the combinatorial core may be a fraction of the instance's full size, with a small subset of constraints needing to be considered, and even those may have directly identifiable couplings that collapse degrees of freedom in the enumeration.
comment: 19 pages, 5 figures
Multiagent Systems
A Multi-LLM Orchestration Engine for Personalized, Context-Rich Assistance
In recent years, large language models have demonstrated remarkable capabilities in natural language understanding and generation. However, these models often struggle with hallucinations and maintaining long-term contextual relevance, particularly when dealing with private or local data. This paper presents a novel architecture that addresses these challenges by integrating an orchestration engine that utilizes multiple LLMs in conjunction with a temporal graph database and a vector database. The proposed system captures user interactions, builds a graph representation of conversations, and stores nodes and edges that map associations between key concepts, entities, and behaviors over time. This graph-based structure allows the system to develop an evolving understanding of the user preferences, providing personalized and contextually relevant answers. In addition to this, a vector database encodes private data to supply detailed information when needed, allowing the LLM to access and synthesize complex responses. To further enhance reliability, the orchestration engine coordinates multiple LLMs to generate comprehensive answers and iteratively reflect on their accuracy. The result is an adaptive, privacy-centric AI assistant capable of offering deeper, more relevant interactions while minimizing the risk of hallucinations. This paper outlines the architecture, methodology, and potential applications of this system, contributing a new direction in personalized, context-aware AI assistance.
Crowd IQ -- Aggregating Opinions to Boost Performance AAMAS
We show how the quality of decisions based on the aggregated opinions of the crowd can be conveniently studied using a sample of individual responses to a standard IQ questionnaire. We aggregated the responses to the IQ questionnaire using simple majority voting and a machine learning approach based on a probabilistic graphical model. The score for the aggregated questionnaire, Crowd IQ, serves as a quality measure of decisions based on aggregating opinions, which also allows quantifying individual and crowd performance on the same scale. We show that Crowd IQ grows quickly with the size of the crowd but saturates, and that for small homogeneous crowds the Crowd IQ significantly exceeds the IQ of even their most intelligent member. We investigate alternative ways of aggregating the responses and the impact of the aggregation method on the resulting Crowd IQ. We also discuss Contextual IQ, a method of quantifying the individual participant's contribution to the Crowd IQ based on the Shapley value from cooperative game theory.
comment: Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2012
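One of the two aggregation methods named in the Crowd IQ abstract is simple majority voting. The sketch below shows that step on a toy multiple-choice questionnaire; the data, the function names, and the tie-breaking rule (first answer encountered) are illustrative assumptions, and the probabilistic graphical model is not covered.

```python
from collections import Counter


def majority_vote(responses):
    """Aggregate answers question by question using simple majority.

    responses: one list of answers per respondent, all of the same length.
    Ties go to the answer seen first, an arbitrary choice for this sketch.
    """
    n_questions = len(responses[0])
    aggregated = []
    for q in range(n_questions):
        counts = Counter(respondent[q] for respondent in responses)
        aggregated.append(counts.most_common(1)[0][0])
    return aggregated


def score(answers, key):
    """Number of answers that match the answer key."""
    return sum(a == k for a, k in zip(answers, key))


# Toy data (hypothetical): three respondents, four questions.
key = ["A", "B", "C", "D"]
crowd = [
    ["A", "B", "D", "D"],
    ["A", "C", "C", "D"],
    ["B", "B", "C", "A"],
]
crowd_answers = majority_vote(crowd)
```

In this toy case the aggregated answer sheet scores 4, while the individual respondents score 3, 3, and 2, mirroring the abstract's point that a crowd can exceed its best member.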
Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models NeurIPS 2024
The in-context learning (ICL) capability of pre-trained models based on the transformer architecture has received growing interest in recent years. While theoretical understanding has been obtained for ICL in reinforcement learning (RL), the previous results are largely confined to the single-agent setting. This work proposes to further explore the in-context learning capabilities of pre-trained transformer models in competitive multi-agent games, i.e., in-context game-playing (ICGP). Focusing on the classical two-player zero-sum games, theoretical guarantees are provided to demonstrate that pre-trained transformers can provably learn to approximate Nash equilibrium in an in-context manner for both decentralized and centralized learning settings. As a key part of the proof, constructional results are established to demonstrate that the transformer architecture is sufficiently rich to realize celebrated multi-agent game-playing algorithms, in particular, decentralized V-learning and centralized VI-ULCB.
comment: Accepted to NeurIPS 2024
A Scalable and Parallelizable Digital Twin Framework for Sustainable Sim2Real Transition of Multi-Agent Reinforcement Learning Systems
Multi-agent reinforcement learning (MARL) systems usually require significantly long training times due to their inherent complexity. Furthermore, deploying them in the real world demands a feature-rich environment along with multiple embodied agents, which may not be feasible due to budget or space limitations, not to mention energy consumption and safety issues. This work tries to address these pain points by presenting a sustainable digital twin framework capable of accelerating MARL training by selectively scaling parallelized workloads on-demand, and transferring the trained policies from simulation to reality using minimal hardware resources. The applicability of the proposed digital twin framework is highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of agent and environment parallelization on training time and that of systematic domain randomization on zero-shot sim2real transfer across both the case studies. Results indicate up to 76.3% reduction in training time with the proposed parallelization scheme and as low as 2.9% sim2real gap using the suggested deployment method.
Communication-Efficient Soft Actor-Critic Policy Collaboration via Regulated Segment Mixture
Multi-Agent Reinforcement Learning (MARL) has emerged as a foundational approach for addressing diverse, intelligent control tasks in various scenarios like the Internet of Vehicles, Internet of Things, and Unmanned Aerial Vehicles. However, the widely assumed existence of a central node for centralized, federated learning-assisted MARL might be impractical in highly dynamic environments. This can lead to excessive communication overhead, potentially overwhelming the system. To address these challenges, we design a novel communication-efficient, fully distributed algorithm for collaborative MARL under the frameworks of Soft Actor-Critic (SAC) and Decentralized Federated Learning (DFL), named RSM-MASAC. In particular, RSM-MASAC enhances multi-agent collaboration and prioritizes higher communication efficiency in dynamic systems by incorporating the concept of segmented aggregation in DFL and augmenting multiple model replicas from received neighboring policy segments, which are subsequently employed as reconstructed referential policies for mixing. Distinctively diverging from traditional RL approaches, RSM-MASAC introduces new bounds under the framework of Maximum Entropy Reinforcement Learning (MERL). Correspondingly, it adopts a theory-guided mixture metric to regulate the selection of contributive referential policies, thus guaranteeing soft policy improvement during the communication-assisted mixing phase. Finally, the extensive simulations in mixed-autonomy traffic control scenarios verify the effectiveness and superiority of our algorithm.
Deep Calibration of Multi-Agent Model for Simulating Real-World Stock Trading
A multi-agent market model is a stock trading simulation system that generates order flow given the agent variable of the model. We study calibrating the agent variable to simulate the order flow of any given historical trading day. In contrast to traditional calibration, which relies on inefficient iterative search, we propose DeepCal, the first search-free approach that uses deep learning to calibrate a multi-agent market model. DeepCal learns from a novel surrogate-trading loss function to address the non-differentiability induced by the multi-agent model and introduces a condition-aware variable estimator, adapting the trading simulation to different market conditions to enhance explainability. Through extensive experiments on real order-book data over a whole year, DeepCal has demonstrated comparable simulation accuracy (<0.36 in Kolmogorov-Smirnov statistic) to traditional search-based approaches without the need for variable search, and can effectively capture the correlation between the agent variable and multiple market-condition indexes (PPI, PMI, CPI, market trend and market noise).
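The accuracy figure above is reported as a Kolmogorov-Smirnov statistic. As a brief illustration of that metric, the sketch below computes the two-sample KS statistic, i.e. the largest gap between two empirical distribution functions. Which quantities the paper compares (for example simulated versus real order-flow features) is not stated in the abstract, so the inputs here are generic samples.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)| over all x."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    gap = 0.0
    # The empirical CDFs only change at observed points, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap
```

A value of 0 means the two empirical distributions coincide, and 1 means they do not overlap at all.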
Systems and Control (CS)
Efficient IC-Based Solutions for Medical Devices and Automotive Radars
This thesis focuses on developing integrated circuit (IC) solutions for medical devices and automotive radars, and is divided into two main parts. Part One presents the design and evaluation of a miniaturized multi-chip module (MCM) solution intended to deliver well-defined, charge-balanced current stimuli directly to the inner ear. This section emphasizes the design of the supply chip, which includes a DC-DC converter. It involves a comprehensive study aimed at optimizing and enhancing the efficiency of the design. Part Two investigates the fundamental principles of designing millimeter-wave (mmWave) voltage-controlled oscillators (VCOs). This section introduces a VCO with state-of-the-art performance, showcasing advancements in mmWave technology. Overall, this thesis contributes to both the medical device field and automotive radar technology through innovative IC solutions.
comment: PhD thesis
Improving accuracy and convergence of federated learning edge computing methods for generalized DER forecasting applications in power grid NeurIPS 2022
This proposal aims to develop more accurate federated learning (FL) methods with faster convergence properties and lower communication requirements, specifically for forecasting distributed energy resources (DER) such as renewables, energy storage, and loads in modern, low-carbon power grids. This will be achieved by (i) leveraging recently developed extensions of FL such as hierarchical and iterative clustering to improve performance with non-IID data, (ii) experimenting with different types of FL global models well-suited to time-series data, and (iii) incorporating domain-specific knowledge from power systems to build more general FL frameworks and architectures that can be applied to diverse types of DERs beyond just load forecasting, and with heterogeneous clients.
comment: Presented at the NeurIPS 2022 Tackling Climate Change with Machine Learning workshop
FedECADO: A Dynamical System Model of Federated Learning
Federated learning harnesses the power of distributed optimization to train a unified machine learning model across separate clients. However, heterogeneous data distributions and computational workloads can lead to inconsistent updates and limit model performance. This work tackles these challenges by proposing FedECADO, a new algorithm inspired by a dynamical system representation of the federated learning process. FedECADO addresses non-IID data distribution through an aggregate sensitivity model that reflects the amount of data processed by each client. To tackle heterogeneous computing, we design a multi-rate integration method with adaptive step-size selections that synchronizes active client updates in continuous time. Compared to prominent techniques, including FedProx and FedNova, FedECADO achieves higher classification accuracies in numerous heterogeneous scenarios.
Optimal Set-Membership Smoothing
This article studies the Set-Membership Smoothing (SMSing) problem for non-stochastic Hidden Markov Models. By adopting the mathematical concept of uncertain variables, an optimal SMSing framework is established for the first time. This optimal framework reveals the principles of SMSing and the relationship between set-membership filtering and smoothing. Based on the design principles, we put forward two SMSing algorithms: one for linear systems with zonotopic constrained uncertainties, where the solution is given in a closed form, and the other for a class of nonlinear systems. Numerical simulations corroborate the effectiveness of our theoretical results.
comment: 7 pages
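The set-membership abstract above mentions linear systems with zonotopic uncertainties. For orientation only, the sketch below shows standard zonotope arithmetic for one prediction step of x+ = A x + w, where x lies in a zonotope with center c and generator matrix G, and w lies in a zero-centered zonotope with generators W. This is generic set propagation, not the paper's smoothing algorithm, and all function names are hypothetical.

```python
def mat_vec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_mat(A, B):
    """Matrix-matrix product for nested lists."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def predict(center, generators, A, noise_generators):
    """Propagate the zonotope <center, generators> through x+ = A x + w.

    Generator matrices are stored as rows per state dimension, columns per generator.
    The linear map acts on the center and generators; the Minkowski sum with the
    noise zonotope appends the noise generators as extra columns.
    """
    new_center = mat_vec(A, center)
    mapped = mat_mat(A, generators)
    new_generators = [m + w for m, w in zip(mapped, noise_generators)]
    return new_center, new_generators


def interval_hull(center, generators):
    """Tightest axis-aligned box that contains the zonotope."""
    radii = [sum(abs(g) for g in row) for row in generators]
    return [(c - r, c + r) for c, r in zip(center, radii)]
```

The closed-form character of such operations (a matrix product plus column concatenation) is what makes zonotopes convenient for linear systems.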
Flexible Operation of Electricity-HCNG Networks with Variable Hydrogen Fraction: A Distributionally Robust Joint Chance-Constrained Approach
Hydrogen-enriched compressed natural gas (HCNG) is a promising way to utilize surplus renewable energy through hydrogen electrolysis and blending it into natural gas. However, the optimal hydrogen volume fraction (HVF) of HCNG varies following the daily fluctuations of renewable energy. Besides, facing the rapid volatility of renewable energy, ensuring rapid and reliable real-time adjustments is challenging for electricity-HCNG (E-HCNG) coupling networks. To this end, this paper proposes a flexible operation framework for E-HCNG networks against the fluctuations and volatility of renewable energy. Based on operations with variable HVF, the framework develops an E-HCNG system-level affine policy, which allows real-time re-dispatch of operations according to the volatility. Meanwhile, to guarantee the operational reliability of the affine policy, a distributionally robust joint chance constraint (DRJCC) is introduced, which limits the violation probability of operational constraints under the uncertainties of renewable energy volatility. Furthermore, in the solving process, to mitigate the over-conservatism in DRJCC decomposition, an improved risk allocation method is proposed, utilizing the correlations among violations under the affine policy. Moreover, to tackle the non-convexities arising from the variable HVF, customized approximations for HCNG flow formulations are developed. The problem is finally reformulated into a mixed-integer second-order cone programming problem. The effectiveness of the proposed method is validated in both small-scale and large-scale experiments.
Flying Quadrotors in Tight Formations using Learning-based Model Predictive Control
Flying quadrotors in tight formations is a challenging problem. It is known that in the near-field airflow of a quadrotor, the aerodynamic effects induced by the propellers are complex and difficult to characterize. Although machine learning tools can potentially be used to derive models that capture these effects, these data-driven approaches can be sample inefficient and the resulting models often do not generalize as well as their first-principles counterparts. In this work, we propose a framework that combines the benefits of first-principles modeling and data-driven approaches to construct an accurate and sample efficient representation of the complex aerodynamic effects resulting from quadrotors flying in formation. The data-driven component within our model is lightweight, making it amenable for optimization-based control design. Through simulations and physical experiments, we show that incorporating the model into a novel learning-based nonlinear model predictive control (MPC) framework results in substantial performance improvements in terms of trajectory tracking and disturbance rejection. In particular, our framework significantly outperforms nominal MPC in physical experiments, achieving a 40.1% improvement in the average trajectory tracking errors and a 57.5% reduction in the maximum vertical separation errors. Our framework also achieves exceptional sample efficiency, using only a total of 46 seconds of flight data for training across both simulations and physical experiments. Furthermore, with our proposed framework, the quadrotors achieve an exceptionally tight formation, flying with an average separation of less than 1.5 body lengths throughout the flight. A video illustrating our framework and physical experiments is given here: https://youtu.be/Hv-0JiVoJGo
comment: 7 pages, 5 figures
Generalization of Compositional Tasks with Logical Specification via Implicit Planning
In this work, we study the problem of learning generalizable policies for compositional tasks given by a logic specification. These tasks are composed of temporally extended subgoals. Due to the dependencies among subgoals and the long task horizon, previous reinforcement learning (RL) algorithms, e.g., task-conditioned and goal-conditioned policies, still suffer from slow convergence and sub-optimality when solving the generalization problem of compositional tasks. To tackle these issues, this paper proposes a new hierarchical RL framework for the efficient and optimal generalization of compositional tasks. At the high level, we propose a new implicit planner designed specifically for generalizing compositional tasks. Specifically, the planner selects the next sub-task and estimates the multi-step return of completing the rest of the task from the current state. It learns a latent transition model and conducts planning in the latent space based on a graph neural network (GNN). Then, the next sub-task selected by the high level guides the low-level agent to solve long-horizon tasks efficiently, and the multi-step return makes the low-level policy consider dependencies of future sub-tasks. We conduct comprehensive experiments to show the advantage of the proposed framework over previous methods in terms of optimality and efficiency.
Integrating Reinforcement Learning and Large Language Models for Crop Production Process Management Optimization and Control through A New Knowledge-Based Deep Learning Paradigm
Efficient and sustainable crop production process management is crucial to meet the growing global demand for food, fuel, and feed while minimizing environmental impacts. Traditional crop management practices, often developed through empirical experience, face significant challenges in adapting to the dynamic nature of modern agriculture, which is influenced by factors such as climate change, soil variability, and market conditions. Recently, reinforcement learning (RL) and large language models (LLMs) have brought transformative potential, with RL providing adaptive methodologies to learn optimal strategies and LLMs offering vast, superhuman knowledge across agricultural domains, enabling informed, context-specific decision-making. This paper systematically examines how the integration of RL and LLMs into crop management decision support systems (DSSs) can drive advancements in agricultural practice. We explore recent advancements in RL and LLM algorithms, their application within crop management, and the use of crop management simulators to develop these technologies. The convergence of RL and LLMs with crop management DSSs presents new opportunities to optimize agricultural practices through data-driven, adaptive solutions that can address the uncertainties and complexities of crop production. However, this integration also brings challenges, particularly in real-world deployment. We discuss these challenges and propose potential solutions, including the use of offline RL and enhanced LLM integration, to maximize the effectiveness and sustainability of crop management. Our findings emphasize the need for continued research and innovation to unlock the full potential of these advanced tools in transforming agricultural systems into optimal and controllable ones.
comment: 13 pages
Stability and Transparency in Mixed Reality Bilateral Human Teleoperation
Recent work introduced the concept of human teleoperation (HT), where the remote robot typically considered in conventional bilateral teleoperation is replaced by a novice person wearing a mixed reality head-mounted display and tracking the motion of a virtual tool controlled by an expert. HT has advantages in cost, complexity, and patient acceptance for telemedicine in low-resource communities or remote locations. However, the stability, transparency, and performance of bilateral HT are unexplored. In this paper, we therefore develop a mathematical model and simulation of the HT system using test data. We then analyze various control architectures with this model and implement them with the HT system to find the achievable performance, investigate stability, and determine the most promising teleoperation scheme in the presence of time delays. We show that instability in HT, while not destructive or dangerous, makes the system impossible to use. However, stable and transparent teleoperation is possible with small time delays (<200 ms) through 3-channel teleoperation, or with large time delays through model-mediated teleoperation with local pose and force feedback for the novice.
Input-to-State Stable Coupled Oscillator Networks for Closed-form Model-based Control in Latent Space NeurIPS 2024
Even though a variety of methods have been proposed in the literature, efficient and effective latent-space control (i.e., control in a learned low-dimensional space) of physical systems remains an open challenge. We argue that a promising avenue is to leverage powerful and well-understood closed-form strategies from control theory literature in combination with learned dynamics, such as potential-energy shaping. We identify three fundamental shortcomings in existing latent-space models that have so far prevented this powerful combination: (i) they lack the mathematical structure of a physical system, (ii) they do not inherently conserve the stability properties of the real systems, (iii) these methods do not have an invertible mapping between input and latent-space forcing. This work proposes a novel Coupled Oscillator Network (CON) model that simultaneously tackles all these issues. More specifically, (i) we show analytically that CON is a Lagrangian system - i.e., it possesses well-defined potential and kinetic energy terms. Then, (ii) we provide formal proof of global Input-to-State stability using Lyapunov arguments. Moving to the experimental side, we demonstrate that CON reaches SoA performance when learning complex nonlinear dynamics of mechanical systems directly from images. An additional methodological innovation contributing to achieving this third goal is an approximated closed-form solution for efficient integration of network dynamics, which eases efficient training. We tackle (iii) by approximating the forcing-to-input mapping with a decoder that is trained to reconstruct the input based on the encoded latent space force. Finally, we show how these properties enable latent-space control. We use an integral-saturated PID with potential force compensation and demonstrate high-quality performance on a soft robot using raw pixels as the only feedback information.
comment: 38th Conference on Neural Information Processing Systems (NeurIPS 2024) spotlight, 49 pages
Signal Temporal Logic Control Synthesis among Uncontrollable Dynamic Agents with Conformal Prediction
The control of dynamical systems under temporal logic specifications among uncontrollable dynamic agents is challenging due to the agents' a-priori unknown behavior. Existing works have considered the problem where either all agents are controllable, the agent models are deterministic and known, or no safety guarantees are provided. We propose a predictive control synthesis framework that guarantees, with high probability, the satisfaction of signal temporal logic (STL) tasks that are defined over a controllable system in the presence of uncontrollable stochastic agents. We use trajectory predictors and conformal prediction to construct probabilistic prediction regions for each uncontrollable agent that are valid over multiple future time steps. Specifically, we construct a normalized prediction region over all agents and time steps to reduce conservatism and increase data efficiency. We then formulate a worst-case bilevel mixed integer program (MIP) that accounts for all agent realizations within the prediction region to obtain an open-loop controller that provably guarantees task satisfaction with high probability. To efficiently solve this bilevel MIP, we propose an equivalent MIP based on the KKT conditions of the original bilevel formulation. Building upon this, we design a closed-loop controller, where both recursive feasibility and task satisfaction can be guaranteed with high probability. We illustrate our control synthesis framework on two case studies.
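The conformal prediction step mentioned in this abstract can be illustrated with a minimal sketch of standard split conformal prediction. This is not the paper's implementation: the function names, the per-step scaling factors, and the use of a single agent are all assumptions made for illustration. The idea is that, given calibration scores from held-out trajectories, the appropriate order statistic bounds a fresh score with probability at least 1 - delta, and taking the worst scaled error over the horizon gives one region that covers all time steps at once.

```python
import math


def conformal_radius(scores, delta):
    """Split conformal quantile.

    Returns C such that a new score, exchangeable with the calibration
    scores, satisfies score <= C with probability at least 1 - delta.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - delta))  # rank of the order statistic
    if k > n:
        return math.inf  # too few calibration samples for this delta
    return sorted(scores)[k - 1]


def trajectory_score(errors, scales):
    """Hypothetical normalized score: worst scaled error over the horizon.

    errors[t] is the prediction error at step t, scales[t] a positive
    normalization constant (an assumption, e.g. a typical error size).
    """
    return max(e / s for e, s in zip(errors, scales))


def region_radii(calibration_errors, scales, delta):
    """Per-step radii of a prediction region valid jointly over all steps."""
    scores = [trajectory_score(errs, scales) for errs in calibration_errors]
    c = conformal_radius(scores, delta)
    return [c * s for s in scales]
```

Because a single scalar score summarizes the whole trajectory, no union bound over time steps is needed, which is one common way to reduce conservatism in multi-step regions.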
From Optimization to Control: Quasi Policy Iteration
Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we review this analogy across four problem classes with a unified solution characterization allowing for a systematic transformation of algorithms from one domain to the other. In particular, we identify equivalent optimization and control algorithms that have already been pointed out in the existing literature, but mostly in a scattered way. With this unifying framework in mind, we adopt the quasi-Newton method from convex optimization to introduce a novel control algorithm coined as quasi-policy iteration (QPI). In particular, QPI is based on a novel approximation of the "Hessian" matrix in the policy iteration algorithm by exploiting two linear structural constraints specific to MDPs and by allowing for the incorporation of prior information on the transition probability kernel. While the proposed algorithm has the same computational complexity as value iteration, it interestingly exhibits an empirical convergence behavior similar to policy iteration with a very low sensitivity to the discount factor.
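The optimization analogy in this abstract can be made concrete with a small sketch contrasting value iteration (a fixed-point iteration) with policy iteration (which is known to coincide with Newton's method applied to the Bellman residual). The code below does not implement QPI itself; the two-state MDP, its costs, and all function names are hypothetical and chosen only to show the two baselines that QPI is described as sitting between.

```python
GAMMA = 0.9
# Hypothetical 2-state, 2-action MDP.
# P[s][a][s2] is a transition probability, C[s][a] a stage cost.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.1, 0.9]]]
C = [[1.0, 2.0],
     [0.5, 3.0]]


def q_value(v, s, a):
    """Cost of taking action a in state s, then following value v."""
    return C[s][a] + GAMMA * sum(p * v2 for p, v2 in zip(P[s][a], v))


def bellman(v):
    """Bellman optimality operator for cost minimization."""
    return [min(q_value(v, s, a) for a in range(2)) for s in range(2)]


def greedy(v):
    """Policy that is greedy with respect to v."""
    return [min(range(2), key=lambda a: q_value(v, s, a)) for s in range(2)]


def evaluate(policy):
    """Solve (I - gamma * P_pi) v = c_pi exactly (2x2 Cramer's rule)."""
    a = 1 - GAMMA * P[0][policy[0]][0]
    b = -GAMMA * P[0][policy[0]][1]
    c = -GAMMA * P[1][policy[1]][0]
    d = 1 - GAMMA * P[1][policy[1]][1]
    e, f = C[0][policy[0]], C[1][policy[1]]
    det = a * d - b * c
    return [(e * d - b * f) / det, (a * f - e * c) / det]


def value_iteration(tol=1e-10):
    """Repeat v <- T(v) until the update is below tol; cheap steps, many of them."""
    v, iters = [0.0, 0.0], 0
    while True:
        v_new = bellman(v)
        iters += 1
        if max(abs(x - y) for x, y in zip(v_new, v)) < tol:
            return v_new, iters
        v = v_new


def policy_iteration():
    """Alternate exact evaluation and greedy improvement; costly steps, few of them."""
    policy, iters = [1, 1], 0
    while True:
        v = evaluate(policy)
        iters += 1
        new_policy = greedy(v)
        if new_policy == policy:
            return v, iters
        policy = new_policy
```

The linear solve in `evaluate` plays the role of the Newton step; according to the abstract, QPI replaces this exact "Hessian" with a cheaper approximation so that each iteration costs about as much as one value-iteration sweep.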
TSViT: A Time Series Vision Transformer for Fault Diagnosis
Traditional fault diagnosis methods using Convolutional Neural Networks (CNNs) often struggle with capturing the temporal dynamics of vibration signals. To overcome this, the application of Transformer-based Vision Transformer (ViT) methods to fault diagnosis is gaining traction. Nonetheless, these methods typically require extensive preprocessing, which increases computational complexity, potentially reducing the efficiency of the diagnosis process. Addressing this gap, this paper presents the Time Series Vision Transformer (TSViT), tailored for effective fault diagnosis. TSViT incorporates a convolutional layer to extract local features from vibration signals, alongside a transformer encoder to discern long-term temporal patterns. A thorough experimental comparison on three diverse datasets demonstrates TSViT's effectiveness and adaptability. Moreover, the paper delves into the influence of hyperparameter tuning on the model's performance, computational demand, and parameter count. Remarkably, TSViT achieves an unprecedented 100% average accuracy on two test sets and 99.99% on another, showcasing its exceptional diagnostic capabilities.
A Parameter Privacy-Preserving Strategy for Mixed-Autonomy Platoon Control
It has been demonstrated that leading cruise control (LCC) can improve the operation of mixed-autonomy platoons by allowing connected and automated vehicles (CAVs) to make longitudinal control decisions based on the information provided by surrounding vehicles. However, LCC generally requires surrounding human-driven vehicles (HDVs) to share their real-time states, which can be used by adversaries to infer drivers' car-following behavior, potentially leading to financial losses or safety concerns. This paper aims to address such privacy concerns and protect the behavioral characteristics of HDVs by devising a parameter privacy-preserving approach for mixed-autonomy platoon control. First, we integrate a parameter privacy filter into LCC to protect sensitive car-following parameters. The privacy filter allows each vehicle to generate seemingly realistic pseudo states by distorting the true parameters to pseudo parameters, which can protect drivers' privacy in behavioral parameters without significantly influencing the control performance. Second, to enhance the reliability and practicality of the privacy filter within LCC, we first introduce an individual-level parameter privacy preservation constraint to the privacy filter, focusing on the privacy level of each individual parameter pair. Subsequently, we extend the current approach to accommodate continuous parameter spaces through a neural network estimator. Third, analysis of head-to-tail string stability reveals the potential impact of privacy filters in degrading mixed traffic flow performance. Simulation shows that this approach can effectively trade off privacy and control performance in LCC. We further demonstrate the benefit of such an approach in networked systems, i.e., by applying the privacy filter to a preceding vehicle, one can also achieve a certain level of privacy for the following vehicle.
Towards a Deeper Understanding of Transformer for Residential Non-intrusive Load Monitoring
Transformer models have demonstrated impressive performance in Non-Intrusive Load Monitoring (NILM) applications in recent years. Despite their success, existing studies have not thoroughly examined the impact of various hyper-parameters on model performance, which is crucial for advancing high-performing transformer models. In this work, a comprehensive series of experiments has been conducted to analyze the influence of these hyper-parameters in the context of residential NILM. This study delves into the effects of the number of hidden dimensions in the attention layer, the number of attention layers, the number of attention heads, and the dropout ratio on transformer performance. Furthermore, the role of the masking ratio has been explored in BERT-style transformer training, providing a detailed investigation into its impact on NILM tasks. Based on these experiments, the optimal hyper-parameters have been selected and used to train a transformer model, which surpasses the performance of existing models. The experimental findings offer valuable insights and guidelines for optimizing transformer architectures, aiming to enhance their effectiveness and efficiency in NILM applications. It is expected that this work will serve as a foundation for future research and development of more robust and capable transformer models for NILM.
comment: Accepted to 4th IEEE-ICISET
Systems and Control (EESS)
Efficient IC-Based Solutions for Medical Devices and Automotive Radars
This thesis focuses on developing integrated circuit (IC) solutions for medical devices and automotive radars, and is divided into two main parts. Part One presents the design and evaluation of a miniaturized multi-chip module (MCM) solution intended to deliver well-defined, charge-balanced current stimuli directly to the inner ear. This section emphasizes the design of the supply chip, which includes a DC-DC converter. It involves a comprehensive study aimed at optimizing and enhancing the efficiency of the design. Part Two investigates the fundamental principles of designing millimeter-wave (mmWave) voltage-controlled oscillators (VCOs). This section introduces a VCO with state-of-the-art performance, showcasing advancements in mmWave technology. Overall, this thesis contributes to both the medical device field and automotive radar technology through innovative IC solutions.
comment: PhD thesis
Improving accuracy and convergence of federated learning edge computing methods for generalized DER forecasting applications in power grid NeurIPS 2022
This proposal aims to develop more accurate federated learning (FL) methods with faster convergence properties and lower communication requirements, specifically for forecasting distributed energy resources (DER) such as renewables, energy storage, and loads in modern, low-carbon power grids. This will be achieved by (i) leveraging recently developed extensions of FL such as hierarchical and iterative clustering to improve performance with non-IID data, (ii) experimenting with different types of FL global models well-suited to time-series data, and (iii) incorporating domain-specific knowledge from power systems to build more general FL frameworks and architectures that can be applied to diverse types of DERs beyond just load forecasting, and with heterogeneous clients.
comment: Presented at the NeurIPS 2022 Tackling Climate Change with Machine Learning workshop
FedECADO: A Dynamical System Model of Federated Learning
Federated learning harnesses the power of distributed optimization to train a unified machine learning model across separate clients. However, heterogeneous data distributions and computational workloads can lead to inconsistent updates and limit model performance. This work tackles these challenges by proposing FedECADO, a new algorithm inspired by a dynamical system representation of the federated learning process. FedECADO addresses non-IID data distribution through an aggregate sensitivity model that reflects the amount of data processed by each client. To tackle heterogeneous computing, we design a multi-rate integration method with adaptive step-size selections that synchronizes active client updates in continuous time. Compared to prominent techniques, including FedProx and FedNova, FedECADO achieves higher classification accuracies in numerous heterogeneous scenarios.
Optimal Set-Membership Smoothing
This article studies the Set-Membership Smoothing (SMSing) problem for non-stochastic Hidden Markov Models. By adopting the mathematical concept of uncertain variables, an optimal SMSing framework is established for the first time. This optimal framework reveals the principles of SMSing and the relationship between set-membership filtering and smoothing. Based on the design principles, we put forward two SMSing algorithms: one for linear systems with zonotopic constrained uncertainties, where the solution is given in a closed form, and the other for a class of nonlinear systems. Numerical simulations corroborate the effectiveness of our theoretical results.
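For readers unfamiliar with zonotopic uncertainty sets, the following sketch shows the set operations that linear set-membership estimators are typically built from. It is not the smoothing algorithm of this article; the representation (a center plus a generator matrix with one column per generator), the function names, and the prediction step shown are assumptions for illustration. A zonotope is the set {c + G ξ : ‖ξ‖∞ ≤ 1}, and it is closed under linear maps and Minkowski sums, which is why closed-form updates are available for linear systems.

```python
def mat_vec(A, x):
    """Matrix-vector product for lists of lists."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def mat_mat(A, B):
    """Matrix-matrix product for lists of lists."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def linear_map(A, zono):
    """Image of the zonotope (c, G) under x -> A x, namely (A c, A G)."""
    c, G = zono
    return mat_vec(A, c), mat_mat(A, G)


def minkowski_sum(z1, z2):
    """Minkowski sum: add the centers and concatenate the generators."""
    (c1, G1), (c2, G2) = z1, z2
    center = [a + b for a, b in zip(c1, c2)]
    generators = [r1 + r2 for r1, r2 in zip(G1, G2)]
    return center, generators


def interval_hull(zono):
    """Tightest axis-aligned box containing the zonotope."""
    c, G = zono
    radii = [sum(abs(g) for g in row) for row in G]
    return [(ci - r, ci + r) for ci, r in zip(c, radii)]


def predict(A, state_set, noise_set):
    """One prediction step for x_next = A x + w with x and w in zonotopes."""
    return minkowski_sum(linear_map(A, state_set), noise_set)
```

For example, `predict(A, X, W)` with a unit-box `X` and a one-generator `W` returns a zonotope with three generators, and `interval_hull` gives simple bounds on each state component.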
comment: 7 pages
Flexible Operation of Electricity-HCNG Networks with Variable Hydrogen Fraction: A Distributionally Robust Joint Chance-Constrained Approach
Hydrogen-enriched compressed natural gas (HCNG) is a promising way to utilize surplus renewable energy by producing hydrogen through electrolysis and blending it into natural gas. However, the optimal hydrogen volume fraction (HVF) of HCNG varies with the daily fluctuations of renewable energy. Besides, given the rapid volatility of renewable energy, ensuring rapid and reliable real-time adjustments is challenging for electricity-HCNG (E-HCNG) coupling networks. To this end, this paper proposes a flexible operation framework for E-HCNG networks against the fluctuations and volatility of renewable energy. Based on operations with variable HVF, the framework develops an E-HCNG system-level affine policy, which allows real-time re-dispatch of operations according to the volatility. Meanwhile, to guarantee the operational reliability of the affine policy, a distributionally robust joint chance constraint (DRJCC) is introduced, which limits the violation probability of operational constraints under the uncertainties of renewable energy volatility. Furthermore, in the solving process, to mitigate the over-conservatism in DRJCC decomposition, an improved risk allocation method is proposed, utilizing the correlations among violations under the affine policy. Moreover, to tackle the non-convexities arising from the variable HVF, customized approximations for HCNG flow formulations are developed. The problem is finally reformulated into a mixed-integer second-order cone programming problem. The effectiveness of the proposed method is validated in both small-scale and large-scale experiments.
Flying Quadrotors in Tight Formations using Learning-based Model Predictive Control
Flying quadrotors in tight formations is a challenging problem. It is known that in the near-field airflow of a quadrotor, the aerodynamic effects induced by the propellers are complex and difficult to characterize. Although machine learning tools can potentially be used to derive models that capture these effects, these data-driven approaches can be sample inefficient and the resulting models often do not generalize as well as their first-principles counterparts. In this work, we propose a framework that combines the benefits of first-principles modeling and data-driven approaches to construct an accurate and sample efficient representation of the complex aerodynamic effects resulting from quadrotors flying in formation. The data-driven component within our model is lightweight, making it amenable for optimization-based control design. Through simulations and physical experiments, we show that incorporating the model into a novel learning-based nonlinear model predictive control (MPC) framework results in substantial performance improvements in terms of trajectory tracking and disturbance rejection. In particular, our framework significantly outperforms nominal MPC in physical experiments, achieving a 40.1% improvement in the average trajectory tracking errors and a 57.5% reduction in the maximum vertical separation errors. Our framework also achieves exceptional sample efficiency, using only a total of 46 seconds of flight data for training across both simulations and physical experiments. Furthermore, with our proposed framework, the quadrotors achieve an exceptionally tight formation, flying with an average separation of less than 1.5 body lengths throughout the flight. A video illustrating our framework and physical experiments is given here: https://youtu.be/Hv-0JiVoJGo
comment: 7 pages, 5 figures
Generalization of Compositional Tasks with Logical Specification via Implicit Planning
In this work, we study the problem of learning generalizable policies for compositional tasks given by a logic specification. These tasks are composed of temporally extended subgoals. Due to the dependencies among subgoals and the long task horizon, previous reinforcement learning (RL) algorithms, e.g., task-conditioned and goal-conditioned policies, still suffer from slow convergence and sub-optimality when solving the generalization problem of compositional tasks. To tackle these issues, this paper proposes a new hierarchical RL framework for the efficient and optimal generalization of compositional tasks. At the high level, we propose a new implicit planner designed specifically for generalizing compositional tasks. Specifically, the planner selects the next sub-task and estimates the multi-step return of completing the rest of the task from the current state. It learns a latent transition model and conducts planning in the latent space based on a graph neural network (GNN). Then, the next sub-task selected by the high level guides the low-level agent to solve long-horizon tasks efficiently, and the multi-step return makes the low-level policy consider dependencies of future sub-tasks. We conduct comprehensive experiments to show the advantage of the proposed framework over previous methods in terms of optimality and efficiency.
Integrating Reinforcement Learning and Large Language Models for Crop Production Process Management Optimization and Control through A New Knowledge-Based Deep Learning Paradigm
Efficient and sustainable crop production process management is crucial to meet the growing global demand for food, fuel, and feed while minimizing environmental impacts. Traditional crop management practices, often developed through empirical experience, face significant challenges in adapting to the dynamic nature of modern agriculture, which is influenced by factors such as climate change, soil variability, and market conditions. Recently, reinforcement learning (RL) and large language models (LLMs) have brought transformative potential, with RL providing adaptive methodologies to learn optimal strategies and LLMs offering vast, superhuman knowledge across agricultural domains, enabling informed, context-specific decision-making. This paper systematically examines how the integration of RL and LLMs into crop management decision support systems (DSSs) can drive advancements in agricultural practice. We explore recent advancements in RL and LLM algorithms, their application within crop management, and the use of crop management simulators to develop these technologies. The convergence of RL and LLMs with crop management DSSs presents new opportunities to optimize agricultural practices through data-driven, adaptive solutions that can address the uncertainties and complexities of crop production. However, this integration also brings challenges, particularly in real-world deployment. We discuss these challenges and propose potential solutions, including the use of offline RL and enhanced LLM integration, to maximize the effectiveness and sustainability of crop management. Our findings emphasize the need for continued research and innovation to unlock the full potential of these advanced tools in transforming agricultural systems into optimal and controllable ones.
comment: 13 pages
Stability and Transparency in Mixed Reality Bilateral Human Teleoperation
Recent work introduced the concept of human teleoperation (HT), where the remote robot typically considered in conventional bilateral teleoperation is replaced by a novice person wearing a mixed reality head-mounted display and tracking the motion of a virtual tool controlled by an expert. HT has advantages in cost, complexity, and patient acceptance for telemedicine in low-resource communities or remote locations. However, the stability, transparency, and performance of bilateral HT are unexplored. In this paper, we therefore develop a mathematical model and simulation of the HT system using test data. We then analyze various control architectures with this model and implement them with the HT system to find the achievable performance, investigate stability, and determine the most promising teleoperation scheme in the presence of time delays. We show that instability in HT, while not destructive or dangerous, makes the system impossible to use. However, stable and transparent teleoperation is possible with small time delays (<200 ms) through 3-channel teleoperation, or with large time delays through model-mediated teleoperation with local pose and force feedback for the novice.
Input-to-State Stable Coupled Oscillator Networks for Closed-form Model-based Control in Latent Space NeurIPS 2024
Even though a variety of methods have been proposed in the literature, efficient and effective latent-space control (i.e., control in a learned low-dimensional space) of physical systems remains an open challenge. We argue that a promising avenue is to leverage powerful and well-understood closed-form strategies from control theory literature in combination with learned dynamics, such as potential-energy shaping. We identify three fundamental shortcomings in existing latent-space models that have so far prevented this powerful combination: (i) they lack the mathematical structure of a physical system, (ii) they do not inherently conserve the stability properties of the real systems, (iii) these methods do not have an invertible mapping between input and latent-space forcing. This work proposes a novel Coupled Oscillator Network (CON) model that simultaneously tackles all these issues. More specifically, (i) we show analytically that CON is a Lagrangian system - i.e., it possesses well-defined potential and kinetic energy terms. Then, (ii) we provide formal proof of global Input-to-State stability using Lyapunov arguments. Moving to the experimental side, we demonstrate that CON reaches SoA performance when learning complex nonlinear dynamics of mechanical systems directly from images. An additional methodological innovation contributing to achieving this third goal is an approximated closed-form solution for efficient integration of network dynamics, which eases efficient training. We tackle (iii) by approximating the forcing-to-input mapping with a decoder that is trained to reconstruct the input based on the encoded latent space force. Finally, we show how these properties enable latent-space control. We use an integral-saturated PID with potential force compensation and demonstrate high-quality performance on a soft robot using raw pixels as the only feedback information.
comment: 38th Conference on Neural Information Processing Systems (NeurIPS 2024) spotlight, 49 pages
Signal Temporal Logic Control Synthesis among Uncontrollable Dynamic Agents with Conformal Prediction
The control of dynamical systems under temporal logic specifications among uncontrollable dynamic agents is challenging due to the agents' a-priori unknown behavior. Existing works have considered the problem where either all agents are controllable, the agent models are deterministic and known, or no safety guarantees are provided. We propose a predictive control synthesis framework that guarantees, with high probability, the satisfaction of signal temporal logic (STL) tasks that are defined over a controllable system in the presence of uncontrollable stochastic agents. We use trajectory predictors and conformal prediction to construct probabilistic prediction regions for each uncontrollable agent that are valid over multiple future time steps. Specifically, we construct a normalized prediction region over all agents and time steps to reduce conservatism and increase data efficiency. We then formulate a worst-case bilevel mixed integer program (MIP) that accounts for all agent realizations within the prediction region to obtain an open-loop controller that provably guarantees task satisfaction with high probability. To efficiently solve this bilevel MIP, we propose an equivalent MIP based on the KKT conditions of the original bilevel formulation. Building upon this, we design a closed-loop controller, where both recursive feasibility and task satisfaction can be guaranteed with high probability. We illustrate our control synthesis framework on two case studies.
From Optimization to Control: Quasi Policy Iteration
Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we review this analogy across four problem classes with a unified solution characterization allowing for a systematic transformation of algorithms from one domain to the other. In particular, we identify equivalent optimization and control algorithms that have already been pointed out in the existing literature, but mostly in a scattered way. With this unifying framework in mind, we adopt the quasi-Newton method from convex optimization to introduce a novel control algorithm coined as quasi-policy iteration (QPI). In particular, QPI is based on a novel approximation of the "Hessian" matrix in the policy iteration algorithm by exploiting two linear structural constraints specific to MDPs and by allowing for the incorporation of prior information on the transition probability kernel. While the proposed algorithm has the same computational complexity as value iteration, it interestingly exhibits an empirical convergence behavior similar to policy iteration with a very low sensitivity to the discount factor.
TSViT: A Time Series Vision Transformer for Fault Diagnosis
Traditional fault diagnosis methods using Convolutional Neural Networks (CNNs) often struggle with capturing the temporal dynamics of vibration signals. To overcome this, the application of Transformer-based Vision Transformer (ViT) methods to fault diagnosis is gaining traction. Nonetheless, these methods typically require extensive preprocessing, which increases computational complexity, potentially reducing the efficiency of the diagnosis process. Addressing this gap, this paper presents the Time Series Vision Transformer (TSViT), tailored for effective fault diagnosis. TSViT incorporates a convolutional layer to extract local features from vibration signals, alongside a transformer encoder to discern long-term temporal patterns. A thorough experimental comparison on three diverse datasets demonstrates TSViT's effectiveness and adaptability. Moreover, the paper delves into the influence of hyperparameter tuning on the model's performance, computational demand, and parameter count. Remarkably, TSViT achieves an unprecedented 100% average accuracy on two test sets and 99.99% on another, showcasing its exceptional diagnostic capabilities.
A Parameter Privacy-Preserving Strategy for Mixed-Autonomy Platoon Control
It has been demonstrated that leading cruise control (LCC) can improve the operation of mixed-autonomy platoons by allowing connected and automated vehicles (CAVs) to make longitudinal control decisions based on the information provided by surrounding vehicles. However, LCC generally requires surrounding human-driven vehicles (HDVs) to share their real-time states, which can be used by adversaries to infer drivers' car-following behavior, potentially leading to financial losses or safety concerns. This paper aims to address such privacy concerns and protect the behavioral characteristics of HDVs by devising a parameter privacy-preserving approach for mixed-autonomy platoon control. First, we integrate a parameter privacy filter into LCC to protect sensitive car-following parameters. The privacy filter allows each vehicle to generate seemingly realistic pseudo states by distorting the true parameters to pseudo parameters, which can protect drivers' privacy in behavioral parameters without significantly influencing the control performance. Second, to enhance the reliability and practicality of the privacy filter within LCC, we first introduce an individual-level parameter privacy preservation constraint to the privacy filter, focusing on the privacy level of each individual parameter pair. Subsequently, we extend the current approach to accommodate continuous parameter spaces through a neural network estimator. Third, analysis of head-to-tail string stability reveals the potential impact of privacy filters in degrading mixed traffic flow performance. Simulation shows that this approach can effectively trade off privacy and control performance in LCC. We further demonstrate the benefit of such an approach in networked systems, i.e., by applying the privacy filter to a preceding vehicle, one can also achieve a certain level of privacy for the following vehicle.
Towards a Deeper Understanding of Transformer for Residential Non-intrusive Load Monitoring
Transformer models have demonstrated impressive performance in Non-Intrusive Load Monitoring (NILM) applications in recent years. Despite their success, existing studies have not thoroughly examined the impact of various hyper-parameters on model performance, which is crucial for advancing high-performing transformer models. In this work, a comprehensive series of experiments has been conducted to analyze the influence of these hyper-parameters in the context of residential NILM. This study delves into the effects of the number of hidden dimensions in the attention layer, the number of attention layers, the number of attention heads, and the dropout ratio on transformer performance. Furthermore, the role of the masking ratio has been explored in BERT-style transformer training, providing a detailed investigation into its impact on NILM tasks. Based on these experiments, the optimal hyper-parameters have been selected and used to train a transformer model, which surpasses the performance of existing models. The experimental findings offer valuable insights and guidelines for optimizing transformer architectures, aiming to enhance their effectiveness and efficiency in NILM applications. It is expected that this work will serve as a foundation for future research and development of more robust and capable transformer models for NILM.
comment: Accepted to 4th IEEE-ICISET
Robotics
Geometric Optimal Control of Mechanical Systems with Gravitational and Resistive Force ICRA
Optimal control plays a crucial role in numerous mechanical and robotic applications. Broadly, optimal control methods are divided into direct methods (which optimize trajectories directly via discretization) and indirect methods (which transform optimality conditions into equations that guarantee optimal trajectories). While direct methods could mask geometric insights into system dynamics due to discretization, indirect methods offer a deeper understanding of the system's geometry. In this paper, we propose a geometric framework for understanding optimal control in mechanical systems, focusing on the combined effects of inertia, drag, and gravitational forces. By modeling mechanical systems as configuration manifolds equipped with kinetic and drag metrics, alongside a potential field, we explore how these factors influence trajectory optimization. We derive optimal control equations incorporating these effects and apply them to two-link and UR5 robotic manipulators, demonstrating how manifold curvature and resistive forces shape optimal trajectories. This work offers a comprehensive geometric approach to optimal control, with broad applications to robotic systems.
comment: 6 pages, submitted to The International Conference on Robotics and Automation (ICRA)
A Collaborative Team of UAV-Hexapod for an Autonomous Retrieval System in GNSS-Denied Maritime Environments
We present an integrated UAV-hexapod robotic system designed for GNSS-denied maritime operations, capable of autonomous deployment and retrieval of a hexapod robot via a winch mechanism installed on a UAV. This system is intended to address the challenges of localization, control, and mobility in dynamic maritime environments. Our solution leverages sensor fusion techniques, combining optical flow, LiDAR, and depth data for precise localization. Experimental results demonstrate the effectiveness of this system in real-world scenarios, validating its performance during field tests in both controlled and operational conditions in the MBZIRC 2023 Maritime Challenge.
EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment
Embodied artificial intelligence emphasizes the role of an agent's body in generating human-like behaviors. Recent efforts in EmbodiedAI focus on building machine learning models with perception, planning, and acting abilities, thereby enabling real-time interaction with the world. However, most works focus on bounded indoor environments, such as navigation in a room or manipulating a device, with limited exploration of embodying the agents in open-world scenarios. That is, embodied intelligence in open, outdoor environments is less explored, for which one potential reason is the lack of high-quality simulators, benchmarks, and datasets. To address this, we construct a benchmark platform for embodied intelligence evaluation in real-world city environments. Specifically, we first construct a highly realistic 3D simulation environment based on the real buildings, roads, and other elements in a real city. In this environment, we combine historically collected data and simulation algorithms to conduct simulations of pedestrian and vehicle flows with high fidelity. Further, we design a set of evaluation tasks covering different EmbodiedAI abilities. Moreover, we provide a complete set of input and output interfaces for access, enabling embodied agents to easily take task requirements and current environmental observations as input and then make decisions and obtain performance evaluations. On the one hand, the platform expands the capability of existing embodied intelligence to higher levels. On the other hand, it has a higher practical value in the real world and can support more potential applications for artificial general intelligence. Based on this platform, we evaluate some popular large language models for embodied intelligence capabilities of different dimensions and difficulties.
comment: All of the software, Python library, codes, datasets, tutorials, and real-time online service are available on this website: https://embodied-city.fiblab.net
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
This work introduces Transformer-based Off-Policy Episodic Reinforcement Learning (TOP-ERL), a novel algorithm that enables off-policy updates in the ERL framework. In ERL, policies predict entire action trajectories over multiple time steps instead of single actions at every time step. These trajectories are typically parameterized by trajectory generators such as Movement Primitives (MP), allowing for smooth and efficient exploration over long horizons while capturing high-level temporal correlations. However, ERL methods are often constrained to on-policy frameworks due to the difficulty of evaluating state-action values for entire action sequences, limiting their sample efficiency and preventing the use of more efficient off-policy architectures. TOP-ERL addresses this shortcoming by segmenting long action sequences and estimating the state-action values for each segment using a transformer-based critic architecture alongside an n-step return estimation. These contributions result in efficient and stable training that is reflected in the empirical results conducted on sophisticated robot learning environments. TOP-ERL significantly outperforms state-of-the-art RL methods. Thorough ablation studies additionally show the impact of key design choices on the model performance.
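The n-step return mentioned in the abstract above can be illustrated with a minimal sketch. This is the generic textbook estimator, not TOP-ERL's actual critic target; the function name `n_step_return` and its parameters are hypothetical, and the bootstrap value is assumed to come from some value estimate at the state reached after the n steps.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], bootstrap_value: float, gamma: float) -> float:
    """Generic n-step return: discounted sum of n rewards plus a discounted bootstrap.

    G = sum_{k=0}^{n-1} gamma^k * r_k + gamma^n * V(s_n)

    `bootstrap_value` stands in for V(s_n); how that value is produced
    (for example, by a learned critic) is outside this sketch.
    """
    n = len(rewards)
    discounted_rewards = 0.0
    for k, reward in enumerate(rewards):
        discounted_rewards += (gamma ** k) * reward
    return discounted_rewards + (gamma ** n) * bootstrap_value


if __name__ == "__main__":
    # Three rewards of 1.0, discount 0.5, bootstrap value 10.0:
    # 1 + 0.5 + 0.25 + 0.125 * 10 = 3.0
    print(n_step_return([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5))
```

With an empty reward list the function simply returns the bootstrap value, which corresponds to a zero-step lookahead.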
Towards Design and Development of a Low-Cost Unmanned Surface Vehicle for Aquaculture Water Quality Monitoring in Shallow Water Environments
Unmanned surface vessels (USVs) are typically autonomous or remotely operated and are specifically designed for environmental monitoring in various aquatic environments. Aquaculture requires constant monitoring and management of water quality for the health and productivity of aquaculture systems. Poor water quality can lead to disease outbreaks, reduced growth rates, and even mass mortality of cultured species. Many small aquaculture operations operate on tight budgets and in shallow water environments such as inland ponds, coastal lagoons, estuaries, and shallow rivers, particularly in developing regions. This leads to the foremost manoeuvrability challenge, underscoring the crucial need for agile, cost-effective USVs as efficient monitoring systems. The paper proposes a low-cost, 3D-printed, twin-hull, catamaran-style platform equipped with an Inertial Measurement Unit (IMU) and a Global Navigation Satellite System (GNSS), with a two-layered control framework and a differential drive configuration developed using two high-efficiency T200 thrusters. The design utilizes the Robot Operating System (ROS) to create the control framework and incorporates Extended Kalman Filter (EKF)-based sensor fusion techniques for localisation. The paper evaluates the USV's autonomy through open-water captive model experiments, employing remote control methods to assess the vessel's manoeuvrability and overall performance in shallow water conditions.
The Indirect Method for Generating Libraries of Optimal Periodic Trajectories and Its Application to Economical Bipedal Walking
Trajectory optimization is an essential tool for generating efficient and dynamically consistent gaits in legged locomotion. This paper explores the indirect method of trajectory optimization, emphasizing its application in creating optimal periodic gaits for legged systems and contrasting it with the more commonly used direct method. While the direct method provides considerable flexibility in its implementation, it is limited by its input space parameterization. In contrast, the indirect method improves accuracy by defining control inputs as functions of the system's states and costates. We tackle the convergence challenges associated with indirect shooting methods, particularly through the systematic development of gait libraries by utilizing numerical continuation methods. Our contributions include: (1) the formalization of a general periodic trajectory optimization problem that extends existing first-order necessary conditions for a broader range of cost functions and operating conditions; (2) a methodology for efficiently generating libraries of optimal trajectories (gaits) utilizing a single shooting approach combined with numerical continuation methods, including a novel approach for reconstructing Lagrange multipliers and costates from passive gaits; and (3) a comparative analysis of the indirect and direct shooting methods using a compass-gait walker as a case study, demonstrating the former's superior accuracy in generating optimal gaits. The findings underscore the potential of the indirect method for generating families of optimal gaits, thereby advancing the field of trajectory optimization in legged robotics.
comment: submitted to the International Journal of Robotics Research (IJRR)
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
Reinforcement learning (RL) is ubiquitous in the development of modern AI systems. However, state-of-the-art RL agents require extensive, and potentially unsafe, interactions with their environments to learn effectively. These limitations confine RL agents to simulated environments, hindering their ability to learn directly in real-world settings. In this work, we present ActSafe, a novel model-based RL algorithm for safe and efficient exploration. ActSafe learns a well-calibrated probabilistic model of the system and plans optimistically w.r.t. the epistemic uncertainty about the unknown dynamics, while enforcing pessimism w.r.t. the safety constraints. Under regularity assumptions on the constraints and dynamics, we show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time. In addition, we propose a practical variant of ActSafe that builds on latest model-based RL advancements and enables safe exploration even in high-dimensional settings such as visual control. We empirically show that ActSafe obtains state-of-the-art performance in difficult exploration tasks on standard safe deep RL benchmarks while ensuring safety during learning.
An Expeditious Spatial Mean Radiant Temperature Mapping Framework using Visual SLAM and Semantic Segmentation
Ensuring thermal comfort is essential for the well-being and productivity of individuals in built environments. Of the various thermal comfort indicators, the mean radiant temperature (MRT) is very challenging to measure. Most common measurement methodologies are time-consuming and not user-friendly. To address this issue, this paper proposes a novel MRT measurement framework that uses visual simultaneous localization and mapping (SLAM) and semantic segmentation techniques. The proposed approach follows the rule of thumb of the traditional MRT calculation method using surface temperature and view factors. However, it employs visual SLAM and creates a 3D thermal point cloud with enriched surface temperature information. The framework then implements Grounded SAM, a new object detection and segmentation tool to extract features with distinct temperature profiles on building surfaces. The detailed segmentation of thermal features not only reduces potential errors in the calculation of the MRT but also provides an efficient reconstruction of the spatial MRT distribution in the indoor environment. We also validate the calculation results with the reference measurement methodology. This data-driven framework offers faster and more efficient MRT measurements and spatial mapping than conventional methods. It can enable the direct engagement of researchers and practitioners in MRT measurements and contribute to research on thermal comfort and radiant cooling and heating systems.
comment: Accepted by 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshop
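The traditional MRT calculation from surface temperatures and view factors, which the abstract above takes as its starting point, can be sketched as follows. This is the standard fourth-power weighted mean, assuming the view factors from the measurement point to the enclosing surfaces sum to one; it is not the paper's SLAM-based pipeline, and the function name and inputs are hypothetical.

```python
from typing import Sequence

KELVIN_OFFSET = 273.15


def mean_radiant_temperature(surface_temps_c: Sequence[float],
                             view_factors: Sequence[float]) -> float:
    """Mean radiant temperature in degrees Celsius.

    Uses T_mrt^4 = sum_i F_i * T_i^4 with temperatures in kelvin,
    where F_i is the view factor to surface i (assumed to sum to 1).
    """
    if len(surface_temps_c) != len(view_factors):
        raise ValueError("need one view factor per surface temperature")
    if abs(sum(view_factors) - 1.0) > 1e-6:
        raise ValueError("view factors are assumed to sum to 1")
    fourth_power_sum = sum(
        factor * (temp_c + KELVIN_OFFSET) ** 4
        for temp_c, factor in zip(surface_temps_c, view_factors)
    )
    return fourth_power_sum ** 0.25 - KELVIN_OFFSET


if __name__ == "__main__":
    # Illustrative values only: a warm wall seen with view factor 0.3
    # and cooler surroundings with view factor 0.7.
    print(mean_radiant_temperature([30.0, 20.0], [0.3, 0.7]))
```

Because the weighting is on the fourth power of absolute temperature, the result sits slightly above the plain view-factor-weighted average of the surface temperatures.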
REGNet V2: End-to-End REgion-based Grasp Detection Network for Grippers of Different Sizes in Point Clouds
Grasping has been a crucial but challenging problem in robotics for many years. One of the most important challenges is how to make grasping generalizable and robust to novel objects as well as grippers in unstructured environments. We present REGNet V2, a robotic grasping system that can adapt to different parallel jaws to grasp diversified objects. To support different grippers, REGNet V2 embeds the gripper parameters into point clouds, based on which it predicts suitable grasp configurations. It includes three components: Score Network (SN), Grasp Region Network (GRN), and Refine Network (RN). In the first stage, SN is used to filter suitable points for grasping by grasp confidence scores. In the second stage, based on the selected points, GRN generates a set of grasp proposals. Finally, RN refines the grasp proposals for more accurate and robust predictions. We devise an analytic policy to choose the optimal grasp to be executed from the predicted grasp set. To train REGNet V2, we construct a large-scale grasp dataset containing collision-free grasp configurations using different parallel-jaw grippers. The experimental results demonstrate that REGNet V2 with the analytic policy achieves the highest success rate of 74.98% in real-world clutter scenes with 20 objects, significantly outperforming several state-of-the-art methods, including GPD, PointNetGPD, and S4G. The code and dataset are available at https://github.com/zhaobinglei/REGNet-V2.
ESVO2: Direct Visual-Inertial Odometry with Stereo Event Cameras
Event-based visual odometry is a specific branch of visual Simultaneous Localization and Mapping (SLAM) techniques, which aims at solving tracking and mapping sub-problems in parallel by exploiting the special working principles of neuromorphic (i.e., event-based) cameras. Due to the motion-dependent nature of event data, explicit data association (i.e., feature matching) under large-baseline view-point changes is hardly established, making direct methods a more rational choice. However, state-of-the-art direct methods are limited by the high computational complexity of the mapping sub-problem and the degeneracy of camera pose tracking in certain degrees of freedom (DoF) in rotation. In this paper, we resolve these issues by building an event-based stereo visual-inertial odometry system on top of our previous direct pipeline Event-based Stereo Visual Odometry. Specifically, to speed up the mapping operation, we propose an efficient strategy for sampling contour points according to the local dynamics of events. The mapping performance is also improved in terms of structure completeness and local smoothness by merging the temporal stereo and static stereo results. To circumvent the degeneracy of camera pose tracking in recovering the pitch and yaw components of general six-DoF motion, we introduce IMU measurements as motion priors via pre-integration. To this end, a compact back-end is proposed for continuously updating the IMU bias and predicting the linear velocity, enabling an accurate motion prediction for camera pose tracking. The resulting system scales well with modern high-resolution event cameras and leads to better global positioning accuracy in large-scale outdoor environments. Extensive evaluations on five publicly available datasets featuring different resolutions and scenarios justify the superior performance of the proposed system against five state-of-the-art methods.
A Novel Multi-Gait Strategy for Stable and Efficient Quadruped Robot Locomotion
Taking inspiration from the natural gait transition mechanism of quadrupeds, devising a good gait transition strategy is important for quadruped robots to achieve energy-efficient locomotion on various terrains and velocities. While previous studies have recognized that gait patterns linked to velocities impact two key factors, the Cost of Transport (CoT) and the stability of robot locomotion, only a limited number of studies have effectively combined these factors to design a mechanism that ensures both efficiency and stability in quadruped robot locomotion. In this paper, we propose a multi-gait selection and transition strategy to achieve stable and efficient locomotion across different terrains. Our strategy starts by establishing a gait mapping considering both CoT and locomotion stability to guide the gait selection process during locomotion. Then, we achieve gait switching in time by introducing affine transformations for gait parameters and a designed finite state machine to build the switching order. Comprehensive experiments have been conducted on using our strategy with changing terrains and velocities, and the results indicate that our proposed strategy outperforms baseline methods in achieving simultaneous efficiency in locomotion by considering CoT and stability.
Adaptive Compliance Policy: Learning Approximate Compliance for Diffusion Guided Control
Compliance plays a crucial role in manipulation, as it balances between the concurrent control of position and force under uncertainties. Yet compliance is often overlooked by today's visuomotor policies that solely focus on position control. This paper introduces Adaptive Compliance Policy (ACP), a novel framework that learns to dynamically adjust system compliance both spatially and temporally for given manipulation tasks from human demonstrations, improving upon previous approaches that rely on pre-selected compliance parameters or assume uniform constant stiffness. However, computing full compliance parameters from human demonstrations is an ill-defined problem. Instead, we estimate an approximate compliance profile with two useful properties: avoiding large contact forces and encouraging accurate tracking. Our approach enables robots to handle complex contact-rich manipulation tasks and achieves over 50% performance improvement compared to state-of-the-art visuomotor policy methods. For result videos, see https://adaptive-compliance.github.io/
ACDC: Automated Creation of Digital Cousins for Robust Policy Learning
Training robot policies in the real world can be unsafe, costly, and difficult to scale. Simulation serves as an inexpensive and potentially limitless source of training data, but suffers from the semantics and physics disparity between simulated and real-world environments. These discrepancies can be minimized by training in digital twins, which serve as virtual replicas of a real scene but are expensive to generate and cannot produce cross-domain generalization. To address these limitations, we propose the concept of digital cousins, a virtual asset or scene that, unlike a digital twin, does not explicitly model a real-world counterpart but still exhibits similar geometric and semantic affordances. As a result, digital cousins simultaneously reduce the cost of generating an analogous virtual environment while also facilitating better robustness during sim-to-real domain transfer by providing a distribution of similar training scenes. Leveraging digital cousins, we introduce a novel method for the Automated Creation of Digital Cousins (ACDC), and propose a fully automated real-to-sim-to-real pipeline for generating fully interactive scenes and training robot policies that can be deployed zero-shot in the original scene. We find that ACDC can produce digital cousin scenes that preserve geometric and semantic affordances, and can be used to train policies that outperform policies trained on digital twins, achieving 90% vs. 25% under zero-shot sim-to-real transfer. Additional details are available at https://digital-cousins.github.io/.
comment: CoRL 2024
Autonomous Driving in Unstructured Environments: How Far Have We Come?
Research on autonomous driving in unstructured outdoor environments is less advanced than in structured urban settings due to challenges like environmental diversity and scene complexity. These environments, such as rural areas and rugged terrains, pose unique obstacles that are not common in structured urban areas. Despite these difficulties, autonomous driving in unstructured outdoor environments is crucial for applications in agriculture, mining, and military operations. Our survey reviews over 250 papers for autonomous driving in unstructured outdoor environments, covering offline mapping, pose estimation, environmental perception, path planning, end-to-end autonomous driving, datasets, and relevant challenges. We also discuss emerging trends and future research directions. This review aims to consolidate knowledge and encourage further research for autonomous driving in unstructured environments. To support ongoing work, we maintain an active repository with up-to-date literature and open-source projects at: https://github.com/chaytonmin/Survey-Autonomous-Driving-in-Unstructured-Environments.
comment: Survey paper; 38 pages
Distributed Optimization Methods for Multi-Robot Systems: Part II -- A Survey
Although the field of distributed optimization is well-developed, relevant literature focused on the application of distributed optimization to multi-robot problems is limited. This survey constitutes the second part of a two-part series on distributed optimization applied to multi-robot problems. In this paper, we survey three main classes of distributed optimization algorithms -- distributed first-order methods, distributed sequential convex programming methods, and alternating direction method of multipliers (ADMM) methods -- focusing on fully-distributed methods that do not require coordination or computation by a central computer. We describe the fundamental structure of each category and note important variations around this structure, designed to address its associated drawbacks. Further, we provide practical implications of noteworthy assumptions made by distributed optimization algorithms, noting the classes of robotics problems suitable for these algorithms. Moreover, we identify important open research challenges in distributed optimization, specifically for robotics problems.
comment: arXiv admin note: substantial text overlap with arXiv:2103.12840
Distributed Optimization Methods for Multi-Robot Systems: Part I -- A Tutorial
Distributed optimization provides a framework for deriving distributed algorithms for a variety of multi-robot problems. This tutorial constitutes the first part of a two-part series on distributed optimization applied to multi-robot problems, which seeks to advance the application of distributed optimization in robotics. In this tutorial, we demonstrate that many canonical multi-robot problems can be cast within the distributed optimization framework, such as multi-robot simultaneous localization and mapping (SLAM), multi-robot target tracking, and multi-robot task assignment problems. We identify three broad categories of distributed optimization algorithms: distributed first-order methods, distributed sequential convex programming, and the alternating direction method of multipliers (ADMM). We describe the basic structure of each category and provide representative algorithms within each category. We then work through a simulation case study of multiple drones collaboratively tracking a ground vehicle. We compare solutions to this problem using a number of different distributed optimization algorithms. In addition, we implement a distributed optimization algorithm in hardware on a network of Raspberry Pis communicating with XBee modules to illustrate robustness to the challenges of real-world communication networks.
DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects
Object navigation in unknown environments is crucial for deploying embodied agents in real-world applications. While we have witnessed huge progress due to large-scale scene datasets, faster simulators, and stronger models, previous studies mainly focus on limited scene types and target objects. In this paper, we study a new task of navigating to diverse target objects in a large number of scene types. To benchmark the problem, we present a large-scale scene dataset, DivScene, which contains 4,614 scenes across 81 different types. With the dataset, we build an end-to-end embodied agent, NatVLM, by fine-tuning a Large Vision Language Model (LVLM) through imitation learning. The LVLM is trained to take previous observations from the environment and generate the next actions. We also introduce CoT explanation traces of the action prediction for better performance when tuning LVLMs. Our extensive experiments find that we can build a performant LVLM-based agent through imitation learning on the shortest paths constructed by a BFS planner without any human supervision. Our agent achieves a success rate that surpasses GPT-4o by over 20%. Meanwhile, we carry out various analyses showing the generalization ability of our agent. Our code and data are available at https://github.com/zhaowei-wang-nlp/DivScene.
comment: Work in Progress
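The abstract above mentions shortest paths constructed by a BFS planner as the source of imitation data. A minimal sketch of breadth-first search on a 4-connected occupancy grid is given below; the grid encoding (0 for free, 1 for blocked), the function name, and the coordinate convention are assumptions for illustration and are not taken from the DivScene code.

```python
from collections import deque
from typing import List, Optional, Tuple

Cell = Tuple[int, int]


def bfs_shortest_path(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Return a shortest path of (row, col) cells from start to goal, or None.

    Cells with value 0 are free and cells with value 1 are blocked.
    Moves are restricted to the four axis-aligned neighbours.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        row, col = cell
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (row + d_row, col + d_col)
            in_bounds = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if in_bounds and grid[nxt[0]][nxt[1]] == 0 and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


if __name__ == "__main__":
    demo_grid = [
        [0, 0, 0],
        [1, 1, 0],
        [0, 0, 0],
    ]
    print(bfs_shortest_path(demo_grid, (0, 0), (2, 0)))
```

Because BFS expands cells in order of distance from the start, the first time the goal is dequeued the recovered path has the minimum number of moves, which is what makes such paths usable as supervision without human labels.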
Relevance for Human Robot Collaboration
Effective human-robot collaboration (HRC) requires the robots to possess human-like intelligence. Inspired by the human's cognitive ability to selectively process and filter elements in complex environments, this paper introduces a novel concept and scene-understanding approach termed "relevance." It identifies relevant components in a scene. To accurately and efficiently quantify relevance, we developed an event-based framework that selectively triggers relevance determination, along with a probabilistic methodology built on a structured scene representation. Simulation results demonstrate that the relevance framework and methodology accurately predict the relevance of a general HRC setup, achieving a precision of 0.99 and a recall of 0.94. Relevance can be broadly applied to several areas in HRC to improve task planning time by 79.56% compared with pure planning for a cereal task, reduce perception latency by up to 26.53% for an object detector, improve HRC safety by up to 13.50% and reduce the number of inquiries for HRC by 80.84%. A real-world demonstration showcases the relevance framework's ability to intelligently assist humans in everyday tasks.
One-Shot Imitation under Mismatched Execution
Human demonstrations as prompts are a powerful way to program robots to do long-horizon manipulation tasks. However, translating these demonstrations into robot-executable actions presents significant challenges due to execution mismatches in movement styles and physical capabilities. Existing methods either depend on robot-demonstrator paired data, which is infeasible to scale, or rely too heavily on frame-level visual similarities that often break down in practice. To address these challenges, we propose RHyME, a novel framework that automatically aligns robot and demonstrator task executions using optimal transport costs. Given long-horizon robot demonstrations, RHyME synthesizes semantically equivalent demonstrator videos by retrieving and composing short-horizon demonstrator clips. This approach facilitates effective policy training without the need for paired data. We demonstrate that RHyME outperforms a range of baselines across cross-embodiment datasets, showing a 52% increase in task recall over prior cross-embodiment learning methods. We release our code and datasets at https://portal-cornell.github.io/rhyme/.
Modeling and In-flight Torso Attitude Stabilization of a Jumping Quadruped
This paper addresses the modeling and attitude control of jumping quadrupeds in low-gravity environments. First, a convex decomposition procedure is presented to generate high-accuracy and low-cost collision geometries for quadrupeds performing agile maneuvers. A hierarchical control architecture is then investigated, separating torso orientation tracking from the generation of suitable, collision-free, corresponding leg motions. Nonlinear Model Predictive Controllers (NMPCs) are utilized in both layers of the controller. To compute the necessary leg motions, a torque allocation strategy is employed that leverages the symmetries of the system to avoid self-collisions and simplify the respective NMPC. To plan periodic trajectories online, a Finite State Machine (FSM)-based weight switching strategy is also used. The proposed controller is first evaluated in simulation, where 90 degree rotations in roll, pitch, and yaw are stabilized in 6.3, 2.4, and 5.5 seconds, respectively. The performance of the controller is further experimentally demonstrated by stabilizing constant and changing orientation references. Overall, this work provides a framework for the development of advanced model-based attitude controllers for jumping legged systems.
comment: 16 pages, 10 figures, to appear at the International Symposium of Robotics Research (ISRR) 2024. Paper site: https://michalispapadakis.github.io/mpc_olympus/
Motion Manifold Flow Primitives for Language-Guided Trajectory Generation
Developing text-based robot trajectory generation models is made particularly difficult by the small dataset size, high dimensionality of the trajectory space, and the inherent complexity of the text-conditional motion distribution. Recent manifold learning-based methods have partially addressed the dimensionality and dataset size issues, but struggle with the complex text-conditional distribution. In this paper we propose a text-based trajectory generation model that attempts to address all three challenges while relying on only a handful of demonstration trajectory data. Our key idea is to leverage recent flow-based models capable of capturing complex conditional distributions, not directly in the high-dimensional trajectory space, but rather in the low-dimensional latent coordinate space of the motion manifold, with deliberately designed regularization terms to ensure smoothness of motions and robustness to text variations. We show that our Motion Manifold Flow Primitive (MMFP) framework can accurately generate qualitatively distinct motions for a wide range of text inputs, significantly outperforming existing methods.
comment: 12 pages, 15 figures, under review
Hybrid Feedback for Three-dimensional Convex Obstacle Avoidance (Extended version)
We propose a hybrid feedback control scheme for the autonomous robot navigation problem in three-dimensional environments with arbitrarily-shaped convex obstacles. The proposed hybrid control strategy, which consists in switching between the move-to-target mode and the obstacle-avoidance mode, guarantees global asymptotic stability of the target location in the obstacle-free workspace. We also provide a procedure for the implementation of the proposed hybrid controller in a priori unknown environments and validate its effectiveness through simulation results.
comment: 11 pages, 3 figures
Autoregressive Action Sequence Learning for Robotic Manipulation
Autoregressive models have demonstrated remarkable success in natural language processing. In this work, we design a simple yet effective autoregressive architecture for robotic manipulation tasks. We propose the Chunking Causal Transformer (CCT), which extends the next-single-token prediction of causal transformers to support multi-token prediction in a single pass. Further, we design a novel attention interleaving strategy that allows CCT to be trained efficiently with teacher-forcing. Based on CCT, we propose the Autoregressive Policy (ARP) model, which learns to generate action sequences autoregressively. We find that action sequence learning enables better leverage of the underlying causal relationships in robotic tasks. We evaluate ARP across diverse robotic manipulation environments, including Push-T, ALOHA, and RLBench, and show that it outperforms the state-of-the-art methods in all tested environments, while being more efficient in computation and parameter sizes. Video demonstrations, our source code, and the models of ARP can be found at http://github.com/mlzxy/arp.
Long-Term Human Trajectory Prediction using 3D Dynamic Scene Graphs SP
We present a novel approach for long-term human trajectory prediction in indoor human-centric environments, which is essential for long-horizon robot planning in these environments. State-of-the-art human trajectory prediction methods are limited by their focus on collision avoidance and short-term planning, and their inability to model complex interactions of humans with the environment. In contrast, our approach overcomes these limitations by predicting sequences of human interactions with the environment and using this information to guide trajectory predictions over a horizon of up to 60s. We leverage Large Language Models (LLMs) to predict interactions with the environment by conditioning the LLM prediction on rich contextual information about the scene. This information is given as a 3D Dynamic Scene Graph that encodes the geometry, semantics, and traversability of the environment into a hierarchical representation. We then ground these interaction sequences into multi-modal spatio-temporal distributions over human positions using a probabilistic approach based on continuous-time Markov Chains. To evaluate our approach, we introduce a new semi-synthetic dataset of long-term human trajectories in complex indoor environments, which also includes annotations of human-object interactions. We show in thorough experimental evaluations that our approach achieves a 54% lower average negative log-likelihood and a 26.5% lower Best-of-20 displacement error compared to the best non-privileged (i.e., evaluated in a zero-shot fashion on the dataset) baselines for a time horizon of 60s.
comment: 8 pages, 6 figures. Code released at: https://github.com/MIT-SPARK/LP2
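The grounding step in the abstract above relies on continuous-time Markov chains. As a rough illustration only (not the authors' implementation), the sketch below samples one path of a small continuous-time Markov chain over interaction locations. The state names, the transition rates, and the function name `sample_ctmc_path` are hypothetical and chosen purely for the example.

```python
import random

# Hypothetical transition rates (per second) between interaction locations.
# rates[s][n] is the rate of jumping from state s to state n.
RATES = {
    "desk": {"kitchen": 0.05, "sofa": 0.02},
    "kitchen": {"desk": 0.10, "sofa": 0.05},
    "sofa": {"desk": 0.03},
}


def sample_ctmc_path(rates, start, horizon, rng):
    """Sample one path of a continuous-time Markov chain up to `horizon` seconds.

    Returns a list of (jump_time, state) pairs, starting with (0.0, start).
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        out = rates.get(state, {})
        total = sum(out.values())
        if total <= 0.0:
            break  # absorbing state: no outgoing transitions
        # Holding time in the current state is exponential with the total exit rate.
        t += rng.expovariate(total)
        if t >= horizon:
            break
        # Pick the next state with probability proportional to its rate.
        r = rng.random() * total
        acc = 0.0
        for nxt, rate in out.items():
            acc += rate
            if r < acc:
                break
        state = nxt
        path.append((t, state))
    return path


if __name__ == "__main__":
    for time_s, location in sample_ctmc_path(RATES, "desk", 60.0, random.Random(0)):
        print(f"{time_s:6.2f} s  {location}")
```

Repeating the sampling with different seeds gives an empirical distribution over where the person is at each time within the 60 s horizon.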
DROP: Dexterous Reorientation via Online Planning ICRA 2025
Achieving human-like dexterity is a longstanding challenge in robotics, in part due to the complexity of planning and control for contact-rich systems. In reinforcement learning (RL), one popular approach has been to use massively-parallelized, domain-randomized simulations to learn a policy offline over a vast array of contact conditions, allowing robust sim-to-real transfer. Inspired by recent advances in real-time parallel simulation, this work considers instead the viability of online planning methods for contact-rich manipulation by studying the well-known in-hand cube reorientation task. We propose a simple architecture that employs a sampling-based predictive controller and vision-based pose estimator to search for contact-rich control actions online. We conduct thorough experiments to assess the real-world performance of our method, architectural design choices, and key factors for robustness, demonstrating that our simple sampling-based approach achieves performance comparable to prior RL-based works. Supplemental material: https://caltech-amber.github.io/drop.
comment: Extended version, updated appendix. Submitted to ICRA 2025
Bootstrapping Object-level Planning with Large Language Models
We introduce a new method that extracts knowledge from a large language model (LLM) to produce object-level plans, which describe high-level changes to object state, and uses them to bootstrap task and motion planning (TAMP) in a hierarchical manner. Existing works use LLMs to either directly output task plans or to generate goals in representations like PDDL. However, these methods fall short because they either rely on the LLM to do the actual planning or output a hard-to-satisfy goal. Our approach instead extracts knowledge from an LLM in the form of plan schemas as an object-level representation called functional object-oriented networks (FOON), from which we automatically generate PDDL subgoals. Our experiments demonstrate how our method's performance markedly exceeds alternative planning strategies across several tasks in simulation.
comment: 11 pages (6 pages + 1 page references + 4 pages appendix)
The Brain-Inspired Cooperative Shared Control Framework for Brain-Machine Interface
In brain-machine interface (BMI) applications, a key challenge is the low information content and high noise level in neural signals, severely affecting stable robotic control. To address this challenge, we propose a cooperative shared control framework based on brain-inspired intelligence, where control signals are decoded from neural activity, and the robot handles the fine control. This allows for a combination of flexible and adaptive interaction control between the robot and the brain, making intricate human-robot collaboration feasible. The proposed framework utilizes spiking neural networks (SNNs) for controlling the robotic arm and wheels, including speed and steering. While full integration of the system remains a future goal, individual modules for robotic arm control, object tracking, and map generation have been successfully implemented. The framework is expected to significantly enhance the performance of BMI. In practical settings, the BMI with cooperative shared control, utilizing a brain-inspired algorithm, will greatly enhance the potential for clinical applications.
comment: The content of this article needs to be updated
Systems and Control (CS)
Optimal Inferential Control of Convolutional Neural Networks
Convolutional neural networks (CNNs) have achieved remarkable success in representing and simulating complex spatio-temporal dynamic systems within the burgeoning field of scientific machine learning. However, optimal control of CNNs poses a formidable challenge, because the ultra-high dimensionality and strong nonlinearity inherent in CNNs render them resistant to traditional gradient-based optimal control techniques. To tackle the challenge, we propose an optimal inferential control framework for CNNs that represent a complex spatio-temporal system, which sequentially infers the best control decisions based on the specified control objectives. This reformulation opens up the utilization of sequential Monte Carlo sampling, which is efficient in searching through high-dimensional spaces for nonlinear inference. We specifically leverage ensemble Kalman smoothing, a sequential Monte Carlo algorithm, to take advantage of its computational efficiency for nonlinear high-dimensional systems. Further, to harness graphics processing units (GPUs) to accelerate the computation, we develop a new sequential ensemble Kalman smoother based on matrix variate distributions. The smoother is capable of directly handling matrix-based inputs and outputs of CNNs without vectorization to fit with the parallelized computing architecture of GPUs. Numerical experiments show that the proposed approach is effective in controlling spatio-temporal systems with high-dimensional state and control spaces. All the code and data are available at https://github.com/Alivaziri/Optimal-Inferential-Control-of-CNNs.
LSTM-Based Proactive Congestion Management for Internet of Vehicle Networks
Vehicle-to-everything (V2X) networks support a variety of safety, entertainment, and commercial applications. This is realized by applying the principles of the Internet of Vehicles (IoV) to facilitate connectivity among vehicles and between vehicles and roadside units (RSUs). Network congestion management is essential for IoV networks because of its impact on the efficiency of transportation systems and on reliable communication among vehicles for the timely delivery of safety-critical packets. This paper introduces a framework for proactive congestion management for IoV networks. We generate congestion scenarios and a dataset to predict the congestion using LSTM. We present the framework and the packet congestion dataset. Simulation results using SUMO with NS3 demonstrate the effectiveness of the framework for forecasting IoV network congestion and clustering/prioritizing packets employing recurrent neural networks.
comment: This article has been accepted for publication in the IEEE VTC Fall 2024
Advancing Experimental Platforms for UAV Communications: Insights from AERPAW'S Digital Twin
The rapid evolution of 5G and beyond has advanced space-air-terrestrial networks, with unmanned aerial vehicles (UAVs) offering enhanced coverage, flexible configurations, and cost efficiency. However, deploying UAV-based systems presents challenges including varying propagation conditions and hardware limitations. While simulators and theoretical models have been developed, real-world experimentation is critically important to validate the research. Digital twins, virtual replicas of physical systems, enable emulation that bridges theory and practice. This paper presents our experimental results from AERPAW's digital twin, showcasing its ability to simulate UAV communication scenarios and providing insights into system performance and reliability.
comment: This article has been accepted for publication in the IEEE VTC Fall 2024--UAV Communication and Experimentation Workshop
Soft Tester UE: A Novel Approach for Open RAN Security Testing
With the rise of 5G and open radio access networks (O-RAN), there is a growing demand for customizable experimental platforms dedicated to security testing, as existing testbeds do not prioritize this area. Traditional, hardware-dependent testing methods pose challenges for smaller companies and research institutions. The growing wireless threat landscape highlights the critical need for proactive security testing, as 5G and O-RAN deployments are appealing targets for cybercriminals. To address these challenges, this article introduces the Soft Tester UE (soft T-UE), a software-defined test equipment designed to evaluate the security of 5G and O-RAN deployments via the Uu air interface between the user equipment (UE) and the network. The outcome is to deliver a free, open-source, and expandable test instrument to address the need for both standardized and customizable automated security testing. By extending beyond traditional security metrics, the soft T-UE promotes the development of new security measures and enhances the capability to anticipate and mitigate potential security breaches. The tool's automated testing capabilities are demonstrated through a scenario where the Radio Access Network (RAN) under test is evaluated when it receives fuzzed data when initiating a connection with a UE.
comment: This article has been accepted for publication in the IEEE VTC Fall 2024--RitiRAN Workshop
Distributed Area Coverage Control with Imprecise Robot Localization
This article examines the problem of area coverage for a network of mobile robots with imprecise agent localization. Each robot has uniform radial sensing ability and is governed by first-order kinodynamics. The convex space is partitioned based on the Guaranteed Voronoi (GV) principle, and each robot's area of responsibility corresponds to its GV-cell, bounded by hyperbolic arcs. The proposed control law is distributed, requires positioning information about each robot's GV-Delaunay neighbors, and has an inherent collision avoidance property.
comment: In proceedings of the 24th Mediterranean Conference on Control and Automation, 2016. 6 pages, 10 figures, video available at https://sotiris.papatheodorou.xyz/papers/2016_MED_PST/2016_MED_PST.mp4
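To make the Guaranteed Voronoi partition above more concrete, here is a minimal sketch of a cell-membership test. It assumes each robot's position uncertainty is modeled as a disk with a known center and radius (an assumption made for this example). A point belongs to robot i's GV-cell when it is closer to every possible position of robot i than to any possible position of every other robot. The function name and the sample values are hypothetical.

```python
import math


def in_guaranteed_cell(q, i, centers, radii):
    """Return True if point q lies in the Guaranteed Voronoi cell of robot i.

    Robot k is assumed to lie somewhere in the disk with center centers[k]
    and radius radii[k]. The worst-case (largest) distance from q to robot i
    must not exceed the best-case (smallest) distance from q to any robot j.
    """
    worst_to_i = math.dist(q, centers[i]) + radii[i]
    return all(
        worst_to_i <= math.dist(q, centers[j]) - radii[j]
        for j in range(len(centers))
        if j != i
    )


if __name__ == "__main__":
    centers = [(0.0, 0.0), (4.0, 0.0)]
    radii = [0.5, 0.5]
    for q in [(0.5, 0.0), (2.0, 0.0), (3.5, 0.0)]:
        owners = [i for i in range(len(centers)) if in_guaranteed_cell(q, i, centers, radii)]
        print(q, "->", owners)
```

With equality in the test, the boundary between two cells is the set of points whose distances to the two centers differ by the sum of the radii, which is a branch of a hyperbola. Points between the two branches are assigned to no robot, and with zero radii the test reduces to the ordinary Voronoi partition.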
Anomaly Detection and Inlet Pressure Prediction in Water Distribution Systems Using Machine Learning
This study presents two models to optimize pressure management in water distribution networks. The first model forecasts pressure at distribution points and compares predictions with actual data to detect anomalies such as leaks and blockages. Early detection allows for timely interventions, minimizing economic losses and ensuring system sustainability. The second model estimates the necessary inlet pressure based on the influence of various distribution points, ensuring consistent water supply while reducing waste and optimizing resource management. Both models utilize modern machine learning algorithms to enhance the prediction process. The methodology includes the CNN-EMD model, which analyzes historical data collected every 15 minutes over two months to predict future pressures. The Empirical Mode Decomposition (EMD) method identifies fluctuations and anomalies, improving prediction accuracy. The second model combines CNN, EMD, and LSTM techniques to forecast required inlet pressure, emphasizing the impact of distribution points. Results show that the CNN-EMD and CNN-EMD-LSTM models enhance pressure management capabilities, with the first model achieving an anomaly detection accuracy of 85% to 95% and the second model predicting inlet pressure with an average accuracy of 93%. This enables flexible system adjustments and identifies critical factors affecting inlet pressure. In conclusion, advanced machine learning models like CNN-EMD and LSTM significantly improve pressure management in water distribution networks, facilitating early issue identification, ensuring efficient water supply, and optimizing resource management for future generations.
comment: 13 pages, 14 figures
Towards Design and Development of a Low-Cost Unmanned Surface Vehicle for Aquaculture Water Quality Monitoring in Shallow Water Environments
Unmanned surface vessels (USVs) are typically autonomous or remotely operated and are specifically designed for environmental monitoring in various aquatic environments. Aquaculture requires constant monitoring and management of water quality for the health and productivity of aquaculture systems. Poor water quality can lead to disease outbreaks, reduced growth rates, and even mass mortality of cultured species. Many small aquaculture operations operate on tight budgets and in shallow water environments such as inland ponds, coastal lagoons, estuaries, and shallow rivers, particularly in developing regions. This leads to the foremost manoeuvrability challenge, underscoring the crucial need for agile, cost-effective USVs as efficient monitoring systems. The paper proposes a low-cost, 3D-printed, twin-hull, catamaran-style platform equipped with an Inertial Measurement Unit (IMU) and a Global Navigation Satellite System (GNSS), with a two-layered control framework and a differential drive configuration developed using two high-efficiency T200 thrusters. The design utilizes the Robot Operating System (ROS) to create the control framework and incorporates Extended Kalman Filter (EKF)-based sensor fusion techniques for localisation. The paper evaluates the USV's autonomy through open-water captive model experiments, employing remote control methods to assess the vessel's manoeuvrability and overall performance in shallow water conditions.
Quantify Gas-to-Power Fault Propagation Speed: A Semi-Implicit Simulation Approach
Relying heavily on the secure supply of natural gas, modern clean electric power systems are prone to gas disturbances induced by rupture and leakage faults. For the first time, this paper studies the cross-system propagation speed of these faults using a simulation-based approach. First, we establish the differential algebraic equation models of the rupture and leakage faults respectively. The boundary conditions at the fault locations are derived using the method of characteristics. Second, we propose utilizing a semi-implicit approach to perform post-fault simulations. The approach, based on the stiffly-accurate Rosenbrock scheme, combines the numerical stability of implicit methods with the computational burden of explicit ones. Therefore, the high-dimensional and multi-time-scale stiff models can be solved in an efficient and robust way. Third, to accurately locate the simulation events, which cannot be predicted a priori, we propose a critical-time-location strategy based on the continuous Runge-Kutta approach. In case studies, we verified the accuracy and the efficiency superiority of the proposed simulation approach. The impacts of gas faults on gas and power dynamics were investigated by simulation, where the critical events were identified accurately. We found that the fault propagation speed mainly depends on the fault position and is influenced by the pipe frictions. The bi-directional coupling between gas and power may lead to cascading failures.
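The semi-implicit idea mentioned above can be illustrated with the simplest member of the Rosenbrock family, the one-stage linearly implicit Euler method. This is a sketch under stated assumptions and not the stiffly-accurate multi-stage scheme used in the paper: it handles a scalar autonomous ODE, and the test problem, step size, and function names are chosen only for illustration. Each step needs one linear solve with the Jacobian and no Newton iteration.

```python
def rosenbrock_euler(f, dfdy, y0, h, steps):
    """Linearly implicit Euler for a scalar ODE y' = f(y).

    Each step solves (1 - h * J) * k = f(y) with J = df/dy evaluated at the
    current point, then updates y <- y + h * k. No nonlinear iteration is needed.
    """
    y = y0
    for _ in range(steps):
        k = f(y) / (1.0 - h * dfdy(y))
        y = y + h * k
    return y


def explicit_euler(f, y0, h, steps):
    """Plain explicit Euler, included for comparison on the same stiff problem."""
    y = y0
    for _ in range(steps):
        y = y + h * f(y)
    return y


def stiff_rhs(y):
    """Stiff test problem y' = -1000 * (y - 1), whose solution decays to 1."""
    return -1000.0 * (y - 1.0)


def stiff_jacobian(y):
    """Derivative of stiff_rhs with respect to y (constant for this problem)."""
    return -1000.0


if __name__ == "__main__":
    h, steps = 0.1, 5
    print("semi-implicit:", rosenbrock_euler(stiff_rhs, stiff_jacobian, 0.0, h, steps))
    print("explicit:     ", explicit_euler(stiff_rhs, 0.0, h, steps))
```

With a step size of 0.1, far above the explicit stability limit for this problem, the semi-implicit iterate settles near the steady state 1 while explicit Euler diverges.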
Procedural Generation of Communication Networks in Power Systems
Power system communication networks enable operators to remotely monitor and control field equipment. The sophistication of these networks is also increasing as operators continue the trend towards digitization, which is beneficial in integrating distributed energy resources. However, as the attack surface increases in size so too does the risk of cyberattacks. The topology, configuration and composition of communication networks are therefore confidential since this can provide information to attackers. As a result, the number of benchmarks available for research purposes is limited. A tool for procedurally generating communication network topologies is therefore proposed. While primarily intended as an enabler for public research into communication networks, this tool also allows general insights to be gained into the effect of communication network design on the vulnerability of networks to cyberattacks. The tool includes the ability to encapsulate network characteristics in JSON specification files, which is demonstrated with example Advanced Metering Infrastructure (AMI), Supervisory Control and Data Acquisition (SCADA) and Wide Area Monitoring (WAM) specification files. The SCADA network generation is then compared to a real-world case. Finally, the effect of network redundancy on the network's cyber resilience is investigated.
comment: 9 pages, 11 figures, originally presented at the 15th DACH+ Energy Informatics Doctoral Workshop
A Framework to Estimate Life Cycle Emissions for Vehicle-Integrated Photovoltaic Systems
This paper presents a framework to estimate the environmental impact of solar electric vehicles, accounting for the emissions caused by photovoltaic system production as well as vehicle use. We leverage a cradle-to-gate life cycle assessment to estimate the greenhouse gas emissions of the vehicle-integrated photovoltaic system, from the raw material extraction to the final panel assembly, including the effect of the electricity mix both at the factory location and in the country of use. Furthermore, we modify an existing optimization framework for battery electric vehicles to optimally design a solar electric vehicle and estimate its energy consumption. We showcase our framework by analyzing a case study where the mono-crystalline silicon extraction and refinement processes occur in China, while the final assembly of the panel is in The Netherlands, generating 118 kg of CO2 equivalents per square meter of solar panel. The results suggest that it is generally beneficial to operate solar electric vehicles in countries with a high irradiation index. However, when the local electricity mix already displays a low carbon intensity, the additional emissions introduced by the panel are unnecessary, requiring a longer vehicle lifetime to reach an advantageous emission balance.
comment: 6 pages, 8 figures, 2024 IEEE Vehicle Power and Propulsion Conference, Best Paper Award
Directed Testing of ORAN using a Partially Specified Declarative Digital Twin
Real-time performance testing can be divided into two distinct parts: system test and algorithm test. System test checks that the right functions operate on the right data within power, latency, and other constraints under all conditions. Major RAN OEMs put as much effort into system test and debug as they do into algorithm test, to ensure a competitive product. An algorithm tester will provide little insight into real-time and hardware-software (HW-SW) capacity as it is unaware of the system implementation. In this paper we present an innovative Digital Twin technology, which we call Declarative Digital Twin (DDT). A DDT can describe the system requirements of the RAN such that critical corner cases that would normally be missed by conventional testing can be found via automation. This is possible even when the RAN requirements are only partially specified. We present a Domain Specific Language (DSL) for declarative description of the RAN and show results from an automated solver that demonstrate how potential HW-SW implementation-related corner cases can be identified from the DDT of an ORAN DU.
comment: 5 pages, 7 figures, 1 table, presented at the First RitiRAN Workshop co-located with VTC Fall 2024
The Brain-Inspired Cooperative Shared Control Framework for Brain-Machine Interface
In brain-machine interface (BMI) applications, a key challenge is the low information content and high noise level in neural signals, severely affecting stable robotic control. To address this challenge, we propose a cooperative shared control framework based on brain-inspired intelligence, where control signals are decoded from neural activity, and the robot handles the fine control. This allows for a combination of flexible and adaptive interaction control between the robot and the brain, making intricate human-robot collaboration feasible. The proposed framework utilizes spiking neural networks (SNNs) for controlling the robotic arm and wheels, including speed and steering. While full integration of the system remains a future goal, individual modules for robotic arm control, object tracking, and map generation have been successfully implemented. The framework is expected to significantly enhance the performance of BMI. In practical settings, the BMI with cooperative shared control, utilizing a brain-inspired algorithm, will greatly enhance the potential for clinical applications.
comment: The content of this article needs to be updated
Sample-Efficient Linear Representation Learning from Non-IID Non-Isotropic Data ICLR 2024
A powerful concept behind much of the recent progress in machine learning is the extraction of common features across data from heterogeneous sources or tasks. Intuitively, using all of one's data to learn a common representation function benefits both computational effort and statistical generalization by leaving a smaller number of parameters to fine-tune on a given task. Toward theoretically grounding these merits, we propose a general setting of recovering linear operators $M$ from noisy vector measurements $y = Mx + w$, where the covariates $x$ may be both non-i.i.d. and non-isotropic. We demonstrate that existing isotropy-agnostic representation learning approaches incur biases on the representation update, which causes the scaling of the noise terms to lose favorable dependence on the number of source tasks. This in turn can cause the sample complexity of representation learning to be bottlenecked by the single-task data size. We introduce an adaptation, $\texttt{De-bias & Feature-Whiten}$ ($\texttt{DFW}$), of the popular alternating minimization-descent scheme proposed independently in Collins et al. (2021) and Nayer and Vaswani (2022), and establish linear convergence to the optimal representation with noise level scaling down with the $\textit{total}$ source data size. This leads to generalization bounds on the same order as an oracle empirical risk minimizer. We verify the vital importance of $\texttt{DFW}$ on various numerical simulations. In particular, we show that vanilla alternating-minimization descent fails catastrophically even for i.i.d. but mildly non-isotropic data. Our analysis unifies and generalizes prior work, and provides a flexible framework for a wider range of applications, such as in controls and dynamical systems.
comment: Appeared at ICLR 2024 (spotlight presentation)
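For readers unfamiliar with the measurement model $y = Mx + w$ in the abstract above, the following sketch generates non-isotropic covariates and recovers a single 2x2 operator by ordinary least squares. It only illustrates the problem setting; it is not the DFW algorithm and does not involve multiple tasks or a shared representation. The operator entries, the covariate scales, the noise level, and all function names are hypothetical.

```python
import random


def estimate_operator(xs, ys):
    """Ordinary least-squares estimate of a 2x2 matrix M from pairs with y ~ M x.

    Computes M_hat = (sum y x^T) (sum x x^T)^{-1} using an explicit 2x2 inverse.
    """
    sxx = [[0.0, 0.0], [0.0, 0.0]]
    syx = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        for i in range(2):
            for j in range(2):
                sxx[i][j] += x[i] * x[j]
                syx[i][j] += y[i] * x[j]
    det = sxx[0][0] * sxx[1][1] - sxx[0][1] * sxx[1][0]
    inv = [
        [sxx[1][1] / det, -sxx[0][1] / det],
        [-sxx[1][0] / det, sxx[0][0] / det],
    ]
    return [
        [sum(syx[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def make_data(m_true, n, rng, scales=(3.0, 0.3), noise=0.05):
    """Draw n samples with non-isotropic covariates (different scale per axis)."""
    xs, ys = [], []
    for _ in range(n):
        x = [scales[0] * rng.gauss(0.0, 1.0), scales[1] * rng.gauss(0.0, 1.0)]
        y = [
            sum(m_true[i][j] * x[j] for j in range(2)) + noise * rng.gauss(0.0, 1.0)
            for i in range(2)
        ]
        xs.append(x)
        ys.append(y)
    return xs, ys


if __name__ == "__main__":
    m_true = [[1.0, -0.5], [0.25, 2.0]]
    xs, ys = make_data(m_true, 2000, random.Random(0))
    for row in estimate_operator(xs, ys):
        print([round(v, 3) for v in row])
```

The second column of the estimate is noisier than the first because the second covariate axis has a much smaller scale, which is one simple way non-isotropy shows up in the estimation error.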
Self-tuning moving horizon estimation of nonlinear systems via physics-informed machine learning Koopman modeling
In this paper, we propose a physics-informed learning-based Koopman modeling approach and present a Koopman-based self-tuning moving horizon estimation design for a class of nonlinear systems. Specifically, we train Koopman operators and two neural networks - the state lifting network and the noise characterization network - using both data and available physical information. The two neural networks account for the nonlinear lifting functions for Koopman modeling and describing system noise distributions, respectively. Accordingly, a stochastic linear Koopman model is established in the lifted space to forecast the dynamic behavior of the nonlinear system. Based on the Koopman model, a self-tuning linear moving horizon estimation (MHE) scheme is developed. The weighting matrices of the MHE design are updated using the pre-trained noise characterization network at each sampling instant. The proposed estimation scheme is computationally efficient because only convex optimization is involved during online implementation, and updating the weighting matrices of the MHE scheme does not require re-training the neural networks. We verify the effectiveness and evaluate the performance of the proposed method via the application to a simulated chemical process.
comment: 31 pages, 7 figures
Hybrid Feedback for Three-dimensional Convex Obstacle Avoidance (Extended version)
We propose a hybrid feedback control scheme for the autonomous robot navigation problem in three-dimensional environments with arbitrarily-shaped convex obstacles. The proposed hybrid control strategy, which consists in switching between the move-to-target mode and the obstacle-avoidance mode, guarantees global asymptotic stability of the target location in the obstacle-free workspace. We also provide a procedure for the implementation of the proposed hybrid controller in a priori unknown environments and validate its effectiveness through simulation results.
comment: 11 pages, 3 figures
Online Control with Adversarial Disturbance for Continuous-time Linear Systems
We study online control for continuous-time linear systems with finite sampling rates, where the objective is to design an online procedure that learns under non-stochastic noise and performs comparably to a fixed optimal linear controller. We present a novel two-level online algorithm, by integrating a higher-level learning strategy and a lower-level feedback control strategy. This method offers a practical and robust solution for online control, which achieves sublinear regret. Our work provides the first nonasymptotic results for controlling continuous-time linear systems with a finite number of interactions with the system. Moreover, we examine how to train an agent in domain randomization environments from a non-stochastic control perspective. By applying our method to the SAC (Soft Actor-Critic) algorithm, we achieved improved results in multiple reinforcement learning tasks within domain randomization environments. Our work provides new insights into non-asymptotic analyses of controlling continuous-time systems. Furthermore, our work brings practical intuition into controller learning under non-stochastic environments.
Study on the Time Domain Precision Evolution Mechanism of CNC Machine Tool Feed Systems Based on Acceleration and Deceleration Capability Indicator
The escalating demand for high-speed and high-precision machining in machine tool feed systems has brought the challenge of their design to the forefront. Currently, existing methodologies struggle to ascertain compliance with dynamic performance requirements during the design phase, often resulting in either excessive or insufficient design. Therefore, there is an urgent need for research focused on feed system design methods that directly address time domain dynamic precision. The dynamic precision of the feed system is influenced by the motor, mechanical structure, motion processes, and control system. However, existing studies on the impact mechanisms of electromechanical matching on feed system precision often overlook the roles of control and motion processes. This paper innovatively proposes the need to consider the coupling effects among subsystems, directing the optimization design of CNC machine tool feed systems towards time domain dynamic precision. Furthermore, it introduces acceleration and deceleration capability as a key indicator of electromechanical matching. Following the decoupling of control system parameters, this study elucidates the influence mechanisms of electromechanical matching on the overall dynamic performance of the feed system under various motion processes. This research offers a novel design philosophy and theoretical foundation for the optimization of CNC machine tool feed systems.
Systems and Control (EESS)
Optimal Inferential Control of Convolutional Neural Networks
Convolutional neural networks (CNNs) have achieved remarkable success in representing and simulating complex spatio-temporal dynamic systems within the burgeoning field of scientific machine learning. However, optimal control of CNNs poses a formidable challenge, because the ultra-high dimensionality and strong nonlinearity inherent in CNNs render them resistant to traditional gradient-based optimal control techniques. To tackle the challenge, we propose an optimal inferential control framework for CNNs that represent a complex spatio-temporal system, which sequentially infers the best control decisions based on the specified control objectives. This reformulation opens up the utilization of sequential Monte Carlo sampling, which is efficient in searching through high-dimensional spaces for nonlinear inference. We specifically leverage ensemble Kalman smoothing, a sequential Monte Carlo algorithm, to take advantage of its computational efficiency for nonlinear high-dimensional systems. Further, to harness graphics processing units (GPUs) to accelerate the computation, we develop a new sequential ensemble Kalman smoother based on matrix variate distributions. The smoother is capable of directly handling matrix-based inputs and outputs of CNNs without vectorization to fit with the parallelized computing architecture of GPUs. Numerical experiments show that the proposed approach is effective in controlling spatio-temporal systems with high-dimensional state and control spaces. All the code and data are available at https://github.com/Alivaziri/Optimal-Inferential-Control-of-CNNs.
LSTM-Based Proactive Congestion Management for Internet of Vehicle Networks
Vehicle-to-everything (V2X) networks support a variety of safety, entertainment, and commercial applications. This is realized by applying the principles of the Internet of Vehicles (IoV) to facilitate connectivity among vehicles and between vehicles and roadside units (RSUs). Network congestion management is essential for IoVs and it represents a significant concern due to its impact on improving the efficiency of transportation systems and providing reliable communication among vehicles for the timely delivery of safety-critical packets. This paper introduces a framework for proactive congestion management for IoV networks. We generate congestion scenarios and a data set to predict the congestion using LSTM. We present the framework and the packet congestion dataset. Simulation results using SUMO with NS3 demonstrate the effectiveness of the framework for forecasting IoV network congestion and clustering/prioritizing packets employing recurrent neural networks.
comment: This article has been accepted for publication in the IEEE VTC Fall 2024
Advancing Experimental Platforms for UAV Communications: Insights from AERPAW's Digital Twin
The rapid evolution of 5G and beyond has advanced space-air-terrestrial networks, with unmanned aerial vehicles (UAVs) offering enhanced coverage, flexible configurations, and cost efficiency. However, deploying UAV-based systems presents challenges including varying propagation conditions and hardware limitations. While simulators and theoretical models have been developed, real-world experimentation is critically important to validate the research. Digital twins, virtual replicas of physical systems, enable emulation that bridges theory and practice. This paper presents our experimental results from AERPAW's digital twin, showcasing its ability to simulate UAV communication scenarios and providing insights into system performance and reliability.
comment: This article has been accepted for publication in the IEEE VTC Fall 2024--UAV Communication and Experimentation Workshop
Soft Tester UE: A Novel Approach for Open RAN Security Testing
With the rise of 5G and open radio access networks (O-RAN), there is a growing demand for customizable experimental platforms dedicated to security testing, as existing testbeds do not prioritize this area. Traditional, hardware-dependent testing methods pose challenges for smaller companies and research institutions. The growing wireless threat landscape highlights the critical need for proactive security testing, as 5G and O-RAN deployments are appealing targets for cybercriminals. To address these challenges, this article introduces the Soft Tester UE (soft T-UE), a software-defined test equipment designed to evaluate the security of 5G and O-RAN deployments via the Uu air interface between the user equipment (UE) and the network. The outcome is to deliver a free, open-source, and expandable test instrument to address the need for both standardized and customizable automated security testing. By extending beyond traditional security metrics, the soft T-UE promotes the development of new security measures and enhances the capability to anticipate and mitigate potential security breaches. The tool's automated testing capabilities are demonstrated through a scenario where the Radio Access Network (RAN) under test is evaluated when it receives fuzzed data while initiating a connection with a UE.
comment: This article has been accepted for publication in the IEEE VTC Fall 2024--RitiRAN Workshop
Distributed Area Coverage Control with Imprecise Robot Localization
This article examines the problem of area coverage for a network of mobile robots with imprecise agent localization. Each robot has uniform radial sensing ability, governed by first order kinodynamics. The convex-space is partitioned based on the Guaranteed Voronoi (GV) principle and each robot's area of responsibility corresponds to its GV-cell, bounded by hyperbolic arcs. The proposed control law is distributed, demands the positioning information about its GV-Delaunay neighbors and has an inherent collision avoidance property.
comment: In proceedings of the 24th Mediterranean Conference on Control and Automation, 2016. 6 pages, 10 figures, video available at https://sotiris.papatheodorou.xyz/papers/2016_MED_PST/2016_MED_PST.mp4
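The Guaranteed Voronoi partition mentioned in the abstract above can be illustrated with a small membership test. This is a minimal sketch, assuming each robot's position uncertainty is a disk with a known center and radius (an assumption not stated in the abstract): a point q is assigned to robot i only if it is closer to every possible position of robot i than to any possible position of every other robot, i.e. ||q - c_i|| + r_i <= ||q - c_j|| - r_j for all j != i. The function name `gv_cell_owner` and all values are hypothetical.

```python
import math

def gv_cell_owner(q, centers, radii):
    """Return the index of the robot whose Guaranteed Voronoi cell contains q.

    Returns None when q lies in the unassigned region between cells.
    Assumes each robot i is somewhere inside a disk (centers[i], radii[i]).
    """
    robots = list(zip(centers, radii))
    for i, (c_i, r_i) in enumerate(robots):
        # Farthest that robot i could possibly be from q.
        worst_case_i = math.dist(q, c_i) + r_i
        # q is guaranteed to robot i if even its worst case beats
        # the best case (nearest possible position) of every other robot.
        if all(worst_case_i <= math.dist(q, c_j) - r_j
               for j, (c_j, r_j) in enumerate(robots) if j != i):
            return i
    return None
```

With equality in the condition above, the cell boundary satisfies ||q - c_j|| - ||q - c_i|| = r_i + r_j, which is a hyperbola branch; this is consistent with the hyperbolic arcs the abstract describes.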
Anomaly Detection and Inlet Pressure Prediction in Water Distribution Systems Using Machine Learning
This study presents two models to optimize pressure management in water distribution networks. The first model forecasts pressure at distribution points and compares predictions with actual data to detect anomalies such as leaks and blockages. Early detection allows for timely interventions, minimizing economic losses and ensuring system sustainability. The second model estimates the necessary inlet pressure based on the influence of various distribution points, ensuring consistent water supply while reducing waste and optimizing resource management. Both models utilize modern machine learning algorithms to enhance the prediction process. The methodology includes the CNN-EMD model, which analyzes historical data collected every 15 minutes over two months to predict future pressures. The Empirical Mode Decomposition (EMD) method identifies fluctuations and anomalies, improving prediction accuracy. The second model combines CNN, EMD, and LSTM techniques to forecast required inlet pressure, emphasizing the impact of distribution points. Results show that the CNN-EMD and CNN-EMD-LSTM models enhance pressure management capabilities, with the first model achieving an anomaly detection accuracy of 85% to 95% and the second model predicting inlet pressure with an average accuracy of 93%. This enables flexible system adjustments and identifies critical factors affecting inlet pressure. In conclusion, advanced machine learning models like CNN-EMD and LSTM significantly improve pressure management in water distribution networks, facilitating early issue identification, ensuring efficient water supply, and optimizing resource management for future generations.
comment: 13 pages, 14 figures
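The first model's idea of comparing forecasts with measurements to detect anomalies can be sketched as a simple residual check. This is only an illustration under stated assumptions: it is not the CNN-EMD model, the forecasts are taken as given, and the function name `flag_anomalies` and the fixed threshold are hypothetical.

```python
def flag_anomalies(predicted, actual, threshold):
    """Return indices where the measured pressure deviates from the forecast.

    predicted, actual: equal-length sequences of pressure readings.
    threshold: maximum tolerated absolute residual, in the same units.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return [
        i
        for i, (p, a) in enumerate(zip(predicted, actual))
        if abs(a - p) > threshold  # large residual suggests a leak or blockage
    ]
```

In practice the threshold would have to be calibrated on historical residuals; a fixed value is used here only to keep the sketch short.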
Towards Design and Development of a Low-Cost Unmanned Surface Vehicle for Aquaculture Water Quality Monitoring in Shallow Water Environments
Unmanned surface vessels (USVs) are typically autonomous or remotely operated and are specifically designed for environmental monitoring in various aquatic environments. Aquaculture requires constant monitoring and management of water quality for the health and productivity of aquaculture systems. Poor water quality can lead to disease outbreaks, reduced growth rates, and even mass mortality of cultured species. Many small aquaculture operations operate on tight budgets and in shallow water environments such as inland ponds, coastal lagoons, estuaries, and shallow rivers, particularly in developing regions. This leads to the foremost manoeuvrability challenge, underscoring the crucial need for agile, cost-effective USVs as efficient monitoring systems. The paper proposes a low-cost, 3D-printed, twin-hull catamaran-style platform equipped with an Inertial Measurement Unit (IMU) and a Global Navigation Satellite System (GNSS), with a two-layered control framework and a differential drive configuration developed using two high-efficiency T200 thrusters. The design utilizes the Robot Operating System (ROS) to create the control framework and incorporates Extended Kalman Filter (EKF) based sensor fusion techniques for localisation. The paper evaluates the USV's autonomy through open-water captive model experiments, employing remote control methods to assess the vessel's manoeuvrability and overall performance in shallow water conditions.
Quantify Gas-to-Power Fault Propagation Speed: A Semi-Implicit Simulation Approach
Relying heavily on the secure supply of natural gas, the modern clean electric power systems are prone to the gas disturbances induced by the inherent rupture and leakage faults. For the first time, this paper studies the cross-system propagation speed of these faults using a simulation-based approach. Firstly, we establish the differential algebraic equation models of the rupture and leakage faults respectively. The boundary conditions at the fault locations are derived using the method of characteristics. Secondly, we propose utilizing a semi-implicit approach to perform post-fault simulations. The approach, based on the stiffly-accurate Rosenbrock scheme, possesses the implicit numerical stability and explicit computation burdens. Therefore, the high-dimensional and multi-time-scale stiff models can be solved in an efficient and robust way. Thirdly, to accurately locate the simulation events, which can not be predicted a priori, we propose a critical-time-location strategy based on the continuous Runge-Kutta approach. In case studies, we verified the accuracy and the efficiency superiority of the proposed simulation approach. The impacts of gas faults on gas and power dynamics were investigated by simulation, where the critical events were identified accurately. We found that the fault propagation speed mainly depends on the fault position and is influenced by the pipe frictions. The bi-directional coupling between gas and power may lead to cascading failures.
Procedural Generation of Communication Networks in Power Systems
Power system communication networks enable operators to remotely monitor and control field equipment. The sophistication of these networks is also increasing as operators continue the trend towards digitization, which is beneficial in integrating distributed energy resources. However, as the attack surface increases in size so too does the risk of cyberattacks. The topology, configuration and composition of communication networks are therefore confidential since they can provide information to attackers. As a result, the number of benchmarks available for research purposes is limited. A tool for procedurally generating communication network topologies is therefore proposed. While primarily intended as an enabler for public research into communication networks, this tool also allows general insights to be gained into the effect of communication network design on the vulnerability of networks to cyberattacks. The tool includes the ability to encapsulate network characteristics in JSON specification files, which is demonstrated with example Advanced Metering Infrastructure (AMI), Supervisory Control and Data Acquisition (SCADA) and Wide Area Monitoring (WAM) specification files. The SCADA network generation is then compared to a real-world case. Finally, the effect of network redundancy on the network's cyber resilience is investigated.
comment: 9 pages, 11 figures, originally presented at the 15th DACH+ Energy Informatics Doctoral Workshop
A Framework to Estimate Life Cycle Emissions for Vehicle-Integrated Photovoltaic Systems
This paper presents a framework to estimate the environmental impact of solar electric vehicles, accounting for the emissions caused by photovoltaic system production as well as vehicle use. We leverage a cradle-to-gate life cycle assessment to estimate the greenhouse gas emissions of the vehicle-integrated photovoltaic system, from the raw material extraction to the final panel assembly, including the effect of the electricity mix both at the factory location and in the country of use. Furthermore, we modify an existing optimization framework for battery electric vehicles to optimally design a solar electric vehicle and estimate its energy consumption. We showcase our framework by analyzing a case study where the mono-crystalline silicon extraction and refinement processes occur in China, while the final assembly of the panel is in The Netherlands, generating 118 kg of CO2 equivalents per square meter of solar panel. The results suggest that it is generally beneficial to operate solar electric vehicles in countries with a high irradiation index. However, when the local electricity mix already displays a low carbon intensity, the additional emissions introduced by the panel are unnecessary, requiring a longer vehicle lifetime to reach an advantageous emission balance.
comment: 6 pages, 8 figures, 2024 IEEE Vehicle Power and Propulsion Conference, Best Paper Award
Directed Testing of ORAN using a Partially Specified Declarative Digital Twin
Real-time performance testing can be divided into two distinct parts: system test and algorithm test. System test checks that the right functions operate on the right data within power, latency, and other constraints under all conditions. Major RAN OEMs put as much effort into system test and debug as they do into algorithm test, to ensure a competitive product. An algorithm tester will provide little insight into real-time and hardware-software (HW-SW) capacity as it is unaware of the system implementation. In this paper we present an innovative Digital Twin technology, which we call Declarative Digital Twin (DDT). A DDT can describe the system requirements of the RAN such that critical corner cases that would normally be missed by conventional testing can be found via automation. This is possible even when the RAN requirements are only partially specified. We present a Domain Specific Language (DSL) for declarative description of the RAN and show results from an automated solver that demonstrate how potential HW-SW implementation-related corner cases can be identified from the DDT of an ORAN DU.
comment: 5 pages, 7 figures, 1 table, presented at the First RitiRAN Workshop co-located with VTC Fall 2024
The Brain-Inspired Cooperative Shared Control Framework for Brain-Machine Interface
In brain-machine interface (BMI) applications, a key challenge is the low information content and high noise level in neural signals, severely affecting stable robotic control. To address this challenge, we propose a cooperative shared control framework based on brain-inspired intelligence, where control signals are decoded from neural activity, and the robot handles the fine control. This allows for a combination of flexible and adaptive interaction control between the robot and the brain, making intricate human-robot collaboration feasible. The proposed framework utilizes spiking neural networks (SNNs) for controlling the robotic arm and wheels, including speed and steering. While full integration of the system remains a future goal, individual modules for robotic arm control, object tracking, and map generation have been successfully implemented. The framework is expected to significantly enhance the performance of BMI. In practical settings, the BMI with cooperative shared control, utilizing a brain-inspired algorithm, will greatly enhance the potential for clinical applications.
comment: The content of this article needs to be updated
Sample-Efficient Linear Representation Learning from Non-IID Non-Isotropic Data ICLR 2024
A powerful concept behind much of the recent progress in machine learning is the extraction of common features across data from heterogeneous sources or tasks. Intuitively, using all of one's data to learn a common representation function benefits both computational effort and statistical generalization by leaving a smaller number of parameters to fine-tune on a given task. Toward theoretically grounding these merits, we propose a general setting of recovering linear operators $M$ from noisy vector measurements $y = Mx + w$, where the covariates $x$ may be both non-i.i.d. and non-isotropic. We demonstrate that existing isotropy-agnostic representation learning approaches incur biases on the representation update, which causes the scaling of the noise terms to lose favorable dependence on the number of source tasks. This in turn can cause the sample complexity of representation learning to be bottlenecked by the single-task data size. We introduce an adaptation, $\texttt{De-bias & Feature-Whiten}$ ($\texttt{DFW}$), of the popular alternating minimization-descent scheme proposed independently in Collins et al., (2021) and Nayer and Vaswani (2022), and establish linear convergence to the optimal representation with noise level scaling down with the $\textit{total}$ source data size. This leads to generalization bounds on the same order as an oracle empirical risk minimizer. We verify the vital importance of $\texttt{DFW}$ on various numerical simulations. In particular, we show that vanilla alternating-minimization descent fails catastrophically even for iid, but mildly non-isotropic data. Our analysis unifies and generalizes prior work, and provides a flexible framework for a wider range of applications, such as in controls and dynamical systems.
comment: Appeared at ICLR 2024 (spotlight presentation)
Self-tuning moving horizon estimation of nonlinear systems via physics-informed machine learning Koopman modeling
In this paper, we propose a physics-informed learning-based Koopman modeling approach and present a Koopman-based self-tuning moving horizon estimation design for a class of nonlinear systems. Specifically, we train Koopman operators and two neural networks - the state lifting network and the noise characterization network - using both data and available physical information. The two neural networks account for the nonlinear lifting functions for Koopman modeling and describing system noise distributions, respectively. Accordingly, a stochastic linear Koopman model is established in the lifted space to forecast the dynamic behavior of the nonlinear system. Based on the Koopman model, a self-tuning linear moving horizon estimation (MHE) scheme is developed. The weighting matrices of the MHE design are updated using the pre-trained noise characterization network at each sampling instant. The proposed estimation scheme is computationally efficient because only convex optimization is involved during online implementation, and updating the weighting matrices of the MHE scheme does not require re-training the neural networks. We verify the effectiveness and evaluate the performance of the proposed method via the application to a simulated chemical process.
comment: 31 pages, 7 figures
Hybrid Feedback for Three-dimensional Convex Obstacle Avoidance (Extended version)
We propose a hybrid feedback control scheme for the autonomous robot navigation problem in three-dimensional environments with arbitrarily-shaped convex obstacles. The proposed hybrid control strategy, which consists in switching between the move-to-target mode and the obstacle-avoidance mode, guarantees global asymptotic stability of the target location in the obstacle-free workspace. We also provide a procedure for the implementation of the proposed hybrid controller in a priori unknown environments and validate its effectiveness through simulation results.
comment: 11 pages, 3 figures
Online Control with Adversarial Disturbance for Continuous-time Linear Systems
We study online control for continuous-time linear systems with finite sampling rates, where the objective is to design an online procedure that learns under non-stochastic noise and performs comparably to a fixed optimal linear controller. We present a novel two-level online algorithm, by integrating a higher-level learning strategy and a lower-level feedback control strategy. This method offers a practical and robust solution for online control, which achieves sublinear regret. Our work provides the first nonasymptotic results for controlling continuous-time linear systems with finite number of interactions with the system. Moreover, we examine how to train an agent in domain randomization environments from a non-stochastic control perspective. By applying our method to the SAC (Soft Actor-Critic) algorithm, we achieved improved results in multiple reinforcement learning tasks within domain randomization environments. Our work provides new insights into non-asymptotic analyses of controlling continuous-time systems. Furthermore, our work brings practical intuition into controller learning under non-stochastic environments.
Study on the Time Domain Precision Evolution Mechanism of CNC Machine Tool Feed Systems Based on Acceleration and Deceleration Capability Indicator
The escalating demand for high-speed and high-precision machining in machine tool feed system has brought to the forefront the challenge of its design method. Currently, existing methodologies struggle to ascertain compliance with dynamic performance requirements during the design phase, often resulting in either excessive or insufficient design. Therefore, there is an urgent need for research focused on feed system design methods that directly address time domain dynamic precision. The dynamic precision of the feed system is influenced by the motor, mechanical structure, motion processes, and control system. However, existing studies on the impact mechanisms of electromechanical matching on feed system precision often overlook the roles of control and motion processes. This paper innovatively proposes the need to consider the coupling effects among subsystems, directing the optimization design of CNC machine tool feed systems towards time domain dynamic precision. Furthermore, it introduces acceleration and deceleration capability as a key indicator of electromechanical matching. Following the decoupling of control system parameters, this study elucidates the influence mechanisms of electromechanical matching on the overall dynamic performance of the feed system under various motion processes. This research offers a novel design philosophy and theoretical foundation for the optimization of CNC machine tool feed systems.
Multiagent Systems
Two Heads Are Better Than One: A Multi-Agent System Has the Potential to Improve Scientific Idea Generation
The rapid advancement of scientific progress requires innovative tools that can accelerate discovery. While recent AI methods, particularly large language models (LLMs), have shown promise in tasks such as hypothesis generation and experimental design, they fall short in replicating the collaborative nature of real-world scientific practices, where diverse teams of experts work together to tackle complex problems. To address the limitation, we propose an LLM-based multi-agent system, i.e., Virtual Scientists (VirSci), designed to mimic the teamwork inherent in scientific research. VirSci organizes a team of agents to collaboratively generate, evaluate, and refine research ideas. Through comprehensive experiments, we demonstrate that this multi-agent approach outperforms the state-of-the-art method in producing novel and impactful scientific ideas, showing potential in aligning with key insights in the Science of Science field. Our findings suggest that integrating collaborative agents can lead to more innovative scientific outputs, offering a robust system for autonomous scientific discovery.
Distributed Optimization Methods for Multi-Robot Systems: Part II -- A Survey
Although the field of distributed optimization is well-developed, relevant literature focused on the application of distributed optimization to multi-robot problems is limited. This survey constitutes the second part of a two-part series on distributed optimization applied to multi-robot problems. In this paper, we survey three main classes of distributed optimization algorithms -- distributed first-order methods, distributed sequential convex programming methods, and alternating direction method of multipliers (ADMM) methods -- focusing on fully-distributed methods that do not require coordination or computation by a central computer. We describe the fundamental structure of each category and note important variations around this structure, designed to address its associated drawbacks. Further, we provide practical implications of noteworthy assumptions made by distributed optimization algorithms, noting the classes of robotics problems suitable for these algorithms. Moreover, we identify important open research challenges in distributed optimization, specifically for robotics problems.
comment: arXiv admin note: substantial text overlap with arXiv:2103.12840
Distributed Optimization Methods for Multi-Robot Systems: Part I -- A Tutorial
Distributed optimization provides a framework for deriving distributed algorithms for a variety of multi-robot problems. This tutorial constitutes the first part of a two-part series on distributed optimization applied to multi-robot problems, which seeks to advance the application of distributed optimization in robotics. In this tutorial, we demonstrate that many canonical multi-robot problems can be cast within the distributed optimization framework, such as multi-robot simultaneous localization and mapping (SLAM), multi-robot target tracking, and multi-robot task assignment problems. We identify three broad categories of distributed optimization algorithms: distributed first-order methods, distributed sequential convex programming, and the alternating direction method of multipliers (ADMM). We describe the basic structure of each category and provide representative algorithms within each category. We then work through a simulation case study of multiple drones collaboratively tracking a ground vehicle. We compare solutions to this problem using a number of different distributed optimization algorithms. In addition, we implement a distributed optimization algorithm in hardware on a network of Raspberry Pis communicating with XBee modules to illustrate robustness to the challenges of real-world communication networks.
Robotics
Design and Performance Evaluation of an Elbow-Based Biomechanical Energy Harvester
Carbon emissions have long been attributed to the increase in climate change. With the effects of climate change escalating in the past few years, there has been an increased effort to find green alternatives to power generation, which has been a major contributor to carbon emissions. One prominent way that has arisen is biomechanical energy, or harvesting energy based on natural human movement. This study will evaluate the feasibility of electric generation using a gear and generator-based biomechanical energy harvester in the elbow joint. The joint was chosen using kinetic arm analysis through MediaPipe, in which the elbow joint showed much higher angular velocity during walking, thus showing more potential as a place to construct the harvester. Leg joints were excluded to not obstruct daily movement. The gear and generator type was decided to maximize energy production in the elbow joint. The device was constructed using a gearbox and a generator. The results show that it generated as much as 0.16 watts using the optimal resistance. This demonstrates the feasibility of electric generation with an elbow joint gear and generator-type biomechanical energy harvester.
comment: 8 pages, 9 figures
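The abstract above reports the harvester's output "using the optimal resistance". One common way to reason about such an optimum is the maximum power transfer condition. The sketch below assumes the generator can be modeled as an ideal voltage source in series with a fixed internal resistance, which the abstract does not state; the function names and all numeric values are hypothetical and are not the paper's measurements.

```python
def load_power(v_open, r_internal, r_load):
    """Power (W) dissipated in a resistive load fed by a source with
    open-circuit voltage v_open (V) and internal resistance r_internal (ohm)."""
    current = v_open / (r_internal + r_load)
    return current ** 2 * r_load

def best_load(v_open, r_internal, candidates):
    """Pick the candidate load resistance that maximizes delivered power."""
    return max(candidates, key=lambda r: load_power(v_open, r_internal, r))
```

Under this model the delivered power peaks when the load resistance equals the internal resistance, so sweeping a set of candidate resistors and keeping the best one is a simple way to locate the optimum experimentally.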
Design and Control of an Omnidirectional Aerial Robot with a Miniaturized Haptic Joystick for Physical Interaction
Fully actuated aerial robots have proved their superiority for Aerial Physical Interaction (APhI) over the past years. This work proposes a minimal setup for aerial telemanipulation, enhancing accessibility of these technologies. The design and the control of a 6-DoF joystick with 4-DoF haptic feedback are detailed. It is the first haptic device with a standard Remote Controller (RC) form factor for APhI. By miniaturizing the haptic device, it enhances the RC with the sense of touch, increasing physical awareness. The goal is to give operators an extra sense, other than vision and sound, to help perform safe APhI. To the best of the authors' knowledge, this is the first teleoperation system able to decouple each single axis input command. On the omnidirectional quadrotor, by reducing the number of components with a new design, we aim for simplified maintenance and improved force- and thrust-to-weight ratios. Open-source physics-based simulation and successful preliminary flight tests highlight the tool as promising for future APhI applications.
comment: 6 pages, 6 figures
Voxel-SLAM: A Complete, Accurate, and Versatile LiDAR-Inertial SLAM System
In this work, we present Voxel-SLAM: a complete, accurate, and versatile LiDAR-inertial SLAM system that fully utilizes short-term, mid-term, long-term, and multi-map data associations to achieve real-time estimation and high precision mapping. The system consists of five modules: initialization, odometry, local mapping, loop closure, and global mapping, all employing the same map representation, an adaptive voxel map. The initialization provides an accurate initial state estimation and a consistent local map for subsequent modules, enabling the system to start with a highly dynamic initial state. The odometry, exploiting the short-term data association, rapidly estimates current states and detects potential system divergence. The local mapping, exploiting the mid-term data association, employs a local LiDAR-inertial bundle adjustment (BA) to refine the states (and the local map) within a sliding window of recent LiDAR scans. The loop closure detects previously visited places in the current and all previous sessions. The global mapping refines the global map with an efficient hierarchical global BA. The loop closure and global mapping both exploit long-term and multi-map data associations. We conducted a comprehensive benchmark comparison with other state-of-the-art methods across 30 sequences from three representative scenes, including narrow indoor environments using hand-held equipment, large-scale wilderness environments with aerial robots, and urban environments on vehicle platforms. Other experiments demonstrate the robustness and efficiency of the initialization, the capacity to work in multiple sessions, and relocalization in degenerated environments.
Implicit Graph Search for Planning on Graphs of Convex Sets
Graphs of Convex Sets (GCS) is a recent method for synthesizing smooth trajectories by decomposing the planning space into convex sets, forming a graph to encode the adjacency relationships within the decomposition, and then simultaneously searching this graph and optimizing parts of the trajectory to obtain the final trajectory. To do this, one must solve a Mixed Integer Convex Program (MICP) and to mitigate computational time, GCS proposes a convex relaxation that is empirically very tight. Despite this tight relaxation, motion planning with GCS for real-world robotics problems translates to solving the simultaneous batch optimization problem that may contain millions of constraints and therefore can be slow. This is further exacerbated by the fact that the size of the GCS problem is invariant to the planning query. Motivated by the observation that the trajectory solution lies only on a fraction of the set of convex sets, we present two implicit graph search methods for planning on the graph of convex sets called INSATxGCS (IxG) and IxG*. INterleaved Search And Trajectory optimization (INSAT) is a previously developed algorithm that alternates between searching on a graph and optimizing partial paths to find a smooth trajectory. By using an implicit graph search method INSAT on the graph of convex sets, we achieve faster planning while ensuring stronger guarantees on completeness and optimality. Moreover, introducing a search-based technique to plan on the graph of convex sets enables us to easily leverage well-established techniques such as search parallelization, lazy planning, anytime planning, and replanning as future work. Numerical comparisons against GCS demonstrate the superiority of IxG across several applications, including planning for an 18-degree-of-freedom multi-arm assembly scenario.
Dynamic Benchmarks: Spatial and Temporal Alignment for ADS Performance Evaluation
SAE Level 4+ Automated Driving Systems (ADS) without a human driver are currently deployed in operational ride-hailing fleets on surface streets in the United States. This current use case and future applications of this technology will determine where and when the fleets operate, potentially resulting in a divergence from the distribution of driving of some human benchmark population within a given locality. Existing benchmarks for evaluating ADS performance have only done county-level geographical matching of the ADS and benchmark driving exposure in crash rates. This study presents a novel methodology for constructing dynamic human benchmarks that adjust for spatial and temporal variations in driving distribution between an ADS and the overall human-driven fleet. Dynamic benchmarks were generated using human police-reported crash data, human vehicle miles traveled (VMT) data, and over 20 million miles of Waymo's rider-only (RO) operational data accumulated across three US counties. The spatial adjustment revealed significant differences across various severity levels in adjusted crash rates compared to unadjusted benchmarks, with these differences ranging from 10% to 47% higher in San Francisco, 12% to 20% higher in Maricopa, and 7% lower to 34% higher in Los Angeles counties. The time-of-day adjustment in San Francisco, limited to this region due to data availability, resulted in adjusted crash rates 2% lower to 16% higher than unadjusted rates, depending on severity level. The findings underscore the importance of adjusting for spatial and temporal confounders in benchmarking analysis, which ultimately contributes to a more equitable benchmark for ADS performance evaluations.
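The spatial adjustment described above can be illustrated with a simple exposure reweighting. This is a minimal sketch, assuming the adjusted benchmark is the human crash rate in each spatial bin weighted by the share of ADS miles driven in that bin; the paper's actual procedure may differ, and the function names, bin labels, and numbers are hypothetical.

```python
def crude_rate(human_crashes, human_vmt):
    """Unadjusted human crash rate: total crashes per total miles."""
    return sum(human_crashes.values()) / sum(human_vmt.values())

def exposure_adjusted_rate(human_crashes, human_vmt, ads_miles):
    """Human crash rate reweighted to match where the ADS actually drives.

    All arguments are dicts keyed by spatial (or time-of-day) bin.
    """
    total_ads_miles = sum(ads_miles.values())
    return sum(
        (miles / total_ads_miles) * human_crashes[b] / human_vmt[b]
        for b, miles in ads_miles.items()
    )
```

For example, if human drivers log 10 crashes per 100 million miles in bin A and 30 per 100 million miles in bin B, the crude rate is 0.2 crashes per million miles, but an ADS driving 75% of its miles in bin B would face an adjusted benchmark of 0.25.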
SegGrasp: Zero-Shot Task-Oriented Grasping via Semantic and Geometric Guided Segmentation
Task-oriented grasping, which involves grasping specific parts of objects based on their functions, is crucial for developing advanced robotic systems capable of performing complex tasks in dynamic environments. In this paper, we propose a training-free framework that incorporates both semantic and geometric priors for zero-shot task-oriented grasp generation. The proposed framework, SegGrasp, first leverages vision-language models such as GLIP for coarse segmentation. It then uses detailed geometric information from convex decomposition to improve segmentation quality through a fusion policy named GeoFusion. With the improved segmentation, a grasping network can generate an effective grasp pose. We conducted experiments on both a segmentation benchmark and real-world robot grasping. The experimental results show that SegGrasp surpasses the baseline by more than 15% in grasp and segmentation performance.
comment: 7 pages, 6 figures
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
Model-based reinforcement learning (RL) offers a solution to the data inefficiency that plagues most model-free RL algorithms. However, learning a robust world model often demands complex and deep architectures, which are expensive to compute and train. Within the world model, dynamics models are particularly crucial for accurate predictions, and various dynamics-model architectures have been explored, each with its own set of challenges. Currently, recurrent neural network (RNN) based world models face issues such as vanishing gradients and difficulty in capturing long-term dependencies effectively. In contrast, use of transformers suffers from the well-known issues of self-attention mechanisms, where both memory and computational complexity scale as $O(n^2)$, with $n$ representing the sequence length. To address these challenges we propose a state space model (SSM) based world model, specifically based on Mamba, that achieves $O(n)$ memory and computational complexity while effectively capturing long-term dependencies and facilitating the use of longer training sequences efficiently. We also introduce a novel sampling method to mitigate the suboptimality caused by an incorrect world model in the early stages of training, combining it with the aforementioned technique to achieve a normalised score comparable to other state-of-the-art model-based RL algorithms using only a 7 million trainable parameter world model. This model is accessible and can be trained on an off-the-shelf laptop. Our code is available at https://github.com/realwenlongwang/drama.git.
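Illustrative sketch (not from the paper): the O(n) claim for state space models follows from their recurrent form, where each step touches only a fixed-size hidden state. The toy scan below uses a scalar state and fixed parameters `a`, `b`, `c`, all of which are assumptions for illustration; Mamba itself uses input-dependent (selective) parameters and a hardware-aware parallel scan.

```python
def ssm_scan(xs, a, b, c, h0=0.0):
    """Run a scalar linear state space model over a sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Time is O(n) in the sequence length and the carried state is O(1),
    in contrast to self-attention, which compares every pair of steps.
    """
    h = h0
    ys = []
    for x in xs:
        h = a * h + b * x  # fold the new input into the running state
        ys.append(c * h)   # read out from the state
    return ys
```

With |a| < 1, an input decays geometrically in the state, so the choice of `a` controls how far back the model can remember.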
Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback
In interactive imitation learning (IL), uncertainty quantification offers a way for the learner (i.e. robot) to contend with distribution shifts encountered during deployment by actively seeking additional feedback from an expert (i.e. human) online. Prior works use mechanisms like ensemble disagreement or Monte Carlo dropout to quantify when black-box IL policies are uncertain; however, these approaches can lead to overconfident estimates when faced with deployment-time distribution shifts. Instead, we contend that we need uncertainty quantification algorithms that can leverage the expert human feedback received during deployment time to adapt the robot's uncertainty online. To tackle this, we draw upon online conformal prediction, a distribution-free method for constructing prediction intervals online given a stream of ground-truth labels. Human labels, however, are intermittent in the interactive IL setting. Thus, from the conformal prediction side, we introduce a novel uncertainty quantification algorithm called intermittent quantile tracking (IQT) that leverages a probabilistic model of intermittent labels, maintains asymptotic coverage guarantees, and empirically achieves desired coverage levels. From the interactive IL side, we develop ConformalDAgger, a new approach wherein the robot uses prediction intervals calibrated by IQT as a reliable measure of deployment-time uncertainty to actively query for more expert feedback. We compare ConformalDAgger to prior uncertainty-aware DAgger methods in scenarios where the distribution shift is (and isn't) present because of changes in the expert's policy. We find that in simulated and hardware deployments on a 7DOF robotic manipulator, ConformalDAgger detects high uncertainty when the expert shifts and increases the number of interventions compared to baselines, allowing the robot to more quickly learn the new behavior.
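Illustrative sketch (not the authors' implementation): online quantile tracking adjusts an interval half-width `q` after each labeled step, growing it when the label fell outside the interval and shrinking it otherwise. To mimic intermittent feedback, the sketch skips the update when no label arrives and rescales the step by an assumed observation probability `p_obs`. That inverse-probability weighting, the step size `lr`, and the function names are assumptions for illustration only.

```python
def quantile_tracking_step(q, score, alpha, lr, observed, p_obs):
    """One online update of the interval half-width q.

    score    : nonconformity score, e.g. |expert action - predicted action|
    alpha    : target miscoverage rate (0.1 aims for 90% coverage)
    lr       : step size
    observed : True if the expert provided a label at this step
    p_obs    : assumed probability that a label is observed
    """
    if not observed:
        return q  # no feedback at this step, keep the current threshold
    err = 1.0 if score > q else 0.0  # 1 when the interval missed the label
    # Assumption of this sketch: weight the step by 1 / p_obs so that
    # rare labels move the threshold proportionally more.
    return q + (lr / p_obs) * (err - alpha)


def track(scores, observed_flags, alpha=0.1, lr=0.05, p_obs=0.5, q0=0.0):
    """Run the update over a stream and return the final half-width."""
    q = q0
    for score, observed in zip(scores, observed_flags):
        q = quantile_tracking_step(q, score, alpha, lr, observed, p_obs)
    return q
```

A robot could compare `q` against a fixed threshold and ask the expert for help when the interval grows too wide, which is the kind of query rule the abstract describes for ConformalDAgger.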
Learning Spatial Bimanual Action Models Based on Affordance Regions and Human Demonstrations
In this paper, we present a novel approach for learning bimanual manipulation actions from human demonstration by extracting spatial constraints between affordance regions, termed affordance constraints, of the objects involved. Affordance regions are defined as object parts that provide interaction possibilities to an agent. For example, the bottom of a bottle affords the object to be placed on a surface, while its spout affords the contained liquid to be poured. We propose a novel approach to learn changes of affordance constraints in human demonstration to construct spatial bimanual action models representing object interactions. To exploit the information encoded in these spatial bimanual action models, we formulate an optimization problem to determine optimal object configurations across multiple execution keypoints while taking into account the initial scene, the learned affordance constraints, and the robot's kinematics. We evaluate the approach in simulation with two example tasks (pouring drinks and rolling dough) and compare three different definitions of affordance constraints: (i) component-wise distances between affordance regions in Cartesian space, (ii) component-wise distances between affordance regions in cylindrical space, and (iii) degrees of satisfaction of manually defined symbolic spatial affordance constraints.
comment: 8 pages, accepted for publication at Humanoids 2024 - This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
SOLD: Reinforcement Learning with Slot Object-Centric Latent Dynamics
Learning a latent dynamics model provides a task-agnostic representation of an agent's understanding of its environment. Leveraging this knowledge for model-based reinforcement learning holds the potential to improve sample efficiency over model-free methods by learning inside imagined rollouts. Furthermore, because the latent space serves as input to behavior models, the informative representations learned by the world model facilitate efficient learning of desired skills. Most existing methods rely on holistic representations of the environment's state. In contrast, humans reason about objects and their interactions, forecasting how actions will affect specific parts of their surroundings. Inspired by this, we propose Slot-Attention for Object-centric Latent Dynamics (SOLD), a novel algorithm that learns object-centric dynamics models in an unsupervised manner from pixel inputs. We demonstrate that the structured latent space not only improves model interpretability but also provides a valuable input space for behavior models to reason over. Our results show that SOLD outperforms DreamerV3, a state-of-the-art model-based RL algorithm, across a range of benchmark robotic environments that evaluate for both relational reasoning and low-level manipulation capabilities. Videos are available at https://slot-latent-dynamics.github.io/.
DCNet: A Data-Driven Framework for DVL
Autonomous underwater vehicles (AUVs) are underwater robotic platforms used in a variety of applications. An AUV's navigation solution relies heavily on the fusion of inertial sensors and Doppler velocity logs (DVL), where the latter delivers accurate velocity updates. To ensure accurate navigation, a DVL calibration is undertaken before the mission begins to estimate its error terms. During calibration, the AUV follows a complex trajectory and employs nonlinear estimation filters to estimate error terms. In this paper, we introduce DCNet, a data-driven framework that utilizes a two-dimensional convolution kernel in an innovative way. Using DCNet and our proposed DVL error model, we offer a rapid calibration procedure. This can be applied to a trajectory with a nearly constant velocity. To train and test our proposed approach, a 276-minute dataset of real recorded DVL measurements was used. We demonstrated an average improvement of 70% in accuracy and 80% improvement in calibration time, compared to the baseline approach, with a low-performance DVL. As a result of those improvements, an AUV employing a low-cost DVL can achieve higher accuracy, shorter calibration time, and apply a simple nearly constant velocity calibration trajectory. Our results also open up new applications for marine robotics utilizing low-cost, high-accuracy DVLs.
comment: 10 Pages, 9 Figures, 5 Tables
MEMROC: Multi-Eye to Mobile RObot Calibration
This paper presents MEMROC (Multi-Eye to Mobile RObot Calibration), a novel motion-based calibration method that simplifies the process of accurately calibrating multiple cameras relative to a mobile robot's reference frame. MEMROC utilizes a known calibration pattern to facilitate accurate calibration with a lower number of images during the optimization process. Additionally, it leverages robust ground plane detection for comprehensive 6-DoF extrinsic calibration, overcoming a critical limitation of many existing methods that struggle to estimate the complete camera pose. The proposed method addresses the need for frequent recalibration in dynamic environments, where cameras may shift slightly or alter their positions due to daily usage, operational adjustments, or vibrations from mobile robot movements. MEMROC exhibits remarkable robustness to noisy odometry data, requiring minimal calibration input data. This combination makes it highly suitable for daily operations involving mobile robots. A comprehensive set of experiments on both synthetic and real data proves MEMROC's efficiency, surpassing existing state-of-the-art methods in terms of accuracy, robustness, and ease of use. To facilitate further research, we have made our code publicly available at https://github.com/davidea97/MEMROC.git.
VLM See, Robot Do: Human Demo Video to Robot Action Plan via Vision Language Model
Vision Language Models (VLMs) have recently been adopted in robotics for their capability in common sense reasoning and generalizability. Existing work has applied VLMs to generate task and motion plans from natural language instructions and to simulate training data for robot learning. In this work, we explore using a VLM to interpret human demonstration videos and generate robot task plans. Our method integrates keyframe selection, visual perception, and VLM reasoning into a pipeline. We named it SeeDo because it enables the VLM to "see" human demonstrations and explain the corresponding plans to the robot for it to "do". To validate our approach, we collected a set of long-horizon human videos demonstrating pick-and-place tasks in three diverse categories and designed a set of metrics to comprehensively benchmark SeeDo against several baselines, including state-of-the-art video-input VLMs. The experiments demonstrate SeeDo's superior performance. We further deployed the generated task plans in both a simulation environment and on a real robot arm.
Hybrid Filtering Heuristic for the Sensor-Placement Problem to Discretize 2D Continuous Environments
This paper addresses the sensor-placement problem (SPP) within the context of discretizing large, complex continuous 2D environments into graphs for efficient task-oriented route planning. The SPP aims to minimize the number of sensors required to achieve a user-defined coverage ratio while considering a general visibility model. We propose the hybrid filtering heuristic (HFH) framework, which enhances or combines outputs of existing sensor-placement methods, incorporating a filtering step. This step eliminates redundant sensors or those contributing marginally to the coverage, ensuring the coverage ratio remains within the desired interval. We implement two versions of HFH: the basic version and a variant, HFHB, incorporating a preprocessing technique known as bucketing to accelerate region clipping. We evaluate HFH and HFHB on a dataset of large, complex polygonal environments, comparing them to several baseline methods under both unlimited and limited-range omnidirectional visibility models. The results demonstrate that HFH and HFHB outperform baselines in terms of the number of sensors required to achieve the desired coverage ratio. Additionally, HFHB significantly reduces the runtime of more competitive baseline methods. We also adapt HFHB to a visibility model with localization uncertainty, demonstrating its effectiveness up to a certain level of uncertainty.
comment: 16 pages, 33 figures (including subfigures); submitted to the IEEE Transactions on Robotics (T-RO); associated repository: https://github.com/janmikulacz/spp
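Illustrative sketch (not the authors' code): the filtering step described in the abstract can be pictured as dropping sensors one at a time as long as the coverage ratio stays at or above the target. The sketch assumes the environment has already been discretized into cells and that each sensor's visible cells are known as a set; the paper works with polygonal regions and clipping instead, and the removal order and function names here are assumptions.

```python
def coverage_ratio(sensors, coverage, n_cells):
    """Fraction of cells seen by at least one of the given sensors."""
    covered = set()
    for s in sensors:
        covered |= coverage[s]
    return len(covered) / n_cells


def filter_sensors(sensors, coverage, n_cells, target):
    """Greedily remove sensors whose removal keeps coverage >= target.

    coverage : dict mapping sensor id -> set of visible cell ids
    target   : required coverage ratio in [0, 1]
    """
    kept = list(sensors)
    # Assumption: try the smallest footprints first, since they are the
    # most likely to be redundant or to contribute only marginally.
    for s in sorted(sensors, key=lambda s: len(coverage[s])):
        trial = [k for k in kept if k != s]
        if coverage_ratio(trial, coverage, n_cells) >= target:
            kept = trial
    return kept
```

Because the input sensor list can come from any existing placement method, or from several combined, a filter of this kind fits the "enhances or combines outputs" role the abstract assigns to HFH.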
Optimizing NeRF-based SLAM with Trajectory Smoothness Constraints
The joint optimization of Neural Radiance Fields (NeRF) and camera trajectories has been widely applied in SLAM tasks due to its superior dense mapping quality and consistency. NeRF-based SLAM learns camera poses using constraints from the implicit map representation. A widely observed phenomenon that results from constraints of this form is jerky and physically unrealistic estimated camera motion, which in turn affects the map quality. To address this deficiency of current NeRF-based SLAM, we propose in this paper TS-SLAM (TS for Trajectory Smoothness). It introduces smoothness constraints on camera trajectories by representing them with uniform cubic B-splines, whose continuous acceleration guarantees smooth camera motion. Benefiting from the differentiability and local control properties of B-splines, TS-SLAM can incrementally learn the control points end-to-end using a sliding window paradigm. Additionally, we regularize camera trajectories by exploiting a dynamics prior to further smooth the trajectories. Experimental results demonstrate that TS-SLAM achieves superior trajectory accuracy and improves mapping quality versus NeRF-based SLAM that does not employ the above smoothness constraints.
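Illustrative sketch (not the authors' code): a uniform cubic B-spline evaluates each trajectory segment from four consecutive control points using fixed polynomial weights, which is where the continuous acceleration and local control mentioned in the abstract come from. The sketch covers positions only (rotations need a separate treatment), and the function names are assumptions.

```python
def bspline_weights(u):
    """Uniform cubic B-spline basis weights for local parameter u in [0, 1]."""
    return (
        (1 - u) ** 3 / 6.0,
        (3 * u**3 - 6 * u**2 + 4) / 6.0,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
        u**3 / 6.0,
    )


def bspline_point(ctrl, i, u):
    """Position on segment i, which depends only on ctrl[i] .. ctrl[i + 3]."""
    w = bspline_weights(u)
    dim = len(ctrl[0])
    return tuple(
        sum(w[k] * ctrl[i + k][d] for k in range(4)) for d in range(dim)
    )
```

Since a segment reads only four control points, moving one control point changes at most four neighboring segments, which is what makes a sliding-window optimization over recent control points practical.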
TřiVis: Versatile, Reliable, and High-Performance Tool for Computing Visibility in Polygonal Environments IROS
Visibility is a fundamental concept in computational geometry, with numerous applications in robotics, surveillance systems, video games, and other fields. This software paper presents TřiVis, a C++ library developed by the authors for computing numerous visibility-related queries in highly complex polygonal environments. Adapting the triangular expansion algorithm (TEA), TřiVis stands out as a versatile, high-performance, more reliable and easy-to-use alternative to current solutions that is also free of heavy dependencies. Through evaluation on a challenging dataset, TřiVis has been benchmarked against existing visibility libraries. The results demonstrate that TřiVis outperforms the competing solutions by at least an order of magnitude in query times, while exhibiting more reliable runtime behavior. TřiVis is freely available for private, research, and institutional use at https://github.com/janmikulacz/trivis.
comment: 8 pages, 12 figures (including subfigures); submitted to the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); associated repository: https://github.com/janmikulacz/trivis
Bio-inspired reconfigurable stereo vision for robotics using omnidirectional cameras ICRA 2025
This work introduces a novel bio-inspired reconfigurable stereo vision system for robotics, leveraging omnidirectional cameras and a novel algorithm to achieve flexible visual capabilities. Inspired by the adaptive vision of various species, our visual system addresses traditional stereo vision limitations, i.e., immutable camera alignment with narrow fields of view, by introducing a reconfigurable stereo vision system to robotics. Our key innovations include the reconfigurable stereo vision strategy that allows dynamic camera alignment, a robust depth measurement system utilizing a nonrectified geometrical method combined with a deep neural network for feature matching, and a geometrical compensation technique to enhance visual accuracy. Implemented on a metamorphic robot, this vision system demonstrates its great adaptability to various scenarios by switching its configurations of 316° monocular with 79° binocular field for fast target seeking and 242° monocular with 150° binocular field for detailed close inspection.
comment: 7 pages, 8 figures, submitted to IEEE ICRA 2025
SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction
Predicting the future motion of surrounding agents is essential for autonomous vehicles (AVs) to operate safely in dynamic, human-robot-mixed environments. However, the scarcity of large-scale driving datasets has hindered the development of robust and generalizable motion prediction models, limiting their ability to capture complex interactions and road geometries. Inspired by recent advances in natural language processing (NLP) and computer vision (CV), self-supervised learning (SSL) has gained significant attention in the motion prediction community for learning rich and transferable scene representations. Nonetheless, existing pre-training methods for motion prediction have largely focused on specific model architectures and a single dataset, limiting their scalability and generalizability. To address these challenges, we propose SmartPretrain, a general and scalable SSL framework for motion prediction that is both model-agnostic and dataset-agnostic. Our approach integrates contrastive and reconstructive SSL, leveraging the strengths of both generative and discriminative paradigms to effectively represent spatiotemporal evolution and interactions without imposing architectural constraints. Additionally, SmartPretrain employs a dataset-agnostic scenario sampling strategy that integrates multiple datasets, enhancing data volume, diversity, and robustness. Extensive experiments on multiple datasets demonstrate that SmartPretrain consistently improves the performance of state-of-the-art prediction models across datasets, data splits and main metrics. For instance, SmartPretrain significantly reduces the MissRate of Forecast-MAE by 10.6%. These results highlight SmartPretrain's effectiveness as a unified, scalable solution for motion prediction, breaking free from the limitations of the small-data regime. Code is available at https://github.com/youngzhou1999/SmartPretrain
comment: 11 pages, 5 figures
FRASA: An End-to-End Reinforcement Learning Agent for Fall Recovery and Stand Up of Humanoid Robots
Humanoid robotics faces significant challenges in achieving stable locomotion and recovering from falls in dynamic environments. Traditional methods, such as Model Predictive Control (MPC) and Key Frame Based (KFB) routines, either require extensive fine-tuning or lack real-time adaptability. This paper introduces FRASA, a Deep Reinforcement Learning (DRL) agent that integrates fall recovery and stand up strategies into a unified framework. Leveraging the Cross-Q algorithm, FRASA significantly reduces training time and offers a versatile recovery strategy that adapts to unpredictable disturbances. Comparative tests on Sigmaban humanoid robots demonstrate FRASA's superior performance against the KFB method deployed at RoboCup 2023 by the Rhoban Team, world champion of the KidSize League.
From gymnastics to virtual nonholonomic constraints: energy injection, dissipation, and regulation for the acrobot
In this article we study virtual nonholonomic constraints, which are relations between the generalized coordinates and momenta of a mechanical system that can be enforced via feedback control. We design a constraint which emulates gymnastics giant motion in an acrobot, and prove that this constraint can inject or dissipate energy based on the sign of a design parameter. The proposed constraint is tested both in simulation and experimentally on a real-world acrobot, demonstrating highly effective energy regulation properties and robustness to a variety of disturbances.
Extended Friction Models for the Physics Simulation of Servo Actuators
Accurate physical simulation is crucial for the development and validation of control algorithms in robotic systems. Recent works in Reinforcement Learning (RL) notably take advantage of extensive simulations to produce efficient robot control. State-of-the-art servo actuator models generally fail to capture the complex friction dynamics of these systems. This limits the transferability of simulated behaviors to real-world applications. In this work, we present extended friction models that allow more accurate simulation of servo actuator dynamics. We propose a comprehensive analysis of various friction models, present a method for identifying model parameters using recorded trajectories from a pendulum test bench, and demonstrate how these models can be integrated into physics engines. The proposed friction models are validated on four distinct servo actuators and tested on 2R manipulators, showing significant improvements in accuracy over the standard Coulomb-Viscous model. Our results highlight the importance of considering advanced friction effects in the simulation of servo actuators to enhance the realism and reliability of robotic simulations.
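Illustrative sketch (not from the paper): the Coulomb-Viscous baseline named in the abstract models friction torque as a constant term opposing motion plus a term proportional to velocity. The second function adds a Stribeck-style term as one common example of an "extended" model; the abstract does not say which extensions the authors use, so that choice, the parameter names, and the zero-velocity handling are all assumptions.

```python
import math


def coulomb_viscous(omega, tau_c, b):
    """Friction torque opposing motion: Coulomb level tau_c plus viscous b * omega."""
    if omega == 0.0:
        return 0.0  # simplification: static friction is not modeled here
    return -(math.copysign(tau_c, omega) + b * omega)


def stribeck(omega, tau_c, tau_s, omega_s, b):
    """Coulomb-Viscous friction with a Stribeck term (an assumed extension).

    The dry-friction level decays from the static value tau_s at low speed
    to the Coulomb value tau_c at high speed, with velocity scale omega_s.
    """
    if omega == 0.0:
        return 0.0
    level = tau_c + (tau_s - tau_c) * math.exp(-((omega / omega_s) ** 2))
    return -(math.copysign(level, omega) + b * omega)
```

At high speed the two functions agree, so any difference between them shows up mainly near velocity reversals, which is where simple servo models tend to be least accurate.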
Making a Mess and Getting Away with it: Traveling Salesperson Problem with Circle Placement for Dubins Vehicles
This paper explores a variation of the Traveling Salesperson Problem, where the agent places a circular obstacle next to each node once it visits it. Referred to as the Traveling Salesperson Problem with Circle Placement (TSP-CP), the aim is to maximize the obstacle radius for which a valid closed tour exists and then minimize the tour cost. The TSP-CP finds relevance in various real-world applications, such as harvesting, quarrying, and open-pit mining. We propose several novel solvers to address the TSP-CP, its variant tailored for Dubins vehicles, and a crucial subproblem known as the Traveling Salesperson Problem on self-deleting graphs (TSP-SD). Our extensive experimental results show that the proposed solvers outperform the current state-of-the-art on related problems in solution quality.
comment: 8 pages, 7 figures, submitted to IEEE Robotics and Automation Letters
Data-driven Feedback Control of Lattice Structures with Localized Actuation and Sensing
Assembling lattices from discrete building blocks enables the composition of large, heterogeneous, and easily reconfigurable objects with desirable mass-to-stiffness ratios. This type of building system may also be referred to as a digital material, as it is constituted from discrete, error-correcting components. Researchers have demonstrated various active structures and even robotic systems that take advantage of the reconfigurable, mass-efficient properties of discrete lattice structures. However, the existing literature has predominantly used open-loop control strategies, limiting the performance of the presented systems. In this paper, we present a novel approach to feedback control of digital lattice structures, leveraging real-time measurements of the system dynamics. We introduce an actuated voxel which constitutes a novel means for actuation of lattice structures. Our control method is based on the Extended Dynamic Mode Decomposition algorithm in conjunction with the Linear Quadratic Regulator and Koopman Model Predictive Control. The key advantage of our approach lies in its purely data-driven nature, without the need for any prior knowledge of a system's structure. We illustrate the developed method via real experiments with a custom-built flexible lattice beam, showing its ability to accomplish various tasks even with minimal sensing and actuation resources. In particular, we address two problems: stabilization together with disturbance attenuation, and reference tracking.
TactileAR: Active Tactile Pattern Reconstruction ICRA 2024
High-resolution (HR) contact surface information is essential for robotic grasping and precise manipulation tasks. However, it remains a challenge for current taxel-based sensors to obtain HR tactile information. In this paper, we focus on utilizing low-resolution (LR) tactile sensors to reconstruct the localized, dense, and HR representation of contact surfaces. In particular, we build a Gaussian triaxial tactile sensor degradation model and propose a tactile pattern reconstruction framework based on the Kalman filter. This framework enables the reconstruction of 2-D HR contact surface shapes using collected LR tactile sequences. In addition, we present an active exploration strategy to enhance the reconstruction efficiency. We evaluate the proposed method in real-world scenarios with comparison to existing prior-information-based approaches. Experimental results confirm the efficiency of the proposed approach and demonstrate satisfactory reconstructions of complex contact surface shapes. Code: https://github.com/wmtlab/tactileAR
comment: accepted by ICRA 2024
Dual-AEB: Synergizing Rule-Based and Multimodal Large Language Models for Effective Emergency Braking
Automatic Emergency Braking (AEB) systems are a crucial component in ensuring the safety of passengers in autonomous vehicles. Conventional AEB systems primarily rely on closed-set perception modules to recognize traffic conditions and assess collision risks. To enhance the adaptability of AEB systems in open scenarios, we propose Dual-AEB, a system that combines an advanced multimodal large language model (MLLM) for comprehensive scene understanding and a conventional rule-based rapid AEB to ensure quick response times. To the best of our knowledge, Dual-AEB is the first method to incorporate MLLMs within AEB systems. Through extensive experimentation, we have validated the effectiveness of our method. The source code will be available at https://github.com/ChipsICU/Dual-AEB.
Energy-Cautious Designation of Kinematic Parameters for a Sustainable Parallel-Serial Heavy-Duty Manipulator Driven by Electromechanical Linear Actuator
Electrification, a key strategy in combating climate change, is transforming industries, and off-highway machines (OHM) will be next to transition from combustion engines and hydraulic actuation to sustainable fully electrified machines. Electromechanical linear actuators (EMLAs) offer superior efficiency, safety, and reduced maintenance, and they unlock vast potential for high-performance autonomous operations. However, a key challenge lies in optimizing the kinematic parameters of OHMs' on-board manipulators for EMLA integration to exploit the full capabilities of actuation systems and maximize their performance. This work addresses this challenge by delving into the structural optimization of a prevalent closed kinematic chain configuration commonly employed in OHM manipulators. Our approach aims to retain the manipulator's existing capabilities while reducing its energy expenditure, paving the way for a greener future in industrial automation, one in which sustainable and high-performing robotized OHMs can evolve. The feasibility of our methodology is validated through simulation results obtained on a commercially available parallel-serial heavy-duty manipulator mounted on a battery electric vehicle. The results demonstrate the efficacy of our approach in modifying kinematic parameters to facilitate the replacement of conventional hydraulic actuators with EMLAs, all while minimizing the overall energy consumption of the system.
comment: This work is accepted for presentation at IEEE VTC 2024-Washington USA
Enhanced Robot Planning and Perception through Environment Prediction
Mobile robots rely on maps to navigate through an environment. In the absence of any map, the robots must build the map online from partial observations as they move in the environment. Traditional methods build a map using only direct observations. In contrast, humans identify patterns in the observed environment and make informed guesses about what to expect ahead. Modeling these patterns explicitly is difficult due to the complexity of the environments. However, these complex models can be approximated well using learning-based methods in conjunction with large training data. By extracting patterns, robots can use direct observations and predictions of what lies ahead to better navigate an unknown environment. In this dissertation, we present several learning-based methods to equip mobile robots with prediction capabilities for efficient and safer operation. In the first part of the dissertation, we learn to predict using geometrical and structural patterns in the environment. Partially observed maps provide invaluable cues for accurately predicting the unobserved areas. We first demonstrate the capability of general learning-based approaches to model these patterns for a variety of overhead map modalities. Then we employ task-specific learning for faster navigation in indoor environments by predicting 2D occupancy in the nearby regions. This idea is further extended to 3D point cloud representation for object reconstruction. Predicting the shape of the full object from only partial views, our approach paves the way for efficient next-best-view planning. In the second part of the dissertation, we learn to predict using spatiotemporal patterns in the environment. We focus on dynamic tasks such as target tracking and coverage where we seek decentralized coordination between robots. We first show how graph neural networks can be used for more scalable and faster inference.
comment: 289 pages, 81 figures, 16 tables; Dissertation submitted to UMD to fulfill PhD requirement
Decentralized Uncertainty-Aware Active Search with a Team of Aerial Robots
Rapid search and rescue is critical to maximizing survival rates following natural disasters. However, these efforts are challenged by the need to search large disaster zones, lack of reliability in the communications infrastructure, and a priori unknown numbers of objects of interest (OOIs), such as injured survivors. Aerial robots are increasingly being deployed for search and rescue due to their high mobility, but there remains a gap in deploying multi-robot autonomous aerial systems for methodical search of large environments. Prior works have relied on preprogrammed paths from human operators or are evaluated only in simulation. We bridge these gaps in the state of the art by developing and demonstrating a decentralized active search system, which biases its trajectories to take additional views of uncertain OOIs. The methodology leverages stochasticity for rapid coverage in communication-denied scenarios. When communications are available, robots share poses, goals, and OOI information to accelerate the rate of search. Extensive simulations and hardware experiments in Bloomingdale, OH, are conducted to validate the approach. The results demonstrate that the active search approach outperforms greedy coverage-based planning in communication-denied scenarios while maintaining comparable performance in communication-enabled scenarios.
CoHRT: A Collaboration System for Human-Robot Teamwork
Collaborative robots are increasingly deployed alongside humans in factories, hospitals, schools, and other domains to enhance teamwork and efficiency. Systems that seamlessly integrate humans and robots into cohesive teams for coordinated and efficient task execution are needed, enabling studies on how robot collaboration policies affect team performance and teammates' perceived fairness, trust, and safety. Such a system can also be utilized to study the impact of a robot's normative behavior on team collaboration. Additionally, it allows for investigation into how the legibility and predictability of robot actions affect human-robot teamwork and perceived safety and trust. Existing systems are limited, typically involving one human and one robot, and thus require more insight into broader team dynamics. Many rely on games or virtual simulations, neglecting the impact of a robot's physical presence. Most tasks are turn-based, hindering simultaneous execution and affecting efficiency. This paper introduces CoHRT (Collaboration System for Human-Robot Teamwork), which facilitates multi-human-robot teamwork through seamless collaboration, coordination, and communication. CoHRT utilizes a server-client-based architecture, a vision-based system to track task environments, and a simple interface for team action coordination. It allows for the design of tasks considering the human teammates' physical and mental workload and varied skill labels across the team members. We used CoHRT to design a collaborative block manipulation and jigsaw puzzle-solving task in a team of one Franka Emika Panda robot and two humans. The system enables recording multi-modal collaboration data to develop adaptive collaboration policies for robots. To further utilize CoHRT, we outline potential research directions in diverse human-robot collaborative tasks.
comment: 8 Pages, Robotics Science and Systems (RSS), Safety and Normative Behaviors in Human-Robot Interaction Workshop 2024 (accepted), https://sites.google.com/view/safe-hri/accepted-papers
Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning ICRA 2025
Aerial Vision-and-Language Navigation (VLN) is a novel task enabling Unmanned Aerial Vehicles (UAVs) to navigate in outdoor environments through natural language instructions and visual cues. It remains challenging due to the complex spatial relationships in outdoor aerial scenes. In this paper, we propose an end-to-end zero-shot framework for aerial VLN tasks, where the large language model (LLM) is introduced as our agent for action prediction. Specifically, we develop a novel Semantic-Topo-Metric Representation (STMR) to enhance the spatial reasoning ability of LLMs. This is achieved by extracting and projecting instruction-related semantic masks of landmarks into a top-down map that contains the location information of surrounding landmarks. Further, this map is transformed into a matrix representation with distance metrics as the text prompt to the LLM, for action prediction according to the instruction. Experiments conducted in real and simulation environments demonstrate the effectiveness and robustness of our method, achieving 15.9% and 12.5% improvements (absolute) in Oracle Success Rate (OSR) on the AerialVLN-S dataset.
comment: Submitted to ICRA 2025
A Systematic Review of Edge Case Detection in Automated Driving: Methods, Challenges and Future Directions
The rapid development of automated vehicles (AVs) promises to revolutionize transportation by enhancing safety and efficiency. However, ensuring their reliability in diverse real-world conditions remains a significant challenge, particularly due to rare and unexpected situations known as edge cases. Although numerous approaches exist for detecting edge cases, there is a notable lack of a comprehensive survey that systematically reviews these techniques. This paper fills this gap by presenting a practical, hierarchical review and systematic classification of edge case detection and assessment methodologies. Our classification is structured on two levels: first, categorizing detection approaches according to AV modules, including perception-related and trajectory-related edge cases; and second, based on underlying methodologies and theories guiding these techniques. We extend this taxonomy by introducing a new class called "knowledge-driven" approaches, which is largely overlooked in the literature. Additionally, we review the techniques and metrics for the evaluation of edge case detection methods and identified edge cases. To our knowledge, this is the first survey to comprehensively cover edge case detection methods across all AV subsystems, discuss knowledge-driven edge cases, and explore evaluation techniques for detection methods. This structured and multi-faceted analysis aims to facilitate targeted research and modular testing of AVs. Moreover, by identifying the strengths and weaknesses of various approaches and discussing the challenges and future directions, this survey intends to assist AV developers, researchers, and policymakers in enhancing the safety and reliability of automated driving (AD) systems through effective edge case detection.
comment: Preprint submitted to IEEE Transactions on Intelligent Transportation Systems
ARCap: Collecting High-quality Human Demonstrations for Robot Learning with Augmented Reality Feedback ICRA 2025
Recent progress in imitation learning from human demonstrations has shown promising results in teaching robots manipulation skills. To further scale up training datasets, recent works start to use portable data collection devices without the need for physical robot hardware. However, due to the absence of on-robot feedback during data collection, the data quality depends heavily on user expertise, and many devices are limited to specific robot embodiments. We propose ARCap, a portable data collection system that provides visual feedback through augmented reality (AR) and haptic warnings to guide users in collecting high-quality demonstrations. Through extensive user studies, we show that ARCap enables novice users to collect robot-executable data that matches robot kinematics and avoids collisions with the scenes. With data collected from ARCap, robots can perform challenging tasks, such as manipulation in cluttered environments and long-horizon cross-embodiment manipulation. ARCap is fully open-source and easy to calibrate; all components are built from off-the-shelf products. More details and results can be found on our website: https://stanford-tml.github.io/ARCap
comment: 8 pages, 8 Figures, submitted to ICRA 2025
AdvDiffuser: Generating Adversarial Safety-Critical Driving Scenarios via Guided Diffusion
Safety-critical scenarios are infrequent in natural driving environments but hold significant importance for the training and testing of autonomous driving systems. The prevailing approach involves generating safety-critical scenarios automatically in simulation by introducing adversarial adjustments to natural environments. These adjustments are often tailored to specific tested systems, thereby disregarding their transferability across different systems. In this paper, we propose AdvDiffuser, an adversarial framework for generating safety-critical driving scenarios through guided diffusion. By incorporating a diffusion model to capture plausible collective behaviors of background vehicles and a lightweight guide model to effectively handle adversarial scenarios, AdvDiffuser facilitates transferability. Experimental results on the nuScenes dataset demonstrate that AdvDiffuser, trained on offline driving logs, can be applied to various tested systems with minimal warm-up episode data and outperform other existing methods in terms of realism, diversity, and adversarial performance.
Motion Planning for Object Manipulation by Edge-Rolling IROS 2024
A common way to manipulate heavy objects is to maintain at least one point of the object in contact with the environment during the manipulation. When the object has a cylindrical shape or, in general, a curved edge, not only sliding and pivoting motions but also rolling the object along the edge can effectively satisfy this condition. Edge-rolling offers several advantages in terms of efficiency and maneuverability. This paper develops a novel approach for approximating the prehensile edge-rolling motion on any path by a sequence of constant screw displacements, leveraging the principles of screw theory. Based on this approach, we propose an algorithmic method for task-space-based path generation of object manipulation between two given configurations using a sequence of rolling and pivoting motions. The method is based on an optimization algorithm that takes into account the joint limitations of the robot. To validate our approach, we conducted experiments to manipulate a cylinder along linear and curved paths using the Franka Emika Panda manipulator.
comment: 8 pages, Pre-print, Submitted to IROS 2024
EasyHeC++: Fully Automatic Hand-Eye Calibration with Pretrained Image Models IROS 2024
Hand-eye calibration plays a fundamental role in robotics by directly influencing the efficiency of critical operations such as manipulation and grasping. In this work, we present a novel framework, EasyHeC++, designed for fully automatic hand-eye calibration. In contrast to previous methods that necessitate manual calibration, specialized markers, or the training of arm-specific neural networks, our approach is the first system that enables accurate calibration of any robot arm in a marker-free, training-free, and fully automatic manner. Our approach employs a two-step process. First, we initialize the camera pose using a sampling or feature-matching-based method with the aid of pretrained image models. Subsequently, we perform pose optimization through differentiable rendering. Extensive experiments demonstrate the system's superior accuracy in both synthetic and real-world datasets across various robot arms and camera settings. Project page: https://ootts.github.io/easyhec_plus.
comment: Accepted by IROS 2024
Language-Model-Assisted Bi-Level Programming for Reward Learning from Internet Videos
Learning from Demonstrations, particularly from biological experts like humans and animals, often encounters significant data acquisition challenges. While recent approaches leverage internet videos for learning, they require complex, task-specific pipelines to extract and retarget motion data for the agent. In this work, we introduce a language-model-assisted bi-level programming framework that enables a reinforcement learning agent to directly learn its reward from internet videos, bypassing dedicated data preparation. The framework includes two levels: an upper level where a vision-language model (VLM) provides feedback by comparing the learner's behavior with expert videos, and a lower level where a large language model (LLM) translates this feedback into reward updates. The VLM and LLM collaborate within this bi-level framework, using a "chain rule" approach to derive a valid search direction for reward learning. We validate the method for reward learning from YouTube videos, and the results have shown that the proposed method enables efficient reward design from expert videos of biological agents for complex behavior synthesis.
Articulated Animal AI: An Environment for Animal-like Cognition in a Limbed Agent NeurIPS 2024
This paper presents the Articulated Animal AI Environment for Animal Cognition, an enhanced version of the previous AnimalAI Environment. Key improvements include the addition of agent limbs, enabling more complex behaviors and interactions with the environment that closely resemble real animal movements. The testbench features an integrated curriculum training sequence and evaluation tools, eliminating the need for users to develop their own training programs. Additionally, the tests and training procedures are randomized, which will improve the agent's generalization capabilities. These advancements significantly expand upon the original AnimalAI framework and will be used to evaluate agents on various aspects of animal cognition.
comment: 8 pages, accepted to Workshop on Open-World Agents (OWA-2024) at NeurIPS 2024 in Vancouver, Canada
Failure Prediction from Limited Hardware Demonstrations
Prediction of failures in real-world robotic systems requires either accurate model information or extensive testing. Partial knowledge of the system model makes simulation-based failure prediction unreliable. Moreover, obtaining demonstrations from the true system is expensive, and repeated failures during data collection could be risky for the robotic system. This work presents a novel three-step methodology for discovering failures that occur in the true system by using a combination of a limited number of demonstrations from the true system and the failure information processed through sampling-based testing of a model dynamical system. Given a limited budget $N$ of demonstrations from the true system and a model dynamics (with potentially large modeling errors), the proposed methodology comprises a) exhaustive simulations for discovering algorithmic failures using the model dynamics; b) design of initial $N_1$ demonstrations of the true system using Bayesian inference to learn a Gaussian process regression (GPR)-based failure predictor; and c) iterative $N - N_1$ demonstrations of the true system for updating the failure predictor. To illustrate the efficacy of the proposed methodology, we consider: a) the failure discovery for the task of pushing a T block to a fixed target region with a UR3E collaborative robot arm using a diffusion policy; and b) the failure discovery for an F1-Tenth racing car tracking a given raceline under an LQR control policy.
comment: 8 pages, 7 figures
iFANnpp: Nuclear Power Plant Digital Twin for Robots and Autonomous Intelligence
Robotics has gained significant attention in the nuclear industry due to its autonomy and ability to automate tasks. However, the increasing complexity of robots has led to a growing demand for advanced simulation and control methods to predict robot behavior and optimize plant performance. Most existing digital twins only address parts of systems and do not offer an overall design of nuclear power plants. Furthermore, they are often designed for specific algorithms or tasks, making them unsuitable for broader research applications or other potential projects. In response, we propose a comprehensive nuclear power plant digital twin designed to enhance real-time monitoring, operational efficiency, and predictive maintenance. We chose to model a full-scope nuclear power plant in Unreal Engine 5 to incorporate the complexities and various phenomena. The high-resolution simulation environment is integrated with a General Pressurized Water Reactor Simulator, a high-fidelity physics-driven software, to create a realistic flow of the nuclear power plant and a virtual environment that updates in real time. Furthermore, the virtual environment provides various features and a Python bridge for researchers to test custom algorithms and frameworks easily. The digital twin's performance is presented, along with several research ideas that use the implemented features, such as multi-robot task scheduling and robot navigation in the radiation area.
comment: 12 pages, 9 figures
Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models
Traditionally, model-based reinforcement learning (MBRL) methods exploit neural networks as flexible function approximators to represent a priori unknown environment dynamics. However, training data are typically scarce in practice, and these black-box models often fail to generalize. Modeling architectures that leverage known physics can substantially reduce the complexity of system identification, but break down in the face of complex phenomena such as contact. We introduce a novel framework for learning semi-structured dynamics models for contact-rich systems which seamlessly integrates structured first-principles modeling techniques with black-box auto-regressive models. Specifically, we develop an ensemble of probabilistic models to estimate external forces, conditioned on historical observations and actions, and integrate these predictions using known Lagrangian dynamics. With this semi-structured approach, we can make accurate long-horizon predictions with substantially less data than prior methods. We leverage this capability and propose Semi-Structured Reinforcement Learning (SSRL), a simple model-based learning framework which pushes the sample complexity boundary for real-world learning. We validate our approach on a real-world Unitree Go1 quadruped robot, learning dynamic gaits -- from scratch -- on both hard and soft surfaces with just a few minutes of real-world data. Video and code are available at: https://sites.google.com/utexas.edu/ssrl
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
In this paper, we introduce SPA, a novel representation learning framework that emphasizes the importance of 3D spatial awareness in embodied AI. Our approach leverages differentiable neural rendering on multi-view images to endow a vanilla Vision Transformer (ViT) with intrinsic spatial understanding. We present the most comprehensive evaluation of embodied representation learning to date, covering 268 tasks across 8 simulators with diverse policies in both single-task and language-conditioned multi-task scenarios. The results are compelling: SPA consistently outperforms more than 10 state-of-the-art representation methods, including those specifically designed for embodied AI, vision-centric tasks, and multi-modal applications, while using less training data. Furthermore, we conduct a series of real-world experiments to confirm its effectiveness in practical scenarios. These results highlight the critical role of 3D spatial awareness for embodied representation learning. Our strongest model takes more than 6000 GPU hours to train and we are committed to open-sourcing all code and model weights to foster future research in embodied representation learning. Project Page: https://haoyizhu.github.io/spa/.
comment: Project Page: https://haoyizhu.github.io/spa/
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
The increasing demand for versatile robotic systems to operate in diverse and dynamic environments has emphasized the importance of a generalist policy, which leverages a large cross-embodiment data corpus to facilitate broad adaptability and high-level reasoning. However, the generalist struggles with inefficient inference and costly training. The specialist policy, instead, is curated for specific domain data and excels at task-level precision with efficiency. Yet, it lacks the generalization capacity for a wide range of applications. Inspired by these observations, we introduce RoboDual, a synergistic dual-system that combines the merits of both generalist and specialist policies. A diffusion transformer-based specialist is devised for multi-step action rollouts, conditioned on the high-level task understanding and discretized action output of a vision-language-action (VLA) based generalist. Compared to OpenVLA, RoboDual achieves a 26.7% improvement in real-world settings and a 12% gain on CALVIN by introducing a specialist policy with merely 20M trainable parameters. It maintains strong performance with only 5% of the demonstration data, and enables a 3.8 times higher control frequency in real-world deployment. Code will be made publicly available. Our project page is hosted at: https://opendrivelab.com/RoboDual/
comment: Project page: https://opendrivelab.com/RoboDual/
From CAD to URDF: Co-Design of a Jet-Powered Humanoid Robot Including CAD Geometry IROS 2024
Co-design optimization strategies usually rely on simplified robot models extracted from CAD. While these models are useful for optimizing geometrical and inertial parameters for robot control, they might overlook important details essential for prototyping the optimized mechanical design. For instance, they may not account for mechanical stresses exerted on the optimized geometries and the complexity of assembly-level design. In this paper, we introduce a co-design framework aimed at improving both the control performance and mechanical design of our robot. Specifically, we identify the robot links that significantly influence control performance. The geometric characteristics of these links are parameterized and optimized using a multi-objective evolutionary algorithm to achieve optimal control performance. Additionally, an automated Finite Element Method (FEM) analysis is integrated into the framework to filter solutions not satisfying the required structural safety margin. We validate the framework by applying it to enhance the mechanical design for flight performance of the jet-powered humanoid robot iRonCub.
comment: IROS 2024
Mastering Contact-rich Tasks by Combining Soft and Rigid Robotics with Imitation Learning
Soft robots have the potential to revolutionize the use of robotic systems with their capability of establishing safe, robust, and adaptable interactions with their environment, but their precise control remains challenging. In contrast, traditional rigid robots offer high accuracy and repeatability but lack the flexibility of soft robots. We argue that combining these characteristics in a hybrid robotic platform can significantly enhance overall capabilities. This work presents a novel hybrid robotic platform that integrates a rigid manipulator with a fully developed soft arm. This system is equipped with the intelligence necessary to autonomously perform flexible and generalizable tasks through imitation learning. The physical softness and machine learning enable our platform to achieve highly generalizable skills, while the rigid components ensure precision and repeatability.
comment: Corrected missing citation
Lean Methodology for Garment Modernization
This article presents the lean methodology for modernizing garment manufacturing, focusing on lean thinking, lean practices, automation development, VSM, and CRP, and how to integrate them effectively. While isolated automation of specific operations can improve efficiency and reduce cycle time, it does not necessarily enhance overall garment output and efficiency. To achieve these broader improvements, it is essential to consider the entire production line and process using VSM and CRP to optimize production and center balance. This approach can increase efficiency and reduce manufacturing costs, labor time, and lead time, ultimately adding value to the company and factory.
comment: 11 pages, 7 figures
Simplified POMDP Planning with an Alternative Observation Space and Formal Performance Guarantees
Online planning under uncertainty in partially observable domains is an essential capability in robotics and AI. The partially observable Markov decision process (POMDP) is a mathematically principled framework for addressing decision-making problems in this challenging setting. However, finding an optimal solution for POMDPs is computationally expensive and is feasible only for small problems. In this work, we contribute a novel method to simplify POMDPs by switching to an alternative, more compact observation space and a simplified model to speed up planning with formal performance guarantees. We introduce the notion of belief tree topology, which encodes the levels and branches in the tree that use the original and alternative observation space and models. Each belief tree topology comes with its own policy space and planning performance. Our key contribution is to derive bounds between the optimal Q-function of the original POMDP and the simplified tree defined by a given topology with a corresponding simplified policy space. These bounds are then used as an adaptation mechanism between different tree topologies until the optimal action of the original POMDP can be determined. Further, we consider a specific instantiation of our framework, where the alternative observation space and model correspond to a setting where the state is fully observable. We evaluate our approach in simulation, considering exact and approximate POMDP solvers and demonstrating a significant speedup while preserving solution quality. We believe this work opens new exciting avenues for online POMDP planning with formal performance guarantees.
comment: Accepted to ISRR 2024
ForceMimic: Force-Centric Imitation Learning with Force-Motion Capture System for Contact-Rich Manipulation ICRA 2025
In most contact-rich manipulation tasks, humans apply time-varying forces to the target object, compensating for inaccuracies in the vision-guided hand trajectory. However, current robot learning algorithms primarily focus on trajectory-based policies, with limited attention given to learning force-related skills. To address this limitation, we introduce ForceMimic, a force-centric robot learning system, providing a natural, force-aware and robot-free robotic demonstration collection system, along with a hybrid force-motion imitation learning algorithm for robust contact-rich manipulation. Using the proposed ForceCapture system, an operator can peel a zucchini in 5 minutes, while force-feedback teleoperation takes over 13 minutes and struggles with task completion. With the collected data, we propose HybridIL to train a force-centric imitation learning model, equipped with a hybrid force-position control primitive to fit the predicted wrench-position parameters during robot execution. Experiments demonstrate that our approach enables the model to learn a more robust policy for the contact-rich task of vegetable peeling, with a 54.5% relative increase in success rate over state-of-the-art pure-vision-based imitation learning. Hardware, code, data and more results will be open-sourced on the project website at https://forcemimic.github.io.
comment: 8 pages, 7 figures, submitted to ICRA 2025, project website at https://forcemimic.github.io
NeRF-Accelerated Ecological Monitoring in Mixed-Evergreen Redwood Forest
Forest mapping provides critical observational data needed to understand the dynamics of forest environments. Notably, tree diameter at breast height (DBH) is a metric used to estimate forest biomass and carbon dioxide sequestration. Manual methods of forest mapping are labor intensive and time consuming, a bottleneck for large-scale mapping efforts. Automated mapping relies on acquiring dense forest reconstructions, typically in the form of point clouds. Terrestrial laser scanning (TLS) and mobile laser scanning (MLS) generate point clouds using expensive LiDAR sensing, and have been used successfully to estimate tree diameter. Neural radiance fields (NeRFs) are an emergent technology enabling photorealistic, vision-based reconstruction by training a neural network on a sparse set of input views. In this paper, we present a comparison of MLS and NeRF forest reconstructions for the purpose of trunk diameter estimation in a mixed-evergreen Redwood forest. In addition, we propose an improved DBH-estimation method using convex-hull modeling. Using this approach, we achieved 1.68 cm RMSE, which consistently outperformed standard cylinder modeling approaches. Our code contributions and forest datasets are freely available at https://github.com/harelab-ucsc/RedwoodNeRF.
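A minimal sketch of how a convex-hull diameter estimate might work, for readers unfamiliar with the idea. The abstract above does not describe the procedure, so everything here is an assumption for illustration: the trunk is reduced to a 2D slice of points at breast height, the hull is built with Andrew's monotone chain, and the diameter is taken as hull perimeter divided by pi. The names `convex_hull` and `hull_diameter` are hypothetical and not taken from the RedwoodNeRF repository.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each chain because it repeats the start of the other
    return lower[:-1] + upper[:-1]

def hull_diameter(points):
    """Assumed estimator: diameter of the circle whose circumference equals the hull perimeter."""
    hull = convex_hull(points)
    perimeter = sum(
        math.dist(hull[i], hull[(i + 1) % len(hull)]) for i in range(len(hull))
    )
    return perimeter / math.pi

# Synthetic slice: a trunk of radius 0.25 m plus a few interior points
slice_points = [
    (0.25 * math.cos(2 * math.pi * k / 360), 0.25 * math.sin(2 * math.pi * k / 360))
    for k in range(360)
] + [(0.0, 0.0), (0.05, -0.02), (-0.1, 0.08)]
print(f"estimated DBH: {hull_diameter(slice_points):.3f} m")
```

Unlike a fitted cylinder, a hull-based estimate of this kind does not assume a circular cross-section, which may matter for irregular trunks.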
Concurrent-Learning Based Relative Localization in Shape Formation of Robot Swarms
In this paper, we address the shape formation problem for massive robot swarms in environments where external localization systems are unavailable. Achieving this task effectively with solely onboard measurements is still scarcely explored and faces some practical challenges. To solve this challenging problem, we make the following contributions. Firstly, to estimate the relative positions among neighboring robots, a concurrent-learning based estimator is proposed. It relaxes the persistent excitation condition required by classical estimators such as the least-squares estimator. Secondly, we introduce a finite-time agreement protocol to determine the shape location. This is achieved by estimating the relative position between each robot and a randomly assigned seed robot. The initial position of the seed robot marks the shape location. Thirdly, based on the theoretical results of the relative localization, a novel behavior-based control strategy is devised. This strategy not only enables adaptive shape formation of large groups of robots but also enhances the observability of inter-robot relative localization. Numerical simulation results are provided to verify the performance of our proposed strategy compared to state-of-the-art ones. Additionally, outdoor experiments on real robots further demonstrate the practical effectiveness and robustness of our methods.
xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing
Reusing pre-collected data from different domains is an appealing solution for decision-making tasks that have insufficient data in the target domain but relatively abundant data in other related domains. Existing cross-domain policy transfer methods mostly aim at learning domain correspondences or corrections to facilitate policy learning, such as learning domain/task-specific discriminators, representations, or policies. This design philosophy often results in heavy model architectures or task/domain-specific modeling, lacking flexibility. This reality makes us wonder: can we directly bridge the domain gaps universally at the data level, instead of relying on complex downstream cross-domain policy transfer models? In this study, we propose the Cross-Domain Trajectory EDiting (xTED) framework that employs a specially designed diffusion model for cross-domain trajectory adaptation. Our proposed model architecture effectively captures the intricate dependencies among states, actions, and rewards, as well as the dynamics patterns within target data. By utilizing the pre-trained diffusion model as a prior, source domain trajectories can be transformed to match target domain properties while preserving original semantic information. This process implicitly corrects underlying domain gaps, enhancing state realism and dynamics reliability in the source data, and allowing flexible incorporation with various downstream policy learning methods. Despite its simplicity, xTED demonstrates superior performance in extensive simulation and real-robot experiments.
comment: xTED offers a novel, generic, flexible, simple and effective paradigm that casts cross-domain policy adaptation as a data pre-processing problem
Scaling Instructable Agents Across Many Simulated Worlds
Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order to accomplish complex tasks. The Scalable, Instructable, Multiworld Agent (SIMA) project tackles this by training agents to follow free-form instructions across a diverse range of virtual 3D environments, including curated research environments as well as open-ended, commercial video games. Our goal is to develop an instructable agent that can accomplish anything a human can do in any simulated 3D environment. Our approach focuses on language-driven generality while imposing minimal assumptions. Our agents interact with environments in real-time using a generic, human-like interface: the inputs are image observations and language instructions and the outputs are keyboard-and-mouse actions. This general approach is challenging, but it allows agents to ground language across many visually complex and semantically rich environments while also allowing us to readily run agents in new environments. In this paper we describe our motivation and goal, the initial progress we have made, and promising preliminary results on several diverse research environments and a variety of commercial video games.
Streaming Diffusion Policy: Fast Policy Synthesis with Variable Noise Diffusion Models
Diffusion models have seen rapid adoption in robotic imitation learning, enabling autonomous execution of complex dexterous tasks. However, action synthesis is often slow, requiring many steps of iterative denoising, limiting the extent to which models can be used in tasks that require fast reactive policies. To sidestep this, recent works have explored how the distillation of the diffusion process can be used to accelerate policy synthesis. However, distillation is computationally expensive and can hurt both the accuracy and diversity of synthesized actions. We propose SDP (Streaming Diffusion Policy), an alternative method to accelerate policy synthesis, leveraging the insight that generating a partially denoised action trajectory is substantially faster than a full output action trajectory. At each observation, our approach outputs a partially denoised action trajectory with variable levels of noise corruption, where the immediate action to execute is noise-free, with subsequent actions having increasing levels of noise and uncertainty. The partially denoised action trajectory for a new observation can then be quickly generated by applying a few steps of denoising to the previously predicted noisy action trajectory (rolled over by one timestep). We illustrate the efficacy of this approach, dramatically speeding up policy synthesis while preserving performance across both simulated and real-world settings.
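The rolling buffer described in this abstract can be illustrated with a toy sketch. It is not the SDP implementation: actions are scalars, the noise schedule is assumed linear, and `toy_denoise` is a hypothetical stand-in for the learned denoiser that has direct access to the clean action. All function names are invented for illustration.

```python
import random

def noise_levels(horizon):
    """Assumed linear schedule: 0.0 for the immediate action, 1.0 for the last slot."""
    return [i / (horizon - 1) for i in range(horizon)]

def toy_denoise(action, level_from, level_to, clean):
    """Stand-in for a learned denoiser: shrink the residual noise from level_from to level_to."""
    return clean + (level_to / level_from) * (action - clean)

def init_buffer(horizon, clean_fn, rng):
    """buffer[i] is the action for time i, corrupted at noise level i / (horizon - 1)."""
    return [
        clean_fn(t) + level * rng.gauss(0.0, 1.0)
        for t, level in enumerate(noise_levels(horizon))
    ]

def stream_step(buffer, t, clean_fn, rng):
    """Execute the noise-free head, roll the buffer by one step, and partially denoise.

    On entry buffer[i] holds the action for time t + i at level levels[i].
    Returns the executed action and the buffer for time t + 1.
    """
    horizon = len(buffer)
    levels = noise_levels(horizon)
    executed = buffer[0]
    new_buffer = [
        # Each remaining action moves one slot forward and one noise level down
        toy_denoise(buffer[i + 1], levels[i + 1], levels[i], clean_fn(t + 1 + i))
        for i in range(horizon - 1)
    ]
    # The new tail slot starts at the highest noise level
    new_buffer.append(clean_fn(t + horizon) + levels[-1] * rng.gauss(0.0, 1.0))
    return executed, new_buffer

rng = random.Random(0)
clean_fn = lambda t: 0.1 * t  # hypothetical clean action sequence
buffer = init_buffer(5, clean_fn, rng)
executed_actions = []
for t in range(3):
    action, buffer = stream_step(buffer, t, clean_fn, rng)
    executed_actions.append(action)
```

Each call applies a single denoising pass per slot instead of a full reverse-diffusion chain, which is the source of the speedup the abstract describes.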
Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own
Reinforcement learning (RL) is a promising approach for solving robotic manipulation tasks. However, it is challenging to apply RL algorithms directly in the real world. For one thing, RL is data-intensive and typically requires millions of interactions with environments, which is impractical in real scenarios. For another, designing reward functions manually takes heavy engineering effort. To address these issues, we leverage foundation models in this paper. We propose Reinforcement Learning with Foundation Priors (RLFP) to utilize guidance and feedback from policy, value, and success-reward foundation models. Within this framework, we introduce the Foundation-guided Actor-Critic (FAC) algorithm, which enables embodied agents to explore more efficiently with automatic reward functions. The benefits of our framework are threefold: (1) sample efficiency; (2) minimal and effective reward engineering; (3) agnosticism to the form of the foundation models and robustness to noisy priors. Our method achieves remarkable performance in various manipulation tasks on both real robots and in simulation. Across 5 dexterous tasks with real robots, FAC achieves an average success rate of 86% after one hour of real-time learning. Across 8 tasks in the simulated Meta-world, FAC achieves 100% success rates in 7/8 tasks in under 100k frames (about 1 hour of training), outperforming baseline methods with manually designed rewards in 1M frames. We believe the RLFP framework can enable future robots to explore and learn autonomously in the physical world for more tasks. Visualizations and code are available at https://yewr.github.io/rlfp.
comment: CoRL 2024 (Oral)
Zero-Shot Transfer of Neural ODEs
Autonomous systems often encounter environments and scenarios beyond the scope of their training data, which underscores a critical challenge: the need to generalize and adapt to unseen scenarios in real time. This challenge necessitates new mathematical and algorithmic tools that enable adaptation and zero-shot transfer. To this end, we leverage the theory of function encoders, which enables zero-shot transfer by combining the flexibility of neural networks with the mathematical principles of Hilbert spaces. Using this theory, we first present a method for learning a space of dynamics spanned by a set of neural ODE basis functions. After training, the proposed approach can rapidly identify dynamics in the learned space using an efficient inner product calculation. Critically, this calculation requires no gradient calculations or retraining during the online phase. This method enables zero-shot transfer for autonomous systems at runtime and opens the door for a new class of adaptable control algorithms. We demonstrate state-of-the-art system modeling accuracy for two MuJoCo robot environments and show that the learned models can be used for more efficient MPC control of a quadrotor.
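The gradient-free identification step can be illustrated with a small sketch. It assumes closed-form stand-ins for the learned neural ODE basis functions and a Monte Carlo approximation of the inner product. The Gram-matrix solve is an addition made here so the example does not rely on the basis being orthonormal, and all names are hypothetical.

```python
import numpy as np

# Hypothetical stand-ins for learned basis functions g_1, g_2 (1-D dynamics).
basis = [lambda x: -x, lambda x: np.sin(x)]

def identify_coefficients(xs, fxs, basis):
    """Estimate c such that f(x) ~ sum_i c_i g_i(x), using only inner products."""
    g_vals = np.stack([g(xs) for g in basis])   # shape (k, n)
    gram = g_vals @ g_vals.T / len(xs)          # Monte Carlo estimate of <g_i, g_j>
    b = g_vals @ fxs / len(xs)                  # Monte Carlo estimate of <f, g_i>
    return np.linalg.solve(gram, b)             # no gradient steps, no retraining

# Online phase: a few samples of an unseen system that lies in the span of the basis.
rng = np.random.default_rng(0)
xs = rng.uniform(-2.0, 2.0, size=200)
fxs = 2.0 * basis[0](xs) + 0.5 * basis[1](xs)
coeffs = identify_coefficients(xs, fxs, basis)
```

Here `coeffs` recovers the mixing weights of the unseen dynamics from data alone, which is the sense in which the online phase needs no gradient computation.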
Social Zone as a Barrier Function for Socially-Compliant Robot Navigation
This study addresses the challenge of integrating social norms into robot navigation, which is essential for ensuring that robots operate safely and efficiently in human-centric environments. Social norms, often unspoken and implicitly understood among people, are difficult to explicitly define and implement in robotic systems. To overcome this, we derive these norms from real human trajectory data, utilizing the comprehensive ATC dataset to identify the minimum social zones humans and robots must respect. These zones are integrated into the robot's navigation system by applying barrier functions, ensuring the robot consistently remains within the designated safety set. Simulation results demonstrate that our system effectively mimics human-like navigation strategies, such as passing on the right side and adjusting speed or pausing in constrained spaces. The proposed framework is versatile, easily comprehensible, and tunable, demonstrating the potential to advance the development of robots designed to navigate effectively in human-centric environments.
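For orientation, a standard control barrier function condition is written out below. It assumes a circular social zone of radius $d_s$ around a human at position $p_h$ moving with velocity $v_h$, and a robot with single-integrator dynamics $\dot p_r = u$. The zones identified from the ATC data need not be circular, so this is an illustrative form rather than the paper's exact formulation.

```latex
\begin{aligned}
h(x) &= \lVert p_r - p_h \rVert^2 - d_s^2, \qquad \mathcal{C} = \{\, x : h(x) \ge 0 \,\}, \\
\dot h(x, u) &= 2\,(p_r - p_h)^\top (u - v_h), \\
&\dot h(x, u) + \alpha\, h(x) \ge 0, \qquad \alpha > 0 .
\end{aligned}
```

Any control input $u$ satisfying the last inequality keeps the state in the safe set $\mathcal{C}$, i.e. outside the social zone.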
A Unification Between Deep-Learning Vision, Compartmental Dynamical Thermodynamics, and Robotic Manipulation for a Circular Economy
The shift from a linear to a circular economy has the potential to simultaneously reduce uncertainties of material supplies and waste generation. However, to date, the development of robotic and, more generally, autonomous systems has rarely been integrated into circular economy implementation strategies despite their potential to reduce the operational costs and the contamination risks from handling waste. In addition, the science of circularity still lacks the physical foundations needed to improve the accuracy and the repeatability of the models. Hence, in this paper, we merge deep-learning vision, compartmental dynamical thermodynamics, and robotic manipulation into a theoretically coherent physics-based research framework to lay the foundations of circular flow designs of materials. The proposed framework tackles circularity by generalizing the design approach of the Rankine cycle enhanced with dynamical systems theory. This differs from state-of-the-art approaches to circular economy, which are mainly based on data analysis, e.g., material flow analysis (MFA). We begin by reviewing the literature of the three abovementioned research areas, then we introduce the proposed unified framework and report its initial application to plastics systems along with initial simulation results of reinforcement-learning control of robotic waste sorting. This shows the framework's applicability, generality, and scalability, as well as the similarities and differences between the optimization of artificial neural systems and the proposed compartmental networks. Finally, we discuss the opportunities for robotics in circular economy that are not yet fully exploited and the future challenges in the theory and practice of the proposed circularity framework.
comment: To be submitted
The 1st InterAI Workshop: Interactive AI for Human-centered Robotics
The workshop is affiliated with the 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN 2024), August 26-30, 2024, Pasadena, CA, USA. It is designed as a half-day event running from 9:00 to 12:30 PST and accommodates both in-person and virtual attendees (via Zoom), ensuring a flexible participation mode. The agenda includes two keynote speeches, two dedicated paper presentation sessions, an interactive panel discussion that fosters dialogue among experts and allows deeper dives into specific topics, and a 15-minute coffee break. The workshop website: https://sites.google.com/view/interaiworkshops/home.
RiskMap: A Unified Driving Context Representation for Autonomous Motion Planning in Urban Driving Environment
Motion planning is a complicated task that requires combining perception, map information integration, and prediction, particularly when driving in heavy traffic. An extensible and efficient representation that visualizes sensor noise and provides a basis for real-time planning tasks is therefore desirable. We aim to develop an interpretable map representation that offers a prior on driving cost for planning tasks. In this way, we can simplify the planning process for complex driving scenarios and visualize sensor noise. Specifically, we propose a unified context representation empowered by deep neural networks. The unified representation is a differentiable risk field, an analytical representation of statistical cognition regarding traffic participants for downstream planning tasks. We name this representation RiskMap. A sampling-based planner is adopted to train and compare RiskMap generation methods. In this paper, we explore RiskMap generation tools and model structures. The results illustrate that our method can improve driving safety and smoothness. We also discuss the limitations of our method.
comment: Accepted on 8th Oct 2024
A Bayesian Framework for Active Tactile Object Recognition, Pose Estimation and Shape Transfer Learning
As humans can explore and understand the world through active touch, a similar capability is desired for robots. In this paper, we address the problem of active tactile object recognition, pose estimation and shape transfer learning, where a customized particle filter (PF) and a Gaussian process implicit surface (GPIS) are combined in a unified Bayesian framework. Upon new tactile input, the customized PF updates the joint distribution of the object class and object pose while tracking the novelty of the object. Once a novel object is identified, its shape is reconstructed using GPIS. By grounding the prior of the GPIS with the maximum-a-posteriori (MAP) estimation from the PF, the knowledge about known shapes can be transferred to learn novel shapes. An exploration procedure based on global shape estimation is proposed to guide active data acquisition and terminate the exploration once sufficient information has been gathered. Through experiments in simulation, the proposed framework demonstrated its effectiveness and efficiency in estimating object class and pose for known objects and learning novel shapes. Furthermore, it can recognize previously learned shapes reliably.
CAnDOIT: Causal Discovery with Observational and Interventional Data from Time-Series
The study of cause-and-effect is of the utmost importance in many branches of science, but also for many practical applications of intelligent systems. In particular, identifying causal relationships in situations that include hidden factors is a major challenge for methods that rely solely on observational data for building causal models. This paper proposes CAnDOIT, a causal discovery method to reconstruct causal models using both observational and interventional time-series data. The use of interventional data in the causal analysis is crucial for real-world applications, such as robotics, where the scenario is highly complex and observational data alone are often insufficient to uncover the correct causal structure. Validation of the method is performed initially on randomly generated synthetic models and subsequently on a well-known benchmark for causal structure learning in a robotic manipulation environment. The experiments demonstrate that the approach can effectively handle data from interventions and exploit them to enhance the accuracy of the causal analysis. A Python implementation of CAnDOIT has also been developed and is publicly available on GitHub: https://github.com/lcastri/causalflow.
comment: Published in Advanced Intelligent Systems
MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility
Public urban spaces like streetscapes and plazas serve residents and accommodate social life in all its vibrant variations. Recent advances in Robotics and Embodied AI make public urban spaces no longer exclusive to humans. Food delivery bots and electric wheelchairs have started sharing sidewalks with pedestrians, while robot dogs and humanoids have recently emerged in the street. Micromobility enabled by AI for short-distance travel in public urban spaces is a crucial component of the future transportation system. Ensuring the generalizability and safety of AI models maneuvering mobile machines is essential. In this work, we present MetaUrban, a compositional simulation platform for AI-driven urban micromobility research. MetaUrban can construct an infinite number of interactive urban scenes from compositional elements, covering a vast array of ground plans, object placements, pedestrians, vulnerable road users, and other mobile agents' appearances and dynamics. We design point navigation and social navigation tasks as the pilot study using MetaUrban for urban micromobility research and establish various baselines of Reinforcement Learning and Imitation Learning. We conduct extensive evaluation across mobile machines, demonstrating that heterogeneous mechanical structures significantly influence the learning and execution of AI policies. We perform a thorough ablation study, showing that the compositional nature of the simulated environments can substantially improve the generalizability and safety of the trained mobile agents. MetaUrban, including its code and dataset, will be made publicly available to provide research opportunities and foster safe and trustworthy embodied AI and micromobility in cities.
comment: Technical report. Project page: https://metadriverse.github.io/metaurban/
Compact Multi-Object Placement Using Adjacency-Aware Reinforcement Learning
Close and precise placement of irregularly shaped objects requires a skilled robotic system. The manipulation of objects that have sensitive top surfaces and a fixed set of neighbors is particularly challenging. To avoid damaging the surface, the robot has to grasp them from the side, and during placement, it has to maintain the spatial relations with adjacent objects, while considering the physical gripper extent. In this work, we propose a framework to learn an agent based on reinforcement learning that generates end-effector motions for placing objects as closely as possible to one another. During the placement, our agent considers the spatial constraints with neighbors defined in a given layout of the objects while avoiding collisions. Our approach learns to place compact object assemblies without the need for predefined spacing between objects, as required by traditional methods. We thoroughly evaluated our approach using a two-finger gripper mounted on a robotic arm with six degrees of freedom. The results demonstrate that our agent significantly outperforms two baseline approaches in object assembly compactness, thereby reducing the space required to position the objects while adhering to specified spatial constraints.
comment: Accepted to IEEE-RAS International Conference on Humanoid Robots (Humanoids) 2024
Robots Can Multitask Too: Integrating a Memory Architecture and LLMs for Enhanced Cross-Task Robot Action Generation
Large Language Models (LLMs) have been recently used in robot applications for grounding LLM common-sense reasoning with the robot's perception and physical abilities. In humanoid robots, memory also plays a critical role in fostering real-world embodiment and facilitating long-term interactive capabilities, especially in multi-task setups where the robot must remember previous task states, environment states, and executed actions. In this paper, we address incorporating memory processes with LLMs for generating cross-task robot actions, while the robot effectively switches between tasks. Our proposed dual-layered architecture features two LLMs, utilizing their complementary skills of reasoning and following instructions, combined with a memory model inspired by human cognition. Our results show a significant improvement in performance over a baseline of five robotic tasks, demonstrating the potential of integrating memory with LLMs for combining the robot's action and perception for adaptive task execution.
Guided Decoding for Robot On-line Motion Generation and Adaption
We present a novel motion generation approach for robot arms with high degrees of freedom in complex settings that can adapt online to obstacles or new via-points. Learning from Demonstration facilitates rapid adaptation to new tasks and optimizes the utilization of accumulated expertise by allowing robots to learn and generalize from demonstrated trajectories. We train a transformer architecture, based on a conditional variational autoencoder, on a large dataset of simulated trajectories used as demonstrations. Our architecture learns essential motion generation skills from these demonstrations and is able to adapt them to meet auxiliary tasks. Additionally, our approach implements auto-regressive motion generation to enable real-time adaptations, such as introducing or changing via-points and velocity and acceleration constraints. Using beam search, we present a method for further adaptation of our motion generator to avoid obstacles. We show that our model successfully generates motion from different initial and target points and that it is capable of generating trajectories that navigate complex tasks across different robotic platforms.
comment: IEEE-RAS International Conference on Humanoid Robots, 2024
FlowRetrieval: Flow-Guided Data Retrieval for Few-Shot Imitation Learning
Few-shot imitation learning relies on only a small amount of task-specific demonstrations to efficiently adapt a policy for a given downstream task. Retrieval-based methods promise to retrieve relevant past experiences to augment this target data when learning policies. However, existing data retrieval methods fall under two extremes: they either rely on the existence of exact behaviors with visually similar scenes in the prior data, which is impractical to assume; or they retrieve based on semantic similarity of high-level language descriptions of the task, which might not be that informative about the shared low-level behaviors or motions across tasks, often a more important factor for retrieving relevant data for policy learning. In this work, we investigate how we can leverage motion similarity in the vast amount of cross-task data to improve few-shot imitation learning of the target task. Our key insight is that motion-similar data carries rich information about the effects of actions and object interactions that can be leveraged during few-shot adaptation. We propose FlowRetrieval, an approach that leverages optical flow representations both for extracting motions similar to the target tasks from prior data and for guiding learning of a policy that can maximally benefit from such data. Our results show FlowRetrieval significantly outperforms prior methods across simulated and real-world domains, achieving on average 27% higher success rate than the best retrieval-based prior method. In the Pen-in-Cup task with a real Franka Emika robot, FlowRetrieval achieves 3.7x the performance of the baseline imitation learning technique that learns from all prior and target data. Website: https://flow-retrieval.github.io
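As a rough illustration of similarity-based retrieval, the sketch below ranks prior samples by how closely their motion embeddings match any target-task embedding. The use of cosine similarity, the max-over-targets score, and the function name are assumptions for this example. The abstract does not specify the distance measure, and in the actual method the embeddings would come from an optical-flow encoder.

```python
import numpy as np

def retrieve_similar(prior_emb, target_emb, k):
    """Return indices of the k prior samples most similar to any target sample."""
    # Normalize rows so the dot product equals cosine similarity.
    p = prior_emb / np.linalg.norm(prior_emb, axis=1, keepdims=True)
    t = target_emb / np.linalg.norm(target_emb, axis=1, keepdims=True)
    sim = p @ t.T                # shape (n_prior, n_target)
    score = sim.max(axis=1)      # best match of each prior sample to the target set
    return np.argsort(-score)[:k]

# Toy 2-D "motion embeddings" (hypothetical values).
prior = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.9, 0.1]])
target = np.array([[1.0, 0.0]])
selected = retrieve_similar(prior, target, k=2)
```

The retrieved subset would then be mixed with the few target demonstrations when training the policy.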
FREA: Feasibility-Guided Generation of Safety-Critical Scenarios with Reasonable Adversariality
Generating safety-critical scenarios, which are essential yet difficult to collect at scale, offers an effective method to evaluate the robustness of autonomous vehicles (AVs). Existing methods focus on optimizing adversariality while preserving the naturalness of scenarios, aiming to achieve a balance through data-driven approaches. However, without an appropriate upper bound for adversariality, the scenarios might exhibit excessive adversariality, potentially leading to unavoidable collisions. In this paper, we introduce FREA, a novel safety-critical scenario generation method that incorporates the Largest Feasible Region (LFR) of the AV as guidance to ensure the reasonableness of the adversarial scenarios. Concretely, FREA first pre-calculates the LFR of the AV from offline datasets. Subsequently, it learns a reasonable adversarial policy that controls the scene's critical background vehicles (CBVs) to generate adversarial yet AV-feasible scenarios by maximizing a novel feasibility-dependent adversarial objective function. Extensive experiments illustrate that FREA can effectively generate safety-critical scenarios, yielding considerable near-miss events while ensuring the AV's feasibility. Generalization analysis also confirms the robustness of FREA in AV testing across various surrogate AV methods and traffic environments.
comment: Accepted by CoRL 2024
Solving Robotics Problems in Zero-Shot with Vision-Language Models
We introduce Wonderful Team, a multi-agent Vision Large Language Model (VLLM) framework designed to solve robotics problems in a zero-shot regime. In our context, zero-shot means that for a novel environment, we provide a VLLM with an image of the robot's surroundings and a task description, and the VLLM outputs the sequence of actions necessary for the robot to complete the task. Unlike prior work that requires fine-tuning parts of the pipeline -- such as adjusting an LLM on robot-specific data or training separate vision encoders -- our approach demonstrates that with careful engineering, a single off-the-shelf VLLM can autonomously handle all aspects of a robotics task, from high-level planning to low-level location extraction and action execution. Crucially, compared to using GPT-4o alone, Wonderful Team is self-corrective and capable of iteratively fixing its own mistakes, enabling it to solve challenging long-horizon tasks. We validate our framework through extensive experiments, both in simulated environments using VIMABench and in real-world settings. Our system showcases the ability to handle diverse tasks such as manipulation, goal-reaching, and visual reasoning -- all in a zero-shot manner. These results underscore a key point: vision-language models have progressed rapidly in the past year and should be strongly considered as a backbone for many robotics problems moving forward.
comment: aka Wonderful Team
On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability
Recent advancements in Large Language Models (LLMs) have showcased their ability to perform complex reasoning tasks, but their effectiveness in planning remains underexplored. In this study, we evaluate the planning capabilities of OpenAI's o1 models across a variety of benchmark tasks, focusing on three key aspects: feasibility, optimality, and generalizability. Through empirical evaluations on constraint-heavy tasks (e.g., Barman, Tyreworld) and spatially complex environments (e.g., Termes, Floortile), we highlight o1-preview's strengths in self-evaluation and constraint-following, while also identifying bottlenecks in decision-making and memory management, particularly in tasks requiring robust spatial reasoning. Our results reveal that o1-preview outperforms GPT-4 in adhering to task constraints and managing state transitions in structured environments. However, the model often generates suboptimal solutions with redundant actions and struggles to generalize effectively in spatially complex tasks. This pilot study provides foundational insights into the planning limitations of LLMs, offering key directions for future research on improving memory management, decision-making, and generalization in LLM-based planning. Code available at: https://github.com/VITA-Group/o1-planning.
comment: Updated link to code repository
E2H: A Two-Stage Non-Invasive Neural Signal Driven Humanoid Robotic Whole-Body Control Framework
Recent advancements in humanoid robotics, including the integration of hierarchical reinforcement learning-based control and the utilization of LLM planning, have significantly enhanced the ability of robots to perform complex tasks. In contrast to the highly developed humanoid robots, the human factors involved remain relatively unexplored. Directly controlling humanoid robots with the brain has already appeared in many works of science fiction, such as Pacific Rim and Gundam. In this work, we present E2H (EEG-to-Humanoid), an innovative framework that pioneers the control of humanoid robots using high-frequency non-invasive neural signals. As the quality of non-invasive signals remains too low to decode precise spatial trajectories, we decompose the E2H framework into two stages: 1) decoding neural signals (EEG) into semantic motion keywords, and 2) utilizing LLM-facilitated motion generation with a precise motion imitation control policy to realize humanoid robot control. Directly driving robots with brainwave commands offers a novel approach to human-machine collaboration, especially in situations where verbal commands are impractical, such as in cases of speech impairments, space exploration, or underwater exploration. E2H offers an exciting glimpse into the future, holding immense potential for human-computer interaction.
Coverage Path Planning For Minimizing Expected Time to Search For an Object With Continuous Sensing
In this paper, we present several results of both theoretical and practical interest. First, we propose the quota lawn mowing problem, an extension of the classic lawn mowing problem in computational geometry, as follows: given a quota of coverage, compute the shortest lawn mowing route to achieve said quota. We give constant-factor approximations for the quota lawn mowing problem. Second, we investigate the expected detection time minimization problem in geometric coverage path planning with local, continuous sensory information. We provide the first approximation algorithm with provable error bounds and pseudopolynomial running time. Our ideas also extend to another search mechanism, namely visibility-based search, which is related to the watchman route problem. We complement our theoretical analysis with some simple but effective heuristics for finding an object in minimum expected time, on which we provide simulation results.
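The expected-detection-time objective can be made concrete with a discretized toy version. The sketch assumes the region is split into cells, the object sits in one cell according to a known prior, and the searcher detects it on first entering that cell. The cell model, the names, and the uniform step time are simplifications introduced here, not the paper's continuous formulation.

```python
def expected_detection_time(route, step_time, prior):
    """Expected time until the object is found when following `route`.

    route:     sequence of cell ids in visiting order
    step_time: time needed to reach and sense each successive cell
    prior:     dict mapping cell id -> probability the object is there
    """
    first_seen = {}
    for k, cell in enumerate(route):
        # Record only the first time each cell is covered.
        first_seen.setdefault(cell, (k + 1) * step_time)
    missing = set(prior) - set(first_seen)
    if missing:
        raise ValueError(f"route never covers cells: {sorted(missing)}")
    return sum(p * first_seen[cell] for cell, p in prior.items())

prior = {"A": 0.5, "B": 0.3, "C": 0.2}
likely_first = expected_detection_time(["A", "B", "C"], 1.0, prior)
likely_last = expected_detection_time(["C", "B", "A"], 1.0, prior)
```

Both routes have the same length, yet visiting high-probability cells first gives a lower expected detection time, which is why this objective differs from plain shortest-route coverage.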
DiMSam: Diffusion Models as Samplers for Task and Motion Planning under Partial Observability
Generative models, such as diffusion models, excel at capturing high-dimensional distributions with diverse input modalities, e.g. robot trajectories, but are less effective at multi-step constraint reasoning. Task and Motion Planning (TAMP) approaches are suited for planning multi-step autonomous robot manipulation. However, it can be difficult to apply them to domains where the environment and its dynamics are not fully known. We propose to overcome these limitations by composing diffusion models using a TAMP system. We use the learned components for constraints and samplers that are difficult to engineer in the planning model, and use a TAMP solver to search for the task plan with constraint-satisfying action parameter values. To tractably make predictions for unseen objects in the environment, we define the learned samplers and TAMP operators on a learned latent embedding of changing object states. We evaluate our approach in a simulated articulated object manipulation domain and show how the combination of classical TAMP, generative modeling, and latent embedding enables multi-step constraint-based reasoning. We also apply the learned sampler in the real world. Website: https://sites.google.com/view/dimsam-tamp
Learning Shared RGB-D Fields: Unified Self-supervised Pre-training for Label-efficient LiDAR-Camera 3D Perception
Constructing large-scale labeled datasets for multi-modal perception model training in autonomous driving presents significant challenges. This has motivated the development of self-supervised pretraining strategies. However, existing pretraining methods mainly employ distinct approaches for each modality. In contrast, we focus on LiDAR-Camera 3D perception models and introduce a unified pretraining strategy, NeRF-Supervised Masked Auto Encoder (NS-MAE), which optimizes all modalities through a shared formulation. NS-MAE leverages NeRF's ability to encode both appearance and geometry, enabling efficient masked reconstruction of multi-modal data. Specifically, embeddings are extracted from corrupted LiDAR point clouds and images, conditioned on view directions and locations. Then, these embeddings are rendered into multi-modal feature maps from two crucial viewpoints for 3D driving perception: perspective and bird's-eye views. The original uncorrupted data serve as reconstruction targets for self-supervised learning. Extensive experiments demonstrate the superior transferability of NS-MAE across various 3D perception tasks under different fine-tuning settings. Notably, NS-MAE outperforms prior SOTA pre-training methods that employ separate strategies for each modality in BEV map segmentation under the label-efficient fine-tuning setting. Our code is publicly available at https://github.com/Xiaohao-Xu/Unified-Pretrain-AD/ .
comment: 8 pages
Equivariant Diffusion Policy
Recent work has shown diffusion models are an effective approach to learning the multimodal distributions arising from demonstration data in behavior cloning. However, a drawback of this approach is the need to learn a denoising function, which is significantly more complex than learning an explicit policy. In this work, we propose Equivariant Diffusion Policy, a novel diffusion policy learning method that leverages domain symmetries to obtain better sample efficiency and generalization in the denoising function. We theoretically analyze the $\mathrm{SO}(2)$ symmetry of full 6-DoF control and characterize when a diffusion model is $\mathrm{SO}(2)$-equivariant. We furthermore evaluate the method empirically on a set of 12 simulation tasks in MimicGen, and show that it obtains a success rate that is, on average, 21.9% higher than the baseline Diffusion Policy. We also evaluate the method on a real-world system to show that effective policies can be learned with relatively few training samples, whereas the baseline Diffusion Policy cannot.
comment: Conference on Robot Learning 2024, Oral
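To make the equivariance property concrete, the sketch below numerically checks whether a planar map f satisfies f(Rx) = R f(x) for rotations R in SO(2). It is a 2-D toy check under assumed example functions (`radial_policy`, `biased_policy`), not the paper's analysis of full 6-DoF control.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def is_so2_equivariant(f, points, angles, tol=1e-9):
    """Check f(R x) == R f(x) for every sampled rotation and point."""
    for theta in angles:
        R = rotation(theta)
        for x in points:
            if not np.allclose(f(R @ x), R @ f(x), atol=tol):
                return False
    return True

def radial_policy(x):
    # Moves toward the origin with speed depending only on |x|: equivariant.
    return -x * np.linalg.norm(x)

def biased_policy(x):
    # Adds a fixed world-frame offset, which breaks rotational symmetry.
    return -x + np.array([1.0, 0.0])

points = [np.array([1.0, 0.0]), np.array([0.3, -2.0])]
angles = [0.5, 1.7, 3.0]
radial_ok = is_so2_equivariant(radial_policy, points, angles)
biased_ok = is_so2_equivariant(biased_policy, points, angles)
```

A model built to satisfy this constraint by construction does not need to relearn the same behavior for every rotated copy of a scene, which is the source of the sample-efficiency gain the abstract describes.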
Fast Decentralized State Estimation for Legged Robot Locomotion via EKF and MHE
In this paper, we present a fast and decentralized state estimation framework for the control of legged locomotion. The nonlinear estimation of the floating base states is decentralized into an orientation estimation via an Extended Kalman Filter (EKF) and a linear velocity estimation via Moving Horizon Estimation (MHE). The EKF fuses the inertial sensor with vision to estimate the floating base orientation. The MHE uses the estimated orientation together with all sensor measurements within a past time window to estimate the linear velocities, based on a time-varying linear dynamics formulation of the states of interest with state constraints. More importantly, a marginalization method based on the optimization structure of the full information filter (FIF) is proposed to convert the equality-constrained FIF to an equivalent MHE. This decoupling of state estimation promotes the desired balance of computational efficiency, estimation accuracy, and the inclusion of state constraints. The proposed method is shown to be capable of providing accurate state estimation for several legged robots, including the highly dynamic hopping robot PogoX, the bipedal robot Cassie, and the quadrupedal robot Unitree Go1, at a frequency of 200 Hz with a window interval of 0.1 s.
comment: 8 pages, accepted by RAL 2024
Multiagent Systems
PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents
Ptychography is an advanced computational imaging technique in X-ray and electron microscopy. It has been widely adopted across scientific research fields, including physics, chemistry, biology, and materials science, as well as in industrial applications such as semiconductor characterization. In practice, obtaining high-quality ptychographic images requires simultaneous optimization of numerous experimental and algorithmic parameters. Traditionally, parameter selection often relies on trial and error, leading to low-throughput workflows and potential human bias. In this work, we develop the "Ptychographic Experiment and Analysis Robot" (PEAR), a framework that leverages large language models (LLMs) to automate data analysis in ptychography. To ensure high robustness and accuracy, PEAR employs multiple LLM agents for tasks including knowledge retrieval, code generation, parameter recommendation, and image reasoning. Our study demonstrates that PEAR's multi-agent design significantly improves the workflow success rate, even with smaller open-weight models such as LLaMA 3.1 8B. PEAR also supports various automation levels and is designed to work with customized local knowledge bases, ensuring flexibility and adaptability across different research environments.
comment: 18 pages, 5 figures, technical preview report
The Dynamics of Social Conventions in LLM populations: Spontaneous Emergence, Collective Biases and Tipping Points
Social conventions are the foundation for social and economic life. As legions of AI agents increasingly interact with each other and with humans, their ability to form shared conventions will determine how effectively they will coordinate behaviors, integrate into society and influence it. Here, we investigate the dynamics of conventions within populations of Large Language Model (LLM) agents using simulated interactions. First, we show that globally accepted social conventions can spontaneously arise from local interactions between communicating LLMs. Second, we demonstrate how strong collective biases can emerge during this process, even when individual agents appear to be unbiased. Third, we examine how minority groups of committed LLMs can drive social change by establishing new social conventions. We show that once these minority groups reach a critical size, they can consistently overturn established behaviors. In all cases, contrasting the experimental results with predictions from a minimal multi-agent model allows us to isolate the specific role of LLM agents. Our results clarify how AI systems can autonomously develop norms without explicit programming and have implications for designing AI systems that align with human values and societal goals.
PILLAR: an AI-Powered Privacy Threat Modeling Tool
The rapid evolution of Large Language Models (LLMs) has unlocked new possibilities for applying artificial intelligence across a wide range of fields, including privacy engineering. As modern applications increasingly handle sensitive user data, safeguarding privacy has become more critical than ever. To protect privacy effectively, potential threats need to be identified and addressed early in the system development process. Frameworks like LINDDUN offer structured approaches for uncovering these risks, but despite their value, they often demand substantial manual effort, expert input, and detailed system knowledge. This makes the process time-consuming and prone to errors. Current privacy threat modeling methods, such as LINDDUN, typically rely on creating and analyzing complex data flow diagrams (DFDs) and system descriptions to pinpoint potential privacy issues. While these approaches are thorough, they can be cumbersome, relying heavily on the precision of the data provided by users. Moreover, they often generate a long list of threats without clear guidance on how to prioritize them, leaving developers unsure of where to focus their efforts. In response to these challenges, we introduce PILLAR (Privacy risk Identification with LINDDUN and LLM Analysis Report), a new tool that integrates LLMs with the LINDDUN framework to streamline and enhance privacy threat modeling. PILLAR automates key parts of the LINDDUN process, such as generating DFDs, classifying threats, and prioritizing risks. By leveraging the capabilities of LLMs, PILLAR can take natural language descriptions of systems and transform them into comprehensive threat models with minimal input from users, reducing the workload on developers and privacy experts while improving the efficiency and accuracy of the process.
Edge AI Collaborative Learning: Bayesian Approaches to Uncertainty Estimation
Recent advancements in edge computing have significantly enhanced the AI capabilities of Internet of Things (IoT) devices. However, these advancements introduce new challenges in knowledge exchange and resource management, particularly addressing the spatiotemporal data locality in edge computing environments. This study examines algorithms and methods for deploying distributed machine learning within autonomous, network-capable, AI-enabled edge devices. We focus on determining confidence levels in learning outcomes considering the spatial variability of data encountered by independent agents. Using collaborative mapping as a case study, we explore the application of the Distributed Neural Network Optimization (DiNNO) algorithm extended with Bayesian neural networks (BNNs) for uncertainty estimation. We implement a 3D environment simulation using the Webots platform to simulate collaborative mapping tasks, decouple the DiNNO algorithm into independent processes for asynchronous network communication in distributed learning, and integrate distributed uncertainty estimation using BNNs. Our experiments demonstrate that BNNs can effectively support uncertainty estimation in a distributed learning context, with precise tuning of learning hyperparameters crucial for effective uncertainty assessment. Notably, applying Kullback-Leibler divergence for parameter regularization resulted in a 12-30% reduction in validation loss during distributed BNN training compared to other regularization strategies.
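The Kullback-Leibler regularization mentioned in the abstract above can be illustrated with the closed-form KL divergence between two diagonal Gaussians, a term commonly used to regularize Bayesian neural network weights. This is a minimal sketch, not the paper's implementation: the function name `kl_diag_gaussians` and the choice of diagonal Gaussian posteriors and priors are assumptions made for illustration.

```python
import math

def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) for diagonal Gaussians q = N(mu_q, sigma_q^2), p = N(mu_p, sigma_p^2).

    Hypothetical helper: each argument is a sequence of per-weight values.
    """
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        # Closed-form KL divergence for one pair of univariate Gaussians.
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return total

# Example: one weight whose posterior mean is shifted by 1 from the prior mean.
penalty = kl_diag_gaussians([0.0], [1.0], [1.0], [1.0])
```

In a distributed setting, a penalty of this form could be added to each agent's data loss, with the prior taken from a neighbor's or a consensus posterior; that pairing is an assumption here, not a detail stated in the abstract.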
Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning NeurIPS 2024
In multi-agent reinforcement learning (MARL), parameter sharing is commonly employed to enhance sample efficiency. However, the popular approach of full parameter sharing often leads to homogeneous policies among agents, potentially limiting the performance benefits that could be derived from policy diversity. To address this critical limitation, we introduce Kaleidoscope, a novel adaptive partial parameter sharing scheme that fosters policy heterogeneity while still maintaining high sample efficiency. Specifically, Kaleidoscope maintains one set of common parameters alongside multiple sets of distinct, learnable masks for different agents, dictating the sharing of parameters. It promotes diversity among policy networks by encouraging discrepancy among these masks, without sacrificing the efficiencies of parameter sharing. This design allows Kaleidoscope to dynamically balance high sample efficiency with a broad policy representational capacity, effectively bridging the gap between full parameter sharing and non-parameter sharing across various environments. We further extend Kaleidoscope to critic ensembles in the context of actor-critic algorithms, which could help improve value estimations. Our empirical evaluations across extensive environments, including multi-agent particle environment, multi-agent MuJoCo and StarCraft multi-agent challenge v2, demonstrate the superior performance of Kaleidoscope compared with existing parameter sharing approaches, showcasing its potential for performance enhancement in MARL. The code is publicly available at https://github.com/LXXXXR/Kaleidoscope.
comment: Accepted by the Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
Two-person positive shortest path games have Nash equilibria in pure stationary strategies
We prove that every finite two-person positive shortest path game has a Nash equilibrium (NE) in pure stationary strategies, which can be computed in polynomial time. The existence result holds also for graphs with finite out-degrees. Moreover, we prove that a terminal NE exists provided at least one of two players can guarantee reaching a terminal. If no one can do it, in other words, if each of two players can cut all terminals from the initial position $s$, then, obviously, a cyclic NE exists, although its cost is infinite for both players, since we restrict ourselves to positive games. We conjecture that a terminal NE exists too, provided there exists a directed path from $s$ to a terminal. However, this is open.
The Condorcet Dimension of Metric Spaces
A Condorcet winning set is a set of candidates such that no other candidate is preferred by at least half the voters over all members of the set. The Condorcet dimension, which is the minimum cardinality of a Condorcet winning set, is known to be at most logarithmic in the number of candidates. We study the case of elections where voters and candidates are located in a $2$-dimensional space with preferences based upon proximity voting. Our main result is that the Condorcet dimension is at most $3$, under both the Manhattan norm and the infinity norm, natural measures in electoral systems. We also prove that any set of voter preferences can be embedded into a metric space of sufficiently high dimension for any $p$-norm, including the Manhattan and infinity norms.
comment: 9 pages
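The definition of a Condorcet winning set in the abstract above can be checked by brute force on small instances. The sketch below assumes voters and candidates are points in the plane, that preferences follow the Manhattan norm, and that a voter prefers a candidate only when it is strictly closer (the tie-breaking rule is an assumption; the abstract does not state one). The function names are hypothetical.

```python
from itertools import combinations

def manhattan(p, q):
    """Manhattan (L1) distance between two points in the plane."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def is_condorcet_winning_set(voters, candidates, chosen, dist=manhattan):
    """True if no outside candidate is preferred over all of `chosen` by at least half the voters."""
    chosen = set(chosen)
    for c in range(len(candidates)):
        if c in chosen:
            continue
        # Count voters who strictly prefer candidate c to every chosen candidate.
        prefer_c = sum(
            1 for v in voters
            if all(dist(v, candidates[c]) < dist(v, candidates[s]) for s in chosen)
        )
        if 2 * prefer_c >= len(voters):
            return False
    return True

def condorcet_dimension(voters, candidates, dist=manhattan):
    """Smallest size of a Condorcet winning set, found by exhaustive search."""
    for k in range(1, len(candidates) + 1):
        for subset in combinations(range(len(candidates)), k):
            if is_condorcet_winning_set(voters, candidates, subset, dist):
                return k, subset
    return len(candidates), tuple(range(len(candidates)))

voters = [(0, 0), (0, 0), (0, 0)]
candidates = [(0, 0), (5, 5)]
size, winners = condorcet_dimension(voters, candidates)
```

The exhaustive search is exponential in the number of candidates, so it is only suitable for exploring small examples of the bound stated in the abstract.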
Multi-Agent Actor-Critics in Autonomous Cyber Defense
The need for autonomous and adaptive defense mechanisms has become paramount in the rapidly evolving landscape of cyber threats. Multi-Agent Deep Reinforcement Learning (MADRL) presents a promising approach to enhancing the efficacy and resilience of autonomous cyber operations. This paper explores the application of Multi-Agent Actor-Critic algorithms, which provide a general form of multi-agent learning, to cyber defense, leveraging the collaborative interactions among multiple agents to detect, mitigate, and respond to cyber threats. We demonstrate that each agent is able to learn quickly and counteract threats autonomously using MADRL in simulated cyber-attack scenarios. The results indicate that MADRL can significantly enhance the capability of autonomous cyber defense systems, paving the way for more intelligent cybersecurity strategies. This study contributes to the growing body of knowledge on leveraging artificial intelligence for cybersecurity and sheds light on future research and development in autonomous cyber operations.
comment: 6 pages, 2 figures
Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy
Diplomacy is one of the most sophisticated activities in human society, involving complex interactions among multiple parties that require skills in social reasoning, negotiation, and long-term strategic planning. Previous AI agents have demonstrated their ability to handle multi-step games and large action spaces in multi-agent tasks. However, diplomacy involves a staggering magnitude of decision spaces, especially considering the negotiation stage required. While recent agents based on large language models (LLMs) have shown potential in various applications, they still struggle with extended planning periods in complex multi-agent settings. Leveraging recent technologies for LLM-based agents, we aim to explore AI's potential to create a human-like agent capable of executing comprehensive multi-agent missions by integrating three fundamental capabilities: 1) strategic planning with memory and reflection; 2) goal-oriented negotiation with social reasoning; and 3) augmenting memory through self-play games for self-evolution without human in the loop.
Concurrent-Learning Based Relative Localization in Shape Formation of Robot Swarms
In this paper, we address the shape formation problem for massive robot swarms in environments where external localization systems are unavailable. Achieving this task effectively with solely onboard measurements is still scarcely explored and faces some practical challenges. To solve this challenging problem, we propose the following novel results. Firstly, to estimate the relative positions among neighboring robots, a concurrent-learning based estimator is proposed. It relaxes the persistent excitation condition required by classical estimators such as the least-squares estimator. Secondly, we introduce a finite-time agreement protocol to determine the shape location. This is achieved by estimating the relative position between each robot and a randomly assigned seed robot. The initial position of the seed robot marks the shape location. Thirdly, based on the theoretical results of the relative localization, a novel behavior-based control strategy is devised. This strategy not only enables adaptive shape formation of large groups of robots but also enhances the observability of inter-robot relative localization. Numerical simulation results are provided to verify the performance of our proposed strategy compared to state-of-the-art ones. Additionally, outdoor experiments on real robots further demonstrate the practical effectiveness and robustness of our methods.
The Patterns of Life Human Mobility Simulation SP
We demonstrate the Patterns of Life Simulation to create realistic simulations of human mobility in a city. This simulation has recently been used to generate massive amounts of trajectory and check-in data. Our demonstration focuses on two ways of using the simulation: (1) using the graphical user interface (GUI), and (2) running the simulation headless by disabling the GUI for faster data generation. We further demonstrate how the Patterns of Life simulation can be used to simulate any region on Earth by using publicly available data from OpenStreetMap (OSM). Finally, we also demonstrate recent improvements to the scalability of the simulation that allow simulating up to 100,000 individual agents for years of simulation time. During our demonstration, as well as offline using our guides on GitHub, participants will learn: (1) the theories of human behavior driving the Patterns of Life simulation, (2) how to generate massive amounts of synthetic yet realistic trajectory data, (3) how to run the simulation for a region of interest chosen by participants using OSM data, (4) how the simulation scales and what properties the generated data have, and (5) how to manage thousands of parallel simulation instances running concurrently.
comment: Accepted paper to SIGSPATIAL 2024 main conference
No-Regret Learning for Stackelberg Equilibrium Computation in Newsvendor Pricing Games
We introduce the application of online learning in a Stackelberg game pertaining to a system with two learning agents in a dyadic exchange network, consisting of a supplier and retailer, specifically where the parameters of the demand function are unknown. In this game, the supplier is the first-moving leader, and must determine the optimal wholesale price of the product. Subsequently, the retailer, who is the follower, must determine both the optimal procurement amount and selling price of the product. In the perfect information setting, this is known as the classical price-setting Newsvendor problem, and we prove the existence of a unique Stackelberg equilibrium when extending this to a two-player pricing game. In the framework of online learning, the parameters of the reward function for both the follower and leader must be learned, under the assumption that the follower will best respond with optimism under uncertainty. A novel algorithm based on contextual linear bandits with a measurable uncertainty set is used to provide a confidence bound on the parameters of the stochastic demand. Consequently, optimal finite time regret bounds on the Stackelberg regret, along with convergence guarantees to an approximate Stackelberg equilibrium, are provided.
comment: Stackelberg Games, Online Learning, Dynamic Pricing
Systems and Control (CS)
Towards a Health-Based Power Grid Optimization in the Artificial Intelligence Era
The electric power sector is one of the largest contributors to greenhouse gas emissions in the world. In recent years, there has been an unprecedented increase in electricity demand driven by the so-called Artificial Intelligence (AI) revolution. Although AI has and will continue to have a transformative impact, its environmental and health impacts are often overlooked. The standard approach to power grid optimization aims to minimize CO$_2$ emissions. In this paper, we propose a new holistic paradigm. Our proposed optimization directly targets the minimization of adverse health outcomes under energy efficiency and emission constraints. We show the first example of an optimal fuel mix allocation problem aiming to minimize the average number of adverse health effects resulting from exposure to hazardous air pollutants with constraints on the average and marginal emissions. We argue that this new health-based power grid optimization is essential to promote truly sustainable technological advances that align both with global climate goals and public health priorities.
comment: 5 pages, 1 figure
Transformer Temperature Management and Voltage Control in Electric Distribution Systems with High Solar PV Penetration
The increasing penetration of photovoltaic (PV) systems in distribution grids can lead to overvoltage and transformer overloading issues. While voltage regulation has been extensively studied and some research has addressed transformer temperature control, there is limited work on simultaneously managing both challenges. This paper addresses this gap by proposing an optimization-based strategy that efficiently manages voltage regulation and transformer temperature while minimizing the curtailment of PV generation. In order to make this problem convex, a relaxation is applied to the transformer temperature dynamics constraint. We also provide analysis to determine under which conditions this relaxation remains tight. The proposed approach is validated through simulations, demonstrating its effectiveness in achieving the desired control objectives.
Hybrid LLM-DDQN based Joint Optimization of V2I Communication and Autonomous Driving
Large language models (LLMs) have received considerable interest recently due to their outstanding reasoning and comprehension capabilities. This work explores applying LLMs to vehicular networks, aiming to jointly optimize vehicle-to-infrastructure (V2I) communications and autonomous driving (AD) policies. We deploy LLMs for AD decision-making to maximize traffic flow and avoid collisions for road safety, and a double deep Q-learning algorithm (DDQN) is used for V2I optimization to maximize the received data rate and reduce frequent handovers. In particular, for LLM-enabled AD, we employ the Euclidean distance to identify previously explored AD experiences, and then LLMs can learn from past good and bad decisions for further improvement. Then, LLM-based AD decisions will become part of states in V2I problems, and DDQN will optimize the V2I decisions accordingly. After that, the AD and V2I decisions are iteratively optimized until convergence. Such an iterative optimization approach can better explore the interactions between LLMs and conventional reinforcement learning techniques, revealing the potential of using LLMs for network optimization and management. Finally, the simulations demonstrate that our proposed hybrid LLM-DDQN approach outperforms the conventional DDQN algorithm, showing faster convergence and higher average rewards.
comment: Submission for possible publication
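The experience-retrieval step described in the abstract above (using the Euclidean distance to identify previously explored driving experiences) can be sketched as a nearest-neighbor lookup. This is an illustrative sketch only: the memory layout, the field names `state`, `decision`, and `outcome`, and the function `nearest_experiences` are assumptions, not the paper's data structures.

```python
import math

def nearest_experiences(memory, query_state, k=2):
    """Return the k stored experiences whose state vectors are closest to query_state."""
    # math.dist computes the Euclidean distance between two equal-length points.
    ranked = sorted(memory, key=lambda exp: math.dist(exp["state"], query_state))
    return ranked[:k]

# Hypothetical memory of past driving decisions with their outcomes.
memory = [
    {"state": (0.0, 0.0), "decision": "keep_lane", "outcome": "good"},
    {"state": (10.0, 0.0), "decision": "accelerate", "outcome": "bad"},
    {"state": (1.0, 1.0), "decision": "change_left", "outcome": "good"},
]
similar = nearest_experiences(memory, (0.2, 0.1), k=2)
```

The retrieved good and bad examples could then be placed in the LLM prompt as context for the next decision; how the prompt is built is not specified in the abstract and is left out here.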
Robust Variable-Horizon MPC with Adaptive Terminal Constraints
This paper presents a novel robust variable-horizon model predictive control scheme designed to intercept a target moving along a known trajectory, in finite time. Linear discrete-time systems affected by bounded process disturbances are considered and a tube-based MPC approach is adopted. The main contribution is an adaptive mechanism for choosing the terminal constraint set sequence in the MPC optimization problem. This mechanism is designed to ensure recursive feasibility while promoting minimization of the final distance to the target. Finite-time convergence of the proposed control scheme is proven. In order to evaluate its effectiveness, the designed control law is tested through numerical simulations, including a case study involving orbital rendezvous of a satellite with a tumbling object. The results indicate a significant reduction in conservatism compared to existing state-of-the-art methods using a fixed terminal set sequence.
Privacy-Preserving Optimal State Estimation with Low Complexity via Cramér-Rao Lower Bound Approach
This paper addresses the optimal state estimation problem for dynamic systems while preserving private information against an adversary. To dominate the adversary's estimation accuracy about private information in the mean square error (MSE) sense, the Cramér-Rao lower bound (CRLB) is employed to evaluate privacy level. The problem is formulated as a constrained optimization, which minimizes the MSE of the state estimate with a constraint on privacy level, achieving a trade-off between privacy and utility. To solve the constrained optimization problem, an explicit expression for CRLB is first provided using the information inequality. To overcome the increasing sizes of the involved matrices over time, a low-complexity approach is then proposed to achieve online calculation for CRLB, significantly reducing computational complexity. Next, the optimization problem is relaxed to a semi-definite programming problem, and a relaxed solution is provided. Finally, a privacy-preserving state estimation algorithm with low complexity is developed and proved to achieve differential privacy. Two illustrative examples, including a practical case of building occupancy, demonstrate the effectiveness of the proposed algorithm.
MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation
Autonomous driving necessitates advanced object detection techniques that integrate information from multiple modalities to overcome the limitations associated with single-modal approaches. The challenges of aligning diverse data in early fusion and the complexities, along with overfitting issues introduced by deep fusion, underscore the efficacy of late fusion at the decision level. Late fusion ensures seamless integration without altering the original detector's network structure. This paper introduces a pioneering Multi-modal Multi-class Late Fusion method, designed for late fusion to enable multi-class detection. Fusion experiments conducted on the KITTI validation and official test datasets illustrate substantial performance improvements, presenting our model as a versatile solution for multi-modal object detection in autonomous driving. Moreover, our approach incorporates uncertainty analysis into the classification fusion process, rendering our model more transparent and trustworthy and providing more reliable insights into category predictions.
Data-driven Feedback Control of Lattice Structures with Localized Actuation and Sensing
Assembling lattices from discrete building blocks enables the composition of large, heterogeneous, and easily reconfigurable objects with desirable mass-to-stiffness ratios. This type of building system may also be referred to as a digital material, as it is constituted from discrete, error-correcting components. Researchers have demonstrated various active structures and even robotic systems that take advantage of the reconfigurable, mass-efficient properties of discrete lattice structures. However, the existing literature has predominantly used open-loop control strategies, limiting the performance of the presented systems. In this paper, we present a novel approach to feedback control of digital lattice structures, leveraging real-time measurements of the system dynamics. We introduce an actuated voxel which constitutes a novel means for actuation of lattice structures. Our control method is based on the Extended Dynamical Mode Decomposition algorithm in conjunction with the Linear Quadratic Regulator and the Koopman Model Predictive Control. The key advantage of our approach lies in its purely data-driven nature, without the need for any prior knowledge of a system's structure. We illustrate the developed method via real experiments with a custom-built flexible lattice beam, showing its ability to accomplish various tasks even with minimal sensing and actuation resources. In particular, we address two problems: stabilization together with disturbance attenuation, and reference tracking.
Achieving Multi-UAV Best Viewpoint Coordination in Obstructed Environments
Wildfire suppression is a complex task that poses high risks to humans. Using robotic teams for wildfire suppression enhances the safety and efficiency of detecting, monitoring, and extinguishing fires. We propose a control architecture based on task hierarchical control for the autonomous steering of a system of flying robots in wildfire suppression. We incorporate a novel line-of-sight obstacle avoidance method that calculates the best viewpoints and ensures an occlusion-free view for the suppression robot during the mission. Path integral control generates optimal trajectories towards the goals. We validate the approach in simulations using MATLAB and Unity, and conduct an ablation study that assesses its effectiveness by comparing it to scenarios where these key components are excluded. The results demonstrate significant performance improvements, with a 44.0% increase in effectiveness with the new line-of-sight obstacle avoidance task and up to a 39.6% improvement when using path integral control.
comment: 6 pages, 5 figures, submitted to joint ACC and L-CSS
Energy-Cautious Designation of Kinematic Parameters for a Sustainable Parallel-Serial Heavy-Duty Manipulator Driven by Electromechanical Linear Actuator
Electrification, a key strategy in combating climate change, is transforming industries, and off-highway machines (OHM) will be next to transition from combustion engines and hydraulic actuation to sustainable fully electrified machines. Electromechanical linear actuators (EMLAs) offer superior efficiency, safety, and reduced maintenance, and they unlock vast potential for high-performance autonomous operations. However, a key challenge lies in optimizing the kinematic parameters of OHMs' on-board manipulators for EMLA integration to exploit the full capabilities of actuation systems and maximize their performance. This work addresses this challenge by delving into the structural optimization of a prevalent closed kinematic chain configuration commonly employed in OHM manipulators. Our approach aims to retain the manipulator's existing capabilities while reducing its energy expenditure, paving the way for a greener future in industrial automation, one in which sustainable and high-performing robotized OHMs can evolve. The feasibility of our methodology is validated through simulation results obtained on a commercially available parallel-serial heavy-duty manipulator mounted on a battery electric vehicle. The results demonstrate the efficacy of our approach in modifying kinematic parameters to facilitate the replacement of conventional hydraulic actuators with EMLAs, all while minimizing the overall energy consumption of the system.
comment: This work is accepted for presentation at IEEE VTC 2024-Washington USA
A System of Bidirectional Power Routing Toward Multi-energy Management
In this paper, we propose a system of bidirectional power routing for inter-house multi-energy management systems that utilize electricity and hydrogen as energy carriers. The key is to share private facilities such as photovoltaic panels and batteries among a group of houses along with a common hydrogen system. A power router of line switching type is introduced as a physical interface to realize the sharing economy between households. The proposed system offers a unique measure to address the urgent challenges of today's multi-energy system, namely increasing the renewables' self-consumption, enhancing the energy system's resilience, and providing traceability of hydrogen in terms of renewability certification. We also present an experimental demonstration under a simplified scenario using prototype hardware.
comment: Accepted for presentation at 2024 Annual Conference of the IEEE Industrial Electronics Society
Accelerated Distributed Stochastic Non-Convex Optimization over Time-Varying Directed Networks
Distributed stochastic non-convex optimization problems have recently received attention due to the growing interest of signal processing, computer vision, and natural language processing communities in applications deployed over distributed learning systems (e.g., federated learning). We study the setting where the data is distributed across the nodes of a time-varying directed network, a topology suitable for modeling dynamic networks experiencing communication delays and straggler effects. The network nodes, which can access only their local objectives and query a stochastic first-order oracle to obtain gradient estimates, collaborate to minimize a global objective function by exchanging messages with their neighbors. We propose an algorithm, novel to this setting, that leverages stochastic gradient descent with momentum and gradient tracking to solve distributed non-convex optimization problems over time-varying networks. To analyze the algorithm, we tackle the challenges that arise when analyzing dynamic network systems which communicate gradient acceleration components. We prove that the algorithm's oracle complexity is $\mathcal{O}(1/\epsilon^{1.5})$, and that under the Polyak-Łojasiewicz (PL) condition the algorithm converges linearly to a steady error state. The proposed scheme is tested on several learning tasks: a non-convex logistic regression experiment on the MNIST dataset, an image classification task on the CIFAR-10 dataset, and an NLP classification test on the IMDB dataset. We further present numerical simulations with an objective that satisfies the PL condition. The results demonstrate superior performance of the proposed framework compared to the existing related methods.
comment: This work has been accepted at IEEE Transactions on Automatic Control
A Systematic Review of Edge Case Detection in Automated Driving: Methods, Challenges and Future Directions
The rapid development of automated vehicles (AVs) promises to revolutionize transportation by enhancing safety and efficiency. However, ensuring their reliability in diverse real-world conditions remains a significant challenge, particularly due to rare and unexpected situations known as edge cases. Although numerous approaches exist for detecting edge cases, there is a notable lack of a comprehensive survey that systematically reviews these techniques. This paper fills this gap by presenting a practical, hierarchical review and systematic classification of edge case detection and assessment methodologies. Our classification is structured on two levels: first, categorizing detection approaches according to AV modules, including perception-related and trajectory-related edge cases; and second, based on underlying methodologies and theories guiding these techniques. We extend this taxonomy by introducing a new class called "knowledge-driven" approaches, which is largely overlooked in the literature. Additionally, we review the techniques and metrics for the evaluation of edge case detection methods and identified edge cases. To our knowledge, this is the first survey to comprehensively cover edge case detection methods across all AV subsystems, discuss knowledge-driven edge cases, and explore evaluation techniques for detection methods. This structured and multi-faceted analysis aims to facilitate targeted research and modular testing of AVs. Moreover, by identifying the strengths and weaknesses of various approaches and discussing the challenges and future directions, this survey intends to assist AV developers, researchers, and policymakers in enhancing the safety and reliability of automated driving (AD) systems through effective edge case detection.
comment: Preprint submitted to IEEE Transactions on Intelligent Transportation Systems
Opacity Enforcement by Edit Functions Under Incomparable Observations
As an information-flow privacy property, opacity characterizes whether a malicious external observer (referred to as an intruder) is able to infer the secret behavior of a system. This paper addresses the problem of opacity enforcement using edit functions in discrete event systems modeled by partially observed deterministic finite automata. A defender uses the edit function as an interface at the output of a system to manipulate actual observations through insertion, substitution, and deletion operations so that the intruder will be prevented from inferring the secret behavior of the system. Unlike existing work which usually assumes that the observation capabilities of the intruder and the defender are identical, we consider a more general setting where they may observe incomparable subsets of events generated by the system. To characterize whether the defender has the ability to enforce opacity of the system under this setting, the notion of $ic$-enforceability is introduced. Then, the opacity enforcement problem is transformed to a two-player game, with imperfect information between the system and the defender, which can be used to determine a feasible decision-making strategy for the defender. Within the game scheme, an edit mechanism is constructed to enumerate all feasible edit actions following system behavior. We further show that an $ic$-enforcing edit function (if one exists) can be synthesized from the edit mechanism to enforce opacity.
Finite Sample and Large Deviations Analysis of Stochastic Gradient Algorithm with Correlated Noise
We analyze the finite sample regret of a decreasing step size stochastic gradient algorithm. We assume correlated noise and use a perturbed Lyapunov function as a systematic approach for the analysis. Finally, we analyze the escape time of the iterates using large deviations theory.
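As a rough illustration of the setting (not the paper's analysis), the sketch below runs a decreasing step size stochastic gradient iteration driven by correlated noise. The quadratic objective, the AR(1) noise model, the step size 1/(k+1), and all names are assumptions chosen only to make the example concrete.

```python
import random

def sgd_correlated_noise(grad, x0, steps=20000, rho=0.8, sigma=1.0, seed=0):
    """Iterate x_{k+1} = x_k - a_k * (grad(x_k) + n_k) with a_k = 1 / (k + 1).

    The noise n_k is AR(1), hence correlated over time (an assumed model):
        n_{k+1} = rho * n_k + sigma * w_k,  w_k ~ N(0, 1).
    """
    rng = random.Random(seed)
    x, noise = x0, 0.0
    for k in range(steps):
        step = 1.0 / (k + 1)          # decreasing step size
        x -= step * (grad(x) + noise) # noisy gradient step
        noise = rho * noise + sigma * rng.gauss(0.0, 1.0)
    return x

# Hypothetical objective f(x) = 0.5 * (x - 3)^2, whose gradient is x - 3.
x_final = sgd_correlated_noise(lambda x: x - 3.0, x0=0.0)
print(x_final)
```

For this particular objective and step size, the iterate equals the minimizer minus the running average of the noise, so the correlation slows, but does not prevent, the approach to the minimizer at 3.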
Distributed Adaptive Consensus with Obstacle and Collision Avoidance for Networks of Heterogeneous Multi-Agent Systems
This paper presents a distributed adaptive control strategy for multi-agent systems with heterogeneous dynamics and collision avoidance. We propose an adaptive control strategy designed to ensure leader-following formation consensus while effectively managing collision and obstacle avoidance using potential functions. By integrating neural network-based disturbance estimation and adaptive tuning laws, the proposed strategy ensures consensus and stability in leader-following formations under fixed topologies.
Multi-Mode Inverters: A Unified Control Design for Grid-Forming, Grid-Following, and Beyond
We present a novel, integrated control framework designed to achieve seamless transitions among a spectrum of inverter operation modes. The operation spectrum includes grid-forming (GFM), grid-following (GFL), static synchronous compensator (STATCOM), energy storage system (ESS), and voltage source inverter (VSI). The proposed control architecture offers guarantees of stability, robustness, and performance regardless of the specific mode. The core concept involves establishing a unified algebraic structure for the feedback control system, where different modes are defined by the magnitude of closed-loop signals. As we demonstrate, this approach results in a two-dimensional continuum of operation modes and enables transition trajectories between operation modes by dynamically adjusting closed-loop variables towards corresponding setpoints. Stability, robustness, and fundamental limitation analyses are provided for the closed-loop system across any mode, as well as during transitions between modes. This design facilitates stable and enhanced on-grid integration, even during GFM operation and weak grid conditions. Ultimately, we demonstrate the key attributes of the proposed framework through simulations and experiments, showcasing its seamless transition in on-grid operation, functionality in islanded mode, and robustness to line impedance uncertainty.
comment: 16 pages, 19 figures, submitted to IEEE Transactions on Power Electronics
A Comprehensive Review: Impacts of Extreme Temperatures due to Climate Change on Power Grid Infrastructure and Operation
The power grid is experiencing a multi-fold transformation while the global climate evolves with record-breaking extreme temperatures during heat domes, polar vortexes, and severe ice. Over the decades, these extreme temperature events have increased in frequency, duration, and intensity. The power grid infrastructure is geographically spread over thousands of square miles with millions of small and large components, and the impact of extreme temperature operations on the grid infrastructure needs to be researched further. This paper reviews academic literature, standards, industry articles, and federal reports to identify the impacts of heat domes, polar vortexes, and icing on all the T&D grid equipment, including substations (assets owned and operated by the utilities and independent system operators). This paper classifies the equipment into primary and auxiliary equipment and determines its vulnerability to extreme temperatures for a deeper analysis of a more critical and vulnerable set of grid equipment. For each piece of equipment under consideration, its fundamental role in the system, the impact of extreme temperatures on its operation, available monitoring, and mitigation of these impacts are discussed. The paper develops insights on standards readiness and identifies gaps concerning extreme temperature definitions. The paper also develops summary tables to identify the critical failure modes for each type of equipment, failure influence diagrams, and cascading influence diagrams to highlight and aid in translating the equipment vulnerability information into power grid contingency definitions that need to be considered in grid planning and operations.
Optimal Interval Observers for Bounded Jacobian Nonlinear Dynamical Systems
In this chapter, we introduce two interval observer designs for discrete-time (DT) and continuous-time (CT) nonlinear systems with bounded Jacobians that are affected by bounded uncertainties. Our proposed methods utilize the concepts of mixed-monotone decomposition and embedding systems to design correct-by-construction interval framers, i.e., the interval framers inherently bound the true state of the system without needing any additional constraints. Further, our methods leverage techniques for positive/cooperative systems to guarantee global uniform ultimate boundedness of the framer error, i.e., the proposed interval observer is input-to-state stable. Specifically, our two interval observer designs minimize the $\mathcal{H}_{\infty}$ and $L_1$ gains, respectively, of the associated linear comparison system of the framer error dynamics. Moreover, our designs adopt a multiple-gain observer structure, which offers additional degrees of freedom, along with coordinate transformations that may improve the feasibility of the resulting optimization programs. We will also discuss and propose computationally tractable optimization formulations to compute the observer gains. Finally, we compare the efficacy of the proposed designs against existing DT and CT interval observers.
comment: Submitted to Springer as a book chapter
Optimal Feedback Stabilizing Control of Bounded Jacobian Discrete-Time Systems via Interval Observers
This paper addresses optimal feedback stabilizing control for bounded Jacobian nonlinear discrete-time (DT) systems with nonlinear observations, affected by state and process noise. Instead of directly stabilizing the uncertain system, we propose stabilizing a higher-dimensional interval observer whose states enclose the true system states. Our nonlinear control approach introduces additional flexibility compared to linear methods, compensating for system nonlinearities and allowing potentially tighter closed-loop intervals. We also establish a separation principle, enabling independent design of observer and control gains, and derive tractable linear matrix inequalities, resulting in a stable closed-loop system.
comment: Submitted to ACC'25
Towards Input-Convex Neural Network Modeling for Battery Optimization in Power Systems
Battery energy storage systems (BESS) play an increasingly vital role in integrating renewable generation into power grids due to their ability to dynamically balance supply. Grid-tied batteries typically employ power converters, where part-load efficiencies vary non-linearly. While this non-linearity can be modeled with high accuracy, it poses challenges for optimization, particularly in ensuring computational tractability. In this paper, we consider a non-linear BESS formulation based on the Energy Reservoir Model (ERM). A data-driven approach is introduced with the input-convex neural network (ICNN) to approximate the nonlinear efficiency with a convex function. The epigraph of the convex function is used to engender a convex program for battery ERM optimization. This relaxed ICNN method is applied to two battery optimization use-cases: PV smoothing and revenue maximization, and it is compared with three other ERM formulations (nonlinear, linear, and mixed-integer). Specifically, ICNN-based methods appear to be promising for future battery optimization with desirable feasibility and optimality outcomes across both use-cases.
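To make the convexity property concrete, here is a minimal sketch of the generic input-convex network construction: weights acting on hidden activations are kept non-negative and the activation is convex and non-decreasing. The layer sizes, the random untrained weights, and the name `loss_model` are illustrative assumptions, not the architecture or data used in the paper.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def icnn_forward(x, layers, out):
    """Evaluate a small input-convex network at input vector x.

    The output is convex in x because every weight applied to a hidden
    activation is clamped to be non-negative and ReLU is convex and
    non-decreasing; weights applied directly to x are unconstrained.
    """
    first = layers[0]
    z = relu([a + b for a, b in zip(matvec(first["A"], x), first["b"])])
    for layer in layers[1:]:
        Wz = [[max(0.0, w) for w in row] for row in layer["Wz"]]
        pre = [p + q + b for p, q, b in
               zip(matvec(Wz, z), matvec(layer["A"], x), layer["b"])]
        z = relu(pre)
    w_out = [max(0.0, w) for w in out["wz"]]
    return (sum(w * zi for w, zi in zip(w_out, z))
            + sum(a * xi for a, xi in zip(out["a"], x)) + out["b"])

def random_params(n_in=1, width=4, seed=0):
    """Arbitrary untrained weights, for illustration only."""
    rng = random.Random(seed)
    rand = lambda r, c: [[rng.uniform(-1, 1) for _ in range(c)] for _ in range(r)]
    layers = [
        {"A": rand(width, n_in), "b": rand(1, width)[0]},
        {"Wz": rand(width, width), "A": rand(width, n_in), "b": rand(1, width)[0]},
    ]
    out = {"wz": rand(1, width)[0], "a": rand(1, n_in)[0], "b": 0.0}
    return layers, out

LAYERS, OUT = random_params()

def loss_model(p):
    """Hypothetical scalar map from battery power p to a convex loss value."""
    return icnn_forward([p], LAYERS, OUT)
```

Because the function is convex by construction, the constraint "auxiliary variable at least `loss_model(p)`" describes its epigraph, which is a convex set; this is what allows a relaxed formulation to remain a convex program.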
Convolutional Neural Network Design and Evaluation for Real-Time Multivariate Time Series Fault Detection in Spacecraft Attitude Sensors
Traditional anomaly detection techniques onboard satellites are based on reliable, yet limited, thresholding mechanisms which are designed to monitor univariate signals and trigger recovery actions according to specific European Cooperation for Space Standardization (ECSS) standards. However, Artificial Intelligence-based Fault Detection, Isolation and Recovery (FDIR) solutions have recently emerged with the prospect of overcoming the limitations of these standard methods, expanding the range of detectable failures and improving response times. This paper presents a novel approach to detecting stuck values within the Accelerometer and Inertial Measurement Unit of a drone-like spacecraft for the exploration of Small Solar System Bodies (SSSB), leveraging a multi-channel Convolutional Neural Network (CNN) to perform multi-target classification and independently detect faults in the sensors. Significant attention has been dedicated to ensuring the compatibility of the algorithm within the onboard FDIR system, representing a step forward to the in-orbit validation of a technology that remains experimental until its robustness is thoroughly proven. An integration methodology is proposed to enable the network to effectively detect anomalies and trigger recovery actions at the system level. The detection performance and the capability of the algorithm in reaction triggering are evaluated employing a set of custom-defined detection and system metrics, showing the outstanding performance of the algorithm in performing its FDIR task.
comment: submitted to Advances in Space Research
An Ontology-based Approach Towards Traceable Behavior Specifications in Automated Driving
Vehicles in public traffic that are equipped with Automated Driving Systems are subject to a number of expectations: Among other aspects, their behavior should be safe, conform to the rules of the road, and provide mobility to their users. This poses challenges for the developers of such systems: Developers are responsible for specifying this behavior, for example, in terms of requirements at system design time. As we will discuss in the article, this specification always involves the need for assumptions and trade-offs. As a result, insufficiencies in such a behavior specification can occur that can potentially lead to unsafe system behavior. In order to support the identification of specification insufficiencies, requirements and respective assumptions need to be made explicit. In this article, we propose the Semantic Norm Behavior Analysis as an ontology-based approach to specify the behavior for an Automated Driving System equipped vehicle. We use ontologies to formally represent specified behavior for a targeted operational environment, and to establish traceability between specified behavior and the addressed stakeholder needs. Furthermore, we illustrate the application of the Semantic Norm Behavior Analysis in a German legal context with two example scenarios and evaluate our results. Our evaluation shows that the explicit documentation of assumptions in the behavior specification supports both the identification of specification insufficiencies and their treatment. Therefore, this article provides requirements, terminology and a corresponding methodology to facilitate ontology-based behavior specifications in automated driving.
comment: 24 pages, 12 figures, submitted for publication
Social Zone as a Barrier Function for Socially-Compliant Robot Navigation
This study addresses the challenge of integrating social norms into robot navigation, which is essential for ensuring that robots operate safely and efficiently in human-centric environments. Social norms, often unspoken and implicitly understood among people, are difficult to explicitly define and implement in robotic systems. To overcome this, we derive these norms from real human trajectory data, utilizing the comprehensive ATC dataset to identify the minimum social zones humans and robots must respect. These zones are integrated into the robot's navigation system by applying barrier functions, ensuring the robot consistently remains within the designated safety set. Simulation results demonstrate that our system effectively mimics human-like navigation strategies, such as passing on the right side and adjusting speed or pausing in constrained spaces. The proposed framework is versatile, easily comprehensible, and tunable, demonstrating the potential to advance the development of robots designed to navigate effectively in human-centric environments.
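The barrier-function idea can be sketched for a deliberately simple case. Everything below is an assumption made for illustration: a single-integrator robot, one stationary person, a circular social zone with a made-up radius (not a value derived from the ATC dataset), and a closed-form filter for a single constraint rather than the paper's controller.

```python
import math

def go_to_goal(x, goal, v_max=1.0):
    """Nominal controller: head straight for the goal at speed up to v_max."""
    dx, dy = goal[0] - x[0], goal[1] - x[1]
    dist = math.hypot(dx, dy)
    if dist < 1e-9:
        return (0.0, 0.0)
    s = min(v_max, dist) / dist
    return (s * dx, s * dy)

def cbf_filter(x, u_nom, person, radius, alpha=1.0):
    """Minimally change u_nom so that dh/dt >= -alpha * h for x' = u,
    where h(x) = ||x - person||^2 - radius^2 (h >= 0 means outside the zone)."""
    dx = (x[0] - person[0], x[1] - person[1])
    h = dx[0] ** 2 + dx[1] ** 2 - radius ** 2
    a = (2.0 * dx[0], 2.0 * dx[1])                 # gradient of h
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] + alpha * h
    if slack >= 0.0:
        return u_nom                               # nominal input is already safe
    scale = -slack / (a[0] ** 2 + a[1] ** 2)       # project onto the constraint
    return (u_nom[0] + scale * a[0], u_nom[1] + scale * a[1])

def simulate(start=(-3.0, 0.1), goal=(3.0, 0.0), person=(0.0, 0.0),
             radius=1.0, dt=0.05, steps=400):
    """Euler simulation; returns the final position and the smallest h seen."""
    x, h_values = start, []
    for _ in range(steps):
        u = cbf_filter(x, go_to_goal(x, goal), person, radius)
        x = (x[0] + dt * u[0], x[1] + dt * u[1])
        h_values.append((x[0] - person[0]) ** 2 + (x[1] - person[1]) ** 2 - radius ** 2)
    return x, min(h_values)

final_position, min_h = simulate()
print(final_position, min_h)
```

With this particular h and Euler stepping, choosing `alpha * dt <= 1` keeps h non-negative at every step, so the simulated robot skirts the zone instead of cutting through it.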
Two-Stage Robust Planning Model for Park-Level Integrated Energy System Considering Uncertain Equipment Contingency
To enhance the reliability of Integrated Energy Systems (IESs) and address the research gap in reliability-based planning methods, this paper proposes a two-stage robust planning model specifically for park-level IESs. The proposed planning model considers uncertainties like load demand fluctuations and equipment contingencies, and provides a reliable scheme of equipment selection and sizing for IES investors. Inspired by the unit commitment problem, we formulate an equipment contingency uncertainty set to accurately describe the potential equipment contingencies which happen and can be repaired within a day. Then, a modified nested column-and-constraint generation algorithm is applied to solve this two-stage robust planning model with integer recourse efficiently. In the case study, the role of energy storage system for IES reliability enhancement is analyzed in detail. Computational results demonstrate the advantage of the proposed model over other planning models in terms of improving reliability.
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment EMNLP 2024
Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax": a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness). However, existing alignment techniques are mostly unidirectional, leading to suboptimal trade-offs and poor flexibility over various objectives. To navigate this challenge, we argue for the importance of grounding LLMs with evident preferences. We introduce controllable preference optimization (CPO), which explicitly specifies preference scores for different objectives, thereby guiding the model to generate responses that meet the requirements. Our experimental analysis reveals that the aligned models can provide responses that match various preferences among the "3H" (helpfulness, honesty, harmlessness) desiderata. Furthermore, by introducing diverse data and alignment goals, we surpass baseline methods in aligning with single objectives, hence mitigating the impact of the alignment tax and achieving improvements in multi-objective alignment.
comment: EMNLP 2024 main conference
Distributed Online Feedback Optimization for Real-time Distribution System Voltage Regulation
We investigate the real-time voltage regulation problem in distribution systems employing online feedback optimization (OFO) with short-range communication between physical neighbours. OFO does not need an accurate grid model nor estimated consumption of non-controllable loads, affords fast calculations, and demonstrates robustness to uncertainties and disturbances, which render it particularly suitable for real-time distribution system applications. However, many OFO controllers require centralized communication, making them susceptible to single-point failures. This paper proposes a distributed OFO design based on a nested feedback optimization strategy and analyzes its convergence. Numerical study results demonstrate that the proposed design achieves effective voltage regulation and outperforms other distributed and local approaches.
Rapid nonlinear convex guidance using a monomial method
This paper addresses the challenge of accommodating nonlinear dynamics and constraints in rapid trajectory optimization, envisioned for use in the context of onboard guidance. We present a novel framework that uniquely employs overparameterized monomial coordinates and pre-computed fundamental solution expansions to facilitate rapid optimization while minimizing real-time computational requirements. The fundamental solution expansions are pre-computed using differential algebra. Unlike traditional approaches that repeatedly evaluate the nonlinear dynamics and constraints as part of complex shooting or collocation-based schemes, this method replaces the nonlinearity inherent to dynamics and constraint functions entirely with a computationally simpler manifold constraint. With this approach, trajectory optimization is posed efficiently as a path planning problem on the manifold. This problem is entirely convex except for the manifold constraint, readily lending itself to solution via sequential convex programming. We demonstrate the effectiveness of our approach in computing fast and accurate delta-V optimal solutions for long-range spacecraft rendezvous, including problems with nonlinear state constraints.
comment: 38 pages, 16 figures
Active Inverse Learning in Stackelberg Trajectory Games
Game-theoretic inverse learning is the problem of inferring a player's objectives from their actions. We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system. We propose an active inverse learning method for the leader to infer which hypothesis among a finite set of candidates best describes the follower's objective function. Instead of using passively observed trajectories like existing methods, we actively maximize the differences in the follower's trajectories under different hypotheses by optimizing the leader's control inputs. Compared with uniformly random inputs, the optimized inputs accelerate the convergence of the estimated probability of different hypotheses conditioned on the follower's trajectory. We demonstrate the proposed method in a receding-horizon repeated trajectory game and simulate the results using virtual TurtleBots in Gazebo.
comment: 8 pages, 3 figures. Updated previous version to acknowledge funding
Systems and Control (EESS)
Towards a Health-Based Power Grid Optimization in the Artificial Intelligence Era
The electric power sector is one of the largest contributors to greenhouse gas emissions in the world. In recent years, there has been an unprecedented increase in electricity demand driven by the so-called Artificial Intelligence (AI) revolution. Although AI has had and will continue to have a transformative impact, its environmental and health impacts are often overlooked. The standard approach to power grid optimization aims to minimize CO$_2$ emissions. In this paper, we propose a new holistic paradigm. Our proposed optimization directly targets the minimization of adverse health outcomes under energy efficiency and emission constraints. We show the first example of an optimal fuel mix allocation problem aiming to minimize the average number of adverse health effects resulting from exposure to hazardous air pollutants with constraints on the average and marginal emissions. We argue that this new health-based power grid optimization is essential to promote truly sustainable technological advances that align both with global climate goals and public health priorities.
comment: 5 pages, 1 figure
Transformer Temperature Management and Voltage Control in Electric Distribution Systems with High Solar PV Penetration
The increasing penetration of photovoltaic (PV) systems in distribution grids can lead to overvoltage and transformer overloading issues. While voltage regulation has been extensively studied and some research has addressed transformer temperature control, there is limited work on simultaneously managing both challenges. This paper addresses this gap by proposing an optimization-based strategy that efficiently manages voltage regulation and transformer temperature while minimizing the curtailment of PV generation. In order to make this problem convex, a relaxation is applied to the transformer temperature dynamics constraint. We also provide analysis to determine under which conditions this relaxation remains tight. The proposed approach is validated through simulations, demonstrating its effectiveness in achieving the desired control objectives.
Hybrid LLM-DDQN based Joint Optimization of V2I Communication and Autonomous Driving
Large language models (LLMs) have received considerable interest recently due to their outstanding reasoning and comprehension capabilities. This work explores applying LLMs to vehicular networks, aiming to jointly optimize vehicle-to-infrastructure (V2I) communications and autonomous driving (AD) policies. We deploy LLMs for AD decision-making to maximize traffic flow and avoid collisions for road safety, and a double deep Q-learning algorithm (DDQN) is used for V2I optimization to maximize the received data rate and reduce frequent handovers. In particular, for LLM-enabled AD, we employ the Euclidean distance to identify previously explored AD experiences, and then LLMs can learn from past good and bad decisions for further improvement. Then, LLM-based AD decisions will become part of states in V2I problems, and DDQN will optimize the V2I decisions accordingly. After that, the AD and V2I decisions are iteratively optimized until convergence. Such an iterative optimization approach can better explore the interactions between LLMs and conventional reinforcement learning techniques, revealing the potential of using LLMs for network optimization and management. Finally, the simulations demonstrate that our proposed hybrid LLM-DDQN approach outperforms the conventional DDQN algorithm, showing faster convergence and higher average rewards.
comment: Submission for possible publication
Robust Variable-Horizon MPC with Adaptive Terminal Constraints
This paper presents a novel robust variable-horizon model predictive control scheme designed to intercept a target moving along a known trajectory, in finite time. Linear discrete-time systems affected by bounded process disturbances are considered and a tube-based MPC approach is adopted. The main contribution is an adaptive mechanism for choosing the terminal constraint set sequence in the MPC optimization problem. This mechanism is designed to ensure recursive feasibility while promoting minimization of the final distance to the target. Finite-time convergence of the proposed control scheme is proven. In order to evaluate its effectiveness, the designed control law is tested through numerical simulations, including a case study involving orbital rendezvous of a satellite with a tumbling object. The results indicate a significant reduction in conservatism compared to existing state-of-the-art methods using a fixed terminal set sequence.
Privacy-Preserving Optimal State Estimation with Low Complexity via Cramér-Rao Lower Bound Approach
This paper addresses the optimal state estimation problem for dynamic systems while preserving private information against an adversary. To dominate the adversary's estimation accuracy about private information in the mean square error (MSE) sense, the Cramér-Rao lower bound (CRLB) is employed to evaluate privacy level. The problem is formulated as a constrained optimization, which minimizes the MSE of the state estimate with a constraint on privacy level, achieving a trade-off between privacy and utility. To solve the constrained optimization problem, an explicit expression for CRLB is first provided using the information inequality. To overcome the increasing sizes of the involved matrices over time, a low-complexity approach is then proposed to achieve online calculation for CRLB, significantly reducing computational complexity. Next, the optimization problem is relaxed to a semi-definite programming problem, and a relaxed solution is provided. Finally, a privacy-preserving state estimation algorithm with low complexity is developed and proved to achieve differential privacy. Two illustrative examples, including a practical case of building occupancy, demonstrate the effectiveness of the proposed algorithm.
MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation
Autonomous driving necessitates advanced object detection techniques that integrate information from multiple modalities to overcome the limitations associated with single-modal approaches. The challenges of aligning diverse data in early fusion and the complexities, along with overfitting issues introduced by deep fusion, underscore the efficacy of late fusion at the decision level. Late fusion ensures seamless integration without altering the original detector's network structure. This paper introduces a pioneering Multi-modal Multi-class Late Fusion method, designed for late fusion to enable multi-class detection. Fusion experiments conducted on the KITTI validation and official test datasets illustrate substantial performance improvements, presenting our model as a versatile solution for multi-modal object detection in autonomous driving. Moreover, our approach incorporates uncertainty analysis into the classification fusion process, rendering our model more transparent and trustworthy and providing more reliable insights into category predictions.
Data-driven Feedback Control of Lattice Structures with Localized Actuation and Sensing
Assembling lattices from discrete building blocks enables the composition of large, heterogeneous, and easily reconfigurable objects with desirable mass-to-stiffness ratios. This type of building system may also be referred to as a digital material, as it is constituted from discrete, error-correcting components. Researchers have demonstrated various active structures and even robotic systems that take advantage of the reconfigurable, mass-efficient properties of discrete lattice structures. However, the existing literature has predominantly used open-loop control strategies, limiting the performance of the presented systems. In this paper, we present a novel approach to feedback control of digital lattice structures, leveraging real-time measurements of the system dynamics. We introduce an actuated voxel which constitutes a novel means for actuation of lattice structures. Our control method is based on the Extended Dynamical Mode Decomposition algorithm in conjunction with the Linear Quadratic Regulator and the Koopman Model Predictive Control. The key advantage of our approach lies in its purely data-driven nature, without the need for any prior knowledge of a system's structure. We illustrate the developed method via real experiments with a custom-built flexible lattice beam, showing its ability to accomplish various tasks even with minimal sensing and actuation resources. In particular, we address two problems: stabilization together with disturbance attenuation, and reference tracking.
Achieving Multi-UAV Best Viewpoint Coordination in Obstructed Environments
Wildfire suppression is a complex task that poses high risks to humans. Using robotic teams for wildfire suppression enhances the safety and efficiency of detecting, monitoring, and extinguishing fires. We propose a control architecture based on task hierarchical control for the autonomous steering of a system of flying robots in wildfire suppression. We incorporate a novel line-of-sight obstacle avoidance method that calculates the best viewpoints and ensures an occlusion-free view for the suppression robot during the mission. Path integral control generates optimal trajectories towards the goals. We conduct an ablation study to assess the effectiveness of our approach by comparing it to scenarios where these key components are excluded, in order to validate the approach in simulations using MATLAB and Unity. The results demonstrate significant performance improvements, with a 44.0% increase in effectiveness with the new line-of-sight obstacle avoidance task and up to a 39.6% improvement when using path integral control.
comment: 6 pages, 5 figures, submitted to joint ACC and L-CSS
Energy-Cautious Designation of Kinematic Parameters for a Sustainable Parallel-Serial Heavy-Duty Manipulator Driven by Electromechanical Linear Actuator
Electrification, a key strategy in combating climate change, is transforming industries, and off-highway machines (OHM) will be next to transition from combustion engines and hydraulic actuation to sustainable fully electrified machines. Electromechanical linear actuators (EMLAs) offer superior efficiency, safety, and reduced maintenance, and they unlock vast potential for high-performance autonomous operations. However, a key challenge lies in optimizing the kinematic parameters of OHMs' on-board manipulators for EMLA integration to exploit the full capabilities of actuation systems and maximize their performance. This work addresses this challenge by delving into the structural optimization of a prevalent closed kinematic chain configuration commonly employed in OHM manipulators. Our approach aims to retain the manipulator's existing capabilities while reducing its energy expenditure, paving the way for a greener future in industrial automation, one in which sustainable and high-performing robotized OHMs can evolve. The feasibility of our methodology is validated through simulation results obtained on a commercially available parallel-serial heavy-duty manipulator mounted on a battery electric vehicle. The results demonstrate the efficacy of our approach in modifying kinematic parameters to facilitate the replacement of conventional hydraulic actuators with EMLAs, all while minimizing the overall energy consumption of the system.
comment: This work is accepted for presentation at IEEE VTC 2024-Washington USA
A System of Bidirectional Power Routing Toward Multi-energy Management
In this paper, we propose a system of bidirectional power routing for inter-house multi-energy management systems that utilize electricity and hydrogen as energy carriers. The key is to share private facilities such as photovoltaic panels and batteries among a group of houses along with a common hydrogen system. A power router of line switching type is introduced as a physical interface to realize the sharing economy between households. The proposed system offers a unique measure to address the urgent challenges of today's multi-energy system, namely increasing the renewables' self-consumption, enhancing the energy system's resilience, and providing traceability of hydrogen in terms of renewability certification. We also present an experimental demonstration under a simplified scenario using prototype hardware.
comment: Accepted for presentation at 2024 Annual Conference of the IEEE Industrial Electronics Society
Accelerated Distributed Stochastic Non-Convex Optimization over Time-Varying Directed Networks
Distributed stochastic non-convex optimization problems have recently received attention due to the growing interest of signal processing, computer vision, and natural language processing communities in applications deployed over distributed learning systems (e.g., federated learning). We study the setting where the data is distributed across the nodes of a time-varying directed network, a topology suitable for modeling dynamic networks experiencing communication delays and straggler effects. The network nodes, which can access only their local objectives and query a stochastic first-order oracle to obtain gradient estimates, collaborate to minimize a global objective function by exchanging messages with their neighbors. We propose an algorithm, novel to this setting, that leverages stochastic gradient descent with momentum and gradient tracking to solve distributed non-convex optimization problems over time-varying networks. To analyze the algorithm, we tackle the challenges that arise when analyzing dynamic network systems which communicate gradient acceleration components. We prove that the algorithm's oracle complexity is $\mathcal{O}(1/\epsilon^{1.5})$, and that under the Polyak-Łojasiewicz (PL) condition the algorithm converges linearly to a steady error state. The proposed scheme is tested on several learning tasks: a non-convex logistic regression experiment on the MNIST dataset, an image classification task on the CIFAR-10 dataset, and an NLP classification test on the IMDB dataset. We further present numerical simulations with an objective that satisfies the PL condition. The results demonstrate superior performance of the proposed framework compared to the existing related methods.
comment: This work has been accepted at IEEE Transactions on Automatic Control
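To make the combination of momentum and gradient tracking concrete, here is a minimal sketch under strong simplifying assumptions: a fixed, undirected ring with a doubly stochastic mixing matrix `W` (not the time-varying directed setting of the paper), scalar quadratic local objectives, and heavy-ball momentum. The function name, step sizes, and test problem are illustrative choices, not the authors' algorithm.

```python
import numpy as np

def gradient_tracking_momentum(b, W, alpha=0.1, beta=0.3, steps=500,
                               noise_std=0.0, seed=0):
    """Gradient tracking with heavy-ball momentum (illustrative sketch).

    Node i holds the local objective f_i(x) = 0.5 * (x - b[i])**2, so the
    global minimizer is mean(b). W is assumed doubly stochastic.
    """
    rng = np.random.default_rng(seed)
    b = np.asarray(b, dtype=float)
    n = len(b)

    def grad(x):
        # Stochastic first-order oracle: exact gradient plus Gaussian noise.
        return (x - b) + noise_std * rng.standard_normal(n)

    x_prev = np.zeros(n)
    x = np.zeros(n)
    g = grad(x)
    y = g.copy()  # tracker of the network-average gradient
    for _ in range(steps):
        # Mix with neighbors, step along the tracked gradient, add momentum.
        x_next = W @ x - alpha * y + beta * (x - x_prev)
        g_next = grad(x_next)
        # Tracking update keeps mean(y) equal to the mean of local gradients.
        y = W @ y + g_next - g
        x_prev, x, g = x, x_next, g_next
    return x

# Four nodes on a ring, each averaging with its two neighbors.
W_ring = np.array([[0.50, 0.25, 0.00, 0.25],
                   [0.25, 0.50, 0.25, 0.00],
                   [0.00, 0.25, 0.50, 0.25],
                   [0.25, 0.00, 0.25, 0.50]])
x_final = gradient_tracking_momentum([1.0, 2.0, 3.0, 6.0], W_ring)
```

With `noise_std=0.0` every node should agree on the mean of `b`. With noisy gradients the iterates would only be expected to settle near it, which loosely mirrors the "steady error state" wording in the abstract.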
A Systematic Review of Edge Case Detection in Automated Driving: Methods, Challenges and Future Directions
The rapid development of automated vehicles (AVs) promises to revolutionize transportation by enhancing safety and efficiency. However, ensuring their reliability in diverse real-world conditions remains a significant challenge, particularly due to rare and unexpected situations known as edge cases. Although numerous approaches exist for detecting edge cases, there is a notable lack of a comprehensive survey that systematically reviews these techniques. This paper fills this gap by presenting a practical, hierarchical review and systematic classification of edge case detection and assessment methodologies. Our classification is structured on two levels: first, categorizing detection approaches according to AV modules, including perception-related and trajectory-related edge cases; and second, based on underlying methodologies and theories guiding these techniques. We extend this taxonomy by introducing a new class called "knowledge-driven" approaches, which is largely overlooked in the literature. Additionally, we review the techniques and metrics for the evaluation of edge case detection methods and identified edge cases. To our knowledge, this is the first survey to comprehensively cover edge case detection methods across all AV subsystems, discuss knowledge-driven edge cases, and explore evaluation techniques for detection methods. This structured and multi-faceted analysis aims to facilitate targeted research and modular testing of AVs. Moreover, by identifying the strengths and weaknesses of various approaches and discussing the challenges and future directions, this survey intends to assist AV developers, researchers, and policymakers in enhancing the safety and reliability of automated driving (AD) systems through effective edge case detection.
comment: Preprint submitted to IEEE Transactions on Intelligent Transportation Systems
Opacity Enforcement by Edit Functions Under Incomparable Observations
As an information-flow privacy property, opacity characterizes whether a malicious external observer (referred to as an intruder) is able to infer the secret behavior of a system. This paper addresses the problem of opacity enforcement using edit functions in discrete event systems modeled by partially observed deterministic finite automata. A defender uses the edit function as an interface at the output of a system to manipulate actual observations through insertion, substitution, and deletion operations so that the intruder will be prevented from inferring the secret behavior of the system. Unlike existing work which usually assumes that the observation capabilities of the intruder and the defender are identical, we consider a more general setting where they may observe incomparable subsets of events generated by the system. To characterize whether the defender has the ability to enforce opacity of the system under this setting, the notion of \emph{$ic$-enforceability} is introduced. Then, the opacity enforcement problem is transformed to a two-player game, with imperfect information between the system and the defender, which can be used to determine a feasible decision-making strategy for the defender. Within the game scheme, an edit mechanism is constructed to enumerate all feasible edit actions following system behavior. We further show that an $ic$-enforcing edit function (if one exists) can be synthesized from the edit mechanism to enforce opacity.
Finite Sample and Large Deviations Analysis of Stochastic Gradient Algorithm with Correlated Noise
We analyze the finite sample regret of a decreasing step size stochastic gradient algorithm. We assume correlated noise and use a perturbed Lyapunov function as a systematic approach for the analysis. Finally, we analyze the escape time of the iterates using large deviations theory.
Distributed Adaptive Consensus with Obstacle and Collision Avoidance for Networks of Heterogeneous Multi-Agent Systems
This paper presents a distributed adaptive control strategy for multi-agent systems with heterogeneous dynamics and collision avoidance. We propose an adaptive control strategy designed to ensure leader-following formation consensus while effectively managing collision and obstacle avoidance using potential functions. By integrating neural network-based disturbance estimation and adaptive tuning laws, the proposed strategy ensures consensus and stability in leader-following formations under fixed topologies.
Multi-Mode Inverters: A Unified Control Design for Grid-Forming, Grid-Following, and Beyond
We present a novel, integrated control framework designed to achieve seamless transitions among a spectrum of inverter operation modes. The operation spectrum includes grid-forming (GFM), grid-following (GFL), static synchronous compensator (STATCOM), energy storage system (ESS), and voltage source inverter (VSI). The proposed control architecture offers guarantees of stability, robustness, and performance regardless of the specific mode. The core concept involves establishing a unified algebraic structure for the feedback control system, where different modes are defined by the magnitude of closed-loop signals. As we demonstrate, this approach results in a two-dimensional continuum of operation modes and enables transition trajectories between operation modes by dynamically adjusting closed-loop variables towards corresponding setpoints. Stability, robustness, and fundamental limitation analyses are provided for the closed-loop system across any mode, as well as during transitions between modes. This design facilitates stable and enhanced on-grid integration, even during GFM operation and weak grid conditions. Ultimately, we demonstrate the key attributes of the proposed framework through simulations and experiments, showcasing its seamless transition in on-grid operation, functionality in islanded mode, and robustness to line impedance uncertainty.
comment: 16 pages, 19 figures, submitted to IEEE Transactions on Power Electronics
A Comprehensive Review: Impacts of Extreme Temperatures due to Climate Change on Power Grid Infrastructure and Operation
The power grid is experiencing a multi-fold transformation while the global climate evolves with record-breaking extreme temperatures during heat domes, polar vortexes, and severe ice. Over the decades, these extreme temperature events have increased in frequency, duration, and intensity. The power grid infrastructure is geographically spread over thousands of square miles with millions of small and large components, and the impact of extreme temperature operations on the grid infrastructure needs to be researched further. This paper reviews academic literature, standards, industry articles, and federal reports to identify the impacts of heat domes, polar vortexes, and icing on all the T&D grid equipment, including substations (assets owned and operated by the utilities and independent system operators). This paper classifies the equipment into primary and auxiliary equipment and determines its vulnerability to extreme temperatures for a deeper analysis of a more critical and vulnerable set of grid equipment. For each piece of equipment under consideration, its fundamental role in the system, the impact of extreme temperatures on its operation, available monitoring, and mitigation of these impacts are discussed. The paper develops insights on standards readiness and identifies gaps concerning extreme temperature definitions. The paper also develops summary tables to identify the critical failure modes for each type of equipment, failure influence diagrams, and cascading influence diagrams to highlight and aid in translating the equipment vulnerability information into power grid contingency definitions that need to be considered in grid planning and operations.
Optimal Interval Observers for Bounded Jacobian Nonlinear Dynamical Systems
In this chapter, we introduce two interval observer designs for discrete-time (DT) and continuous-time (CT) nonlinear systems with bounded Jacobians that are affected by bounded uncertainties. Our proposed methods utilize the concepts of mixed-monotone decomposition and embedding systems to design correct-by-construction interval framers, i.e., the interval framers inherently bound the true state of the system without needing any additional constraints. Further, our methods leverage techniques for positive/cooperative systems to guarantee global uniform ultimate boundedness of the framer error, i.e., the proposed interval observer is input-to-state stable. Specifically, our two interval observer designs minimize the $\mathcal{H}_{\infty}$ and $L_1$ gains, respectively, of the associated linear comparison system of the framer error dynamics. Moreover, our designs adopt a multiple-gain observer structure, which offers additional degrees of freedom, along with coordinate transformations that may improve the feasibility of the resulting optimization programs. We will also discuss and propose computationally tractable optimization formulations to compute the observer gains. Finally, we compare the efficacy of the proposed designs against existing DT and CT interval observers.
comment: Submitted to Springer as a book chapter
Optimal Feedback Stabilizing Control of Bounded Jacobian Discrete-Time Systems via Interval Observers
This paper addresses optimal feedback stabilizing control for bounded Jacobian nonlinear discrete-time (DT) systems with nonlinear observations, affected by state and process noise. Instead of directly stabilizing the uncertain system, we propose stabilizing a higher-dimensional interval observer whose states enclose the true system states. Our nonlinear control approach introduces additional flexibility compared to linear methods, compensating for system nonlinearities and allowing potentially tighter closed-loop intervals. We also establish a separation principle, enabling independent design of observer and control gains, and derive tractable linear matrix inequalities, resulting in a stable closed-loop system.
comment: Submitted to ACC'25
Towards Input-Convex Neural Network Modeling for Battery Optimization in Power Systems
Battery energy storage systems (BESS) play an increasingly vital role in integrating renewable generation into power grids due to their ability to dynamically balance supply. Grid-tied batteries typically employ power converters, where part-load efficiencies vary non-linearly. While this non-linearity can be modeled with high accuracy, it poses challenges for optimization, particularly in ensuring computational tractability. In this paper, we consider a non-linear BESS formulation based on the Energy Reservoir Model (ERM). A data-driven approach is introduced with the input-convex neural network (ICNN) to approximate the nonlinear efficiency with a convex function. The epigraph of the convex function is used to engender a convex program for battery ERM optimization. This relaxed ICNN method is applied to two battery optimization use-cases: PV smoothing and revenue maximization, and it is compared with three other ERM formulations (nonlinear, linear, and mixed-integer). Specifically, ICNN-based methods appear to be promising for future battery optimization with desirable feasibility and optimality outcomes across both use-cases.
Convolutional Neural Network Design and Evaluation for Real-Time Multivariate Time Series Fault Detection in Spacecraft Attitude Sensors
Traditional anomaly detection techniques onboard satellites are based on reliable, yet limited, thresholding mechanisms which are designed to monitor univariate signals and trigger recovery actions according to specific European Cooperation for Space Standardization (ECSS) standards. However, Artificial Intelligence-based Fault Detection, Isolation and Recovery (FDIR) solutions have recently emerged with the prospect of overcoming the limitations of these standard methods, expanding the range of detectable failures and improving response times. This paper presents a novel approach to detecting stuck values within the Accelerometer and Inertial Measurement Unit of a drone-like spacecraft for the exploration of Small Solar System Bodies (SSSB), leveraging a multi-channel Convolutional Neural Network (CNN) to perform multi-target classification and independently detect faults in the sensors. Significant attention has been dedicated to ensuring the compatibility of the algorithm within the onboard FDIR system, representing a step forward to the in-orbit validation of a technology that remains experimental until its robustness is thoroughly proven. An integration methodology is proposed to enable the network to effectively detect anomalies and trigger recovery actions at the system level. The detection performance and the capability of the algorithm in reaction triggering are evaluated employing a set of custom-defined detection and system metrics, showing the outstanding performance of the algorithm in performing its FDIR task.
comment: submitted to Advances in Space Research
An Ontology-based Approach Towards Traceable Behavior Specifications in Automated Driving
Vehicles in public traffic that are equipped with Automated Driving Systems are subject to a number of expectations: Among other aspects, their behavior should be safe, conform to the rules of the road, and provide mobility to their users. This poses challenges for the developers of such systems: Developers are responsible for specifying this behavior, for example, in terms of requirements at system design time. As we will discuss in the article, this specification always involves the need for assumptions and trade-offs. As a result, insufficiencies in such a behavior specification can occur that can potentially lead to unsafe system behavior. In order to support the identification of specification insufficiencies, requirements and respective assumptions need to be made explicit. In this article, we propose the Semantic Norm Behavior Analysis as an ontology-based approach to specify the behavior for an Automated Driving System equipped vehicle. We use ontologies to formally represent specified behavior for a targeted operational environment, and to establish traceability between specified behavior and the addressed stakeholder needs. Furthermore, we illustrate the application of the Semantic Norm Behavior Analysis in a German legal context with two example scenarios and evaluate our results. Our evaluation shows that the explicit documentation of assumptions in the behavior specification supports both the identification of specification insufficiencies and their treatment. Therefore, this article provides requirements, terminology and a corresponding methodology to facilitate ontology-based behavior specifications in automated driving.
comment: 24 pages, 12 figures, submitted for publication
Social Zone as a Barrier Function for Socially-Compliant Robot Navigation
This study addresses the challenge of integrating social norms into robot navigation, which is essential for ensuring that robots operate safely and efficiently in human-centric environments. Social norms, often unspoken and implicitly understood among people, are difficult to explicitly define and implement in robotic systems. To overcome this, we derive these norms from real human trajectory data, utilizing the comprehensive ATC dataset to identify the minimum social zones humans and robots must respect. These zones are integrated into the robot's navigation system by applying barrier functions, ensuring the robot consistently remains within the designated safety set. Simulation results demonstrate that our system effectively mimics human-like navigation strategies, such as passing on the right side and adjusting speed or pausing in constrained spaces. The proposed framework is versatile, easily comprehensible, and tunable, demonstrating the potential to advance the development of robots designed to navigate effectively in human-centric environments.
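As an illustration of how a barrier function can keep a robot outside a social zone, the sketch below filters a nominal velocity command for a single-integrator robot. It assumes a circular zone of radius `r` around a person at `p`, whereas the paper derives its zones from the ATC trajectory data. The function name and the gain `gamma` are hypothetical.

```python
import numpy as np

def cbf_filter(x, p, r, u_nom, gamma=1.0):
    """Minimally modify u_nom so that h(x) = ||x - p||^2 - r^2 stays non-negative.

    For single-integrator dynamics x_dot = u, the barrier condition is
    grad_h(x) . u + gamma * h(x) >= 0, a single linear constraint on u.
    The closest feasible command is a projection onto that half-space.
    """
    x, p, u_nom = (np.asarray(v, dtype=float) for v in (x, p, u_nom))
    a = 2.0 * (x - p)                      # gradient of h at x
    h = float((x - p) @ (x - p) - r**2)    # barrier value (>= 0 means outside the zone)
    slack = float(a @ u_nom + gamma * h)
    if slack >= 0.0:
        return u_nom                       # nominal command already satisfies the condition
    if not np.any(a):
        raise ValueError("robot is at the zone center; gradient is zero")
    # Project onto the constraint boundary a . u + gamma * h = 0.
    return u_nom - slack * a / float(a @ a)

# Robot at (1.5, 0) heading straight at a person at the origin with a 1 m zone.
u_safe = cbf_filter([1.5, 0.0], [0.0, 0.0], 1.0, [-1.0, 0.0])
```

Here the approach speed is reduced just enough to meet the barrier condition, which is one plausible mechanism behind the speed-adjusting and pausing behavior the abstract describes.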
Two-Stage Robust Planning Model for Park-Level Integrated Energy System Considering Uncertain Equipment Contingency
To enhance the reliability of Integrated Energy Systems (IESs) and address the research gap in reliability-based planning methods, this paper proposes a two-stage robust planning model specifically for park-level IESs. The proposed planning model considers uncertainties like load demand fluctuations and equipment contingencies, and provides a reliable scheme of equipment selection and sizing for IES investors. Inspired by the unit commitment problem, we formulate an equipment contingency uncertainty set to accurately describe the potential equipment contingencies which happen and can be repaired within a day. Then, a modified nested column-and-constraint generation algorithm is applied to solve this two-stage robust planning model with integer recourse efficiently. In the case study, the role of energy storage system for IES reliability enhancement is analyzed in detail. Computational results demonstrate the advantage of the proposed model over other planning models in terms of improving reliability.
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment EMNLP 2024
Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax", a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness). However, existing alignment techniques are mostly unidirectional, leading to suboptimal trade-offs and poor flexibility over various objectives. To navigate this challenge, we argue for the importance of grounding LLMs with explicit preferences. We introduce controllable preference optimization (CPO), which explicitly specifies preference scores for different objectives, thereby guiding the model to generate responses that meet the requirements. Our experimental analysis reveals that the aligned models can provide responses that match various preferences among the "3H" (helpfulness, honesty, harmlessness) desiderata. Furthermore, by introducing diverse data and alignment goals, we surpass baseline methods in aligning with single objectives, hence mitigating the impact of the alignment tax and achieving improvements in multi-objective alignment.
comment: EMNLP 2024 main conference
Distributed Online Feedback Optimization for Real-time Distribution System Voltage Regulation
We investigate the real-time voltage regulation problem in distribution systems employing online feedback optimization (OFO) with short-range communication between physical neighbours. OFO does not need an accurate grid model nor estimated consumption of non-controllable loads, affords fast calculations, and demonstrates robustness to uncertainties and disturbances, which render it particularly suitable for real-time distribution system applications. However, many OFO controllers require centralized communication, making them susceptible to single-point failures. This paper proposes a distributed OFO design based on a nested feedback optimization strategy and analyzes its convergence. Numerical study results demonstrate that the proposed design achieves effective voltage regulation and outperforms other distributed and local approaches.
Rapid nonlinear convex guidance using a monomial method
This paper addresses the challenge of accommodating nonlinear dynamics and constraints in rapid trajectory optimization, envisioned for use in the context of onboard guidance. We present a novel framework that uniquely employs overparameterized monomial coordinates and pre-computed fundamental solution expansions to facilitate rapid optimization while minimizing real-time computational requirements. The fundamental solution expansions are pre-computed using differential algebra. Unlike traditional approaches that repeatedly evaluate the nonlinear dynamics and constraints as part of complex shooting or collocation-based schemes, this method replaces the nonlinearity inherent to dynamics and constraint functions entirely with a computationally simpler manifold constraint. With this approach, trajectory optimization is posed efficiently as a path planning problem on the manifold. This problem is entirely convex except for the manifold constraint, readily lending itself to solution via sequential convex programming. We demonstrate the effectiveness of our approach in computing fast and accurate delta-V optimal solutions for long-range spacecraft rendezvous, including problems with nonlinear state constraints.
comment: 38 pages, 16 figures
Active Inverse Learning in Stackelberg Trajectory Games
Game-theoretic inverse learning is the problem of inferring a player's objectives from their actions. We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system. We propose an active inverse learning method for the leader to infer which hypothesis among a finite set of candidates best describes the follower's objective function. Instead of using passively observed trajectories like existing methods, we actively maximize the differences in the follower's trajectories under different hypotheses by optimizing the leader's control inputs. Compared with uniformly random inputs, the optimized inputs accelerate the convergence of the estimated probability of different hypotheses conditioned on the follower's trajectory. We demonstrate the proposed method in a receding-horizon repeated trajectory game and simulate the results using virtual TurtleBots in Gazebo.
comment: 8 pages, 3 figures. Updated previous version to acknowledge funding
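The "estimated probability of different hypotheses conditioned on the follower's trajectory" can be read as a Bayesian posterior. The sketch below shows one such update, assuming an isotropic Gaussian observation model around each hypothesis's predicted trajectory. That noise model, the function name, and `sigma` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def update_belief(prior, predicted, observed, sigma=0.1):
    """Bayes update over a finite set of hypotheses about the follower.

    prior:     shape (H,) probabilities over hypotheses.
    predicted: shape (H, T, d) trajectory predicted under each hypothesis.
    observed:  shape (T, d) trajectory actually observed.
    """
    prior = np.asarray(prior, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Squared distance between each predicted trajectory and the observation.
    sq_err = np.sum((predicted - observed) ** 2, axis=(1, 2))
    # Work in log space and subtract the maximum for numerical stability.
    log_post = np.log(prior) - sq_err / (2.0 * sigma**2)
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

# Two hypotheses whose predicted trajectories differ; the observation matches the first.
pred = np.array([[[0.0, 0.0], [1.0, 0.0]],
                 [[0.0, 0.0], [1.0, 0.5]]])
belief = update_belief([0.5, 0.5], pred, pred[0])
```

Under this model, the further apart the predicted trajectories are, the faster the posterior concentrates. That is the intuition behind choosing leader inputs that maximize the differences between hypotheses.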
Robotics
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
In this paper, we introduce SPA, a novel representation learning framework that emphasizes the importance of 3D spatial awareness in embodied AI. Our approach leverages differentiable neural rendering on multi-view images to endow a vanilla Vision Transformer (ViT) with intrinsic spatial understanding. We present the most comprehensive evaluation of embodied representation learning to date, covering 268 tasks across 8 simulators with diverse policies in both single-task and language-conditioned multi-task scenarios. The results are compelling: SPA consistently outperforms more than 10 state-of-the-art representation methods, including those specifically designed for embodied AI, vision-centric tasks, and multi-modal applications, while using less training data. Furthermore, we conduct a series of real-world experiments to confirm its effectiveness in practical scenarios. These results highlight the critical role of 3D spatial awareness for embodied representation learning. Our strongest model takes more than 6000 GPU hours to train and we are committed to open-sourcing all code and model weights to foster future research in embodied representation learning. Project Page: https://haoyizhu.github.io/spa/.
SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation NeurIPS 2024
In this paper, we propose a new framework for zero-shot object navigation. Existing zero-shot object navigation methods prompt the LLM with the text of spatially close objects, which lacks enough scene context for in-depth reasoning. To better preserve the information of the environment and fully exploit the reasoning ability of the LLM, we propose to represent the observed scene with a 3D scene graph. The scene graph encodes the relationships between objects, groups and rooms with an LLM-friendly structure, for which we design a hierarchical chain-of-thought prompt to help the LLM reason about the goal location according to scene context by traversing the nodes and edges. Moreover, benefiting from the scene graph representation, we further design a re-perception mechanism to empower the object navigation framework with the ability to correct perception errors. We conduct extensive experiments on MP3D, HM3D and RoboTHOR environments, where SG-Nav surpasses previous state-of-the-art zero-shot methods by more than 10% SR on all benchmarks, while the decision process is explainable. To the best of our knowledge, SG-Nav is the first zero-shot method that achieves even higher performance than supervised object navigation methods on the challenging MP3D benchmark.
comment: Accepted to NeurIPS 2024. Project page: https://bagh2178.github.io/SG-Nav/
On the Evaluation of Generative Robotic Simulations
Due to the difficulty of acquiring extensive real-world data, robot simulation has become crucial for parallel training and sim-to-real transfer, highlighting the importance of scalable simulated robotic tasks. Foundation models have demonstrated impressive capacities in autonomously generating feasible robotic tasks. However, this new paradigm underscores the challenge of adequately evaluating these autonomously generated tasks. To address this, we propose a comprehensive evaluation framework tailored to generative simulations. Our framework segments evaluation into three core aspects: quality, diversity, and generalization. For single-task quality, we evaluate the realism of the generated task and the completeness of the generated trajectories using large language models and vision-language models. In terms of diversity, we measure both task and data diversity through text similarity of task descriptions and world model loss trained on collected task trajectories. For task-level generalization, we assess the zero-shot generalization ability on unseen tasks of a policy trained with multiple generated tasks. Experiments conducted on three representative task generation pipelines demonstrate that the results from our framework are highly consistent with human evaluations, confirming the feasibility and validity of our approach. The findings reveal that while metrics of quality and diversity can be achieved through certain methods, no single approach excels across all metrics, suggesting a need for greater focus on balancing these different metrics. Additionally, our analysis further highlights the common challenge of low generalization capability faced by current works. Our anonymous website: https://sites.google.com/view/evaltasks.
comment: Project website: https://sites.google.com/view/evaltasks
LiPO: LiDAR Inertial Odometry for ICP Comparison ICRA 2025
We introduce a LiDAR inertial odometry (LIO) framework, called LiPO, that enables direct comparisons of different iterative closest point (ICP) point cloud registration methods. The two common ICP methods we compare are point-to-point (P2P) and point-to-feature (P2F). In our experience, within the context of LIO, P2F-ICP results in less drift and improved mapping accuracy than P2P-ICP when robots move aggressively through challenging environments. However, P2F-ICP methods require more hand-tuned hyper-parameters that make P2F-ICP less general across all environments and motions. In real-world field robotics applications where robots are used across different environments, more general P2P-ICP methods may be preferred despite increased drift. In this paper, we seek to better quantify the trade-off between P2P-ICP and P2F-ICP to help inform when each method should be used. To explore this trade-off, we use LiPO to directly compare ICP methods and test on relevant benchmark datasets as well as on our custom unpiloted ground vehicle (UGV). We find that overall, P2F-ICP has reduced drift and improved mapping accuracy, but P2P-ICP is more consistent across all environments and motions with minimal drift increase.
comment: Submitted to ICRA 2025
UW-SDF: Exploiting Hybrid Geometric Priors for Neural SDF Reconstruction from Underwater Multi-view Monocular Images IROS 2024
Due to the unique characteristics of underwater environments, accurate 3D reconstruction of underwater objects poses a challenging problem in tasks such as underwater exploration and mapping. Traditional methods that rely on multiple sensor data for 3D reconstruction are time-consuming and face challenges in data acquisition in underwater scenarios. We propose UW-SDF, a framework for reconstructing target objects from multi-view underwater images based on neural SDF. We introduce hybrid geometric priors to optimize the reconstruction process, markedly enhancing the quality and efficiency of neural SDF reconstruction. Additionally, to address the challenge of segmentation consistency in multi-view images, we propose a novel few-shot multi-view target segmentation strategy using the general-purpose segmentation model (SAM), enabling rapid automatic segmentation of unseen objects. Through extensive qualitative and quantitative experiments on diverse datasets, we demonstrate that our proposed method outperforms the traditional underwater 3D reconstruction method and other neural rendering approaches in the field of underwater 3D reconstruction.
comment: 8 pages, 9 figures, presented at IROS 2024
Dynamic Object Catching with Quadruped Robot Front Legs IROS 2024
This paper presents a framework for dynamic object catching using a quadruped robot's front legs while it stands on its rear legs. The system integrates computer vision, trajectory prediction, and leg control to enable the quadruped to visually detect, track, and successfully catch a thrown object using an onboard camera. Leveraging a fine-tuned YOLOv8 model for object detection and a regression-based trajectory prediction module, the quadruped adapts its front leg positions iteratively to anticipate and intercept the object. The catching maneuver involves identifying the optimal catching position, controlling the front legs with Cartesian PD control, and closing the legs together at the right moment. We propose and validate three different methods for selecting the optimal catching position: 1) intersecting the predicted trajectory with a vertical plane, 2) selecting the point on the predicted trajectory with the minimal distance to the center of the robot's legs in their nominal position, and 3) selecting the point on the predicted trajectory with the highest likelihood on a Gaussian Mixture Model (GMM) modelling the robot's reachable space. Experimental results demonstrate robust catching capabilities across various scenarios, with the GMM method achieving the best performance, leading to an 80% catching success rate. A video demonstration of the system in action can be found at https://youtu.be/sm7RdxRfIYg .
comment: Accepted to IROS 2024
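The first catching-position method, intersecting the predicted trajectory with a vertical plane, can be sketched as follows. This assumes a drag-free ballistic model fitted by least squares and a plane of constant `x` in the robot frame. Both are illustrative assumptions, and the function name is hypothetical rather than the paper's implementation.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2, acting along -z

def predict_catch_point(t, pos, x_plane):
    """Fit a ballistic trajectory to observations and intersect it with x = x_plane.

    t:   shape (N,) observation times in seconds.
    pos: shape (N, 3) observed object positions (x, y, z) in meters.
    Returns (t_hit, point), where point is the predicted 3D crossing position.
    """
    t = np.asarray(t, dtype=float)
    pos = np.asarray(pos, dtype=float)
    # Horizontal motion is linear in time: fit slope (velocity) and intercept.
    vx, x0 = np.polyfit(t, pos[:, 0], 1)
    vy, y0 = np.polyfit(t, pos[:, 1], 1)
    # Adding 0.5 * G * t^2 removes gravity, leaving a linear fit for z as well.
    vz, z0 = np.polyfit(t, pos[:, 2] + 0.5 * G * t**2, 1)
    if abs(vx) < 1e-9:
        raise ValueError("object is not moving toward the plane")
    t_hit = (x_plane - x0) / vx
    point = np.array([x_plane,
                      y0 + vy * t_hit,
                      z0 + vz * t_hit - 0.5 * G * t_hit**2])
    return t_hit, point

# Synthetic throw observed for 0.2 s, starting 3 m in front of the robot.
ts = np.linspace(0.0, 0.2, 6)
obs = np.column_stack([3.0 - 4.0 * ts,
                       0.5 * ts,
                       1.0 + 2.0 * ts - 0.5 * G * ts**2])
t_hit, catch_point = predict_catch_point(ts, obs, x_plane=0.5)
```

Re-running the fit as new detections arrive would give the iterative adaptation of the leg targets mentioned in the abstract. The other two methods would replace the plane intersection with a search along the fitted trajectory.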
Probabilistic Satisfaction of Temporal Logic Constraints in Reinforcement Learning via Adaptive Policy-Switching
Constrained Reinforcement Learning (CRL) is a subset of machine learning that introduces constraints into the traditional reinforcement learning (RL) framework. Unlike conventional RL which aims solely to maximize cumulative rewards, CRL incorporates additional constraints that represent specific mission requirements or limitations that the agent must comply with during the learning process. In this paper, we address a type of CRL problem where an agent aims to learn the optimal policy to maximize reward while ensuring a desired level of temporal logic constraint satisfaction throughout the learning process. We propose a novel framework that relies on switching between pure learning (reward maximization) and constraint satisfaction. This framework estimates the probability of constraint satisfaction based on earlier trials and properly adjusts the probability of switching between learning and constraint satisfaction policies. We theoretically validate the correctness of the proposed algorithm and demonstrate its performance and scalability through comprehensive simulations.
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
The increasing demand for versatile robotic systems to operate in diverse and dynamic environments has emphasized the importance of a generalist policy, which leverages a large cross-embodiment data corpus to facilitate broad adaptability and high-level reasoning. However, the generalist struggles with inefficient inference and expensive training. The specialist policy, instead, is curated for specific domain data and excels at task-level precision with efficiency. Yet, it lacks the generalization capacity for a wide range of applications. Inspired by these observations, we introduce RoboDual, a synergistic dual-system that supplements the merits of both generalist and specialist policy. A diffusion transformer-based specialist is devised for multi-step action rollouts, exquisitely conditioned on the high-level task understanding and discretized action output of a vision-language-action (VLA) based generalist. Compared to OpenVLA, RoboDual achieves a 26.7% improvement in real-world settings and a 12% gain on CALVIN by introducing a specialist policy with merely 20M trainable parameters. It maintains strong performance with only 5% of the demonstration data, and enables a 3.8 times higher control frequency in real-world deployment. Code will be made publicly available. Our project page is hosted at: https://opendrivelab.com/RoboDual/
comment: Project page: https://opendrivelab.com/RoboDual/
From CAD to URDF: Co-Design of a Jet-Powered Humanoid Robot Including CAD Geometry IROS 2024
Co-design optimization strategies usually rely on simplified robot models extracted from CAD. While these models are useful for optimizing geometrical and inertial parameters for robot control, they might overlook important details essential for prototyping the optimized mechanical design. For instance, they may not account for mechanical stresses exerted on the optimized geometries and the complexity of assembly-level design. In this paper, we introduce a co-design framework aimed at improving both the control performance and mechanical design of our robot. Specifically, we identify the robot links that significantly influence control performance. The geometric characteristics of these links are parameterized and optimized using a multi-objective evolutionary algorithm to achieve optimal control performance. Additionally, an automated Finite Element Method (FEM) analysis is integrated into the framework to filter solutions not satisfying the required structural safety margin. We validate the framework by applying it to enhance the mechanical design for flight performance of the jet-powered humanoid robot iRonCub.
comment: IROS 2024
Multimodal Perception System for Real Open Environment
This paper presents a novel multimodal perception system for a real open environment. The proposed system includes an embedded computation platform, cameras, ultrasonic sensors, GPS, and IMU devices. Unlike the traditional frameworks, our system integrates multiple sensors with advanced computer vision algorithms to help users walk outside reliably. The system can efficiently complete various tasks, including navigating to specific locations, passing through obstacle regions, and crossing intersections. Specifically, we also use ultrasonic sensors and depth cameras to enhance obstacle avoidance performance. The path planning module is designed to find the locally optimal route based on various feedback and the user's current state. To evaluate the performance of the proposed system, we design several experiments under different scenarios. The results show that the system can help users walk efficiently and independently in complex situations.
Understanding Human Activity with Uncertainty Measure for Novelty in Graph Convolutional Networks
Understanding human activity is a crucial aspect of developing intelligent robots, particularly in the domain of human-robot collaboration. Nevertheless, existing systems encounter challenges such as over-segmentation, attributed to errors in the up-sampling process of the decoder. In response, we introduce a promising solution: the Temporal Fusion Graph Convolutional Network. This innovative approach aims to rectify the inadequate boundary estimation of individual actions within an activity stream and mitigate the issue of over-segmentation in the temporal dimension. Moreover, systems leveraging human activity recognition frameworks for decision-making necessitate more than just the identification of actions. They require a confidence value indicative of the certainty regarding the correspondence between observations and training examples. This is crucial to prevent overly confident responses to unforeseen scenarios that were not part of the training data and may have resulted in mismatches due to weak similarity measures within the system. To address this, we propose the incorporation of a Spectral Normalized Residual connection aimed at enhancing efficient estimation of novelty in observations. This innovative approach ensures the preservation of input distance within the feature space by imposing constraints on the maximum gradients of weight updates. By limiting these gradients, we promote a more robust handling of novel situations, thereby mitigating the risks associated with overconfidence. Our methodology involves the use of a Gaussian process to quantify the distance in feature space.
comment: 15 pages, 10 figures, The International Journal of Robotics Research
Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network IROS 2022
Human activity recognition is an important task for an intelligent robot, especially in the field of human-robot collaboration, where it requires not only the labels of sub-activities but also the temporal structure of the activity. In order to automatically recognize both the label and the temporal structure in sequences of human-object interaction, we propose a novel Pyramid Graph Convolutional Network (PGCN), which employs a pyramidal encoder-decoder architecture consisting of an attention-based graph convolution network and a temporal pyramid pooling module for downsampling and upsampling the interaction sequence on the temporal axis, respectively. The system represents the 2D or 3D spatial relation of human and objects from the detection results in video data as a graph. To learn the human-object relations, a new attention graph convolutional network is trained to extract condensed information from the graph representation. To segment action into sub-actions, a novel temporal pyramid pooling module is proposed, which upsamples compressed features back to the original time scale and classifies actions per frame. We explore various attention layers, namely spatial attention, temporal attention and channel attention, and combine different upsampling decoders to test the performance on action recognition and segmentation. We evaluate our model on two challenging datasets in the field of human-object interaction recognition, i.e. Bimanual Actions and IKEA Assembly datasets. We demonstrate that our classifier significantly improves both framewise action recognition and segmentation, e.g., F1 micro and F1@50 scores on Bimanual Actions dataset are improved by 4.3% and 8.5% respectively.
comment: 7 pages, 6 figures, IROS 2022 conference
Soothing Sensations: Enhancing Interactions with a Socially Assistive Robot through Vibrotactile Heartbeats
Physical interactions with socially assistive robots (SARs) positively affect user wellbeing. However, haptic experiences when touching a SAR are typically limited to perceiving the robot's movements or shell texture, while other modalities that could enhance the touch experience with the robot, such as vibrotactile stimulation, are under-explored. In this exploratory qualitative study, we investigate the potential of enhancing human interaction with the PARO robot through vibrotactile heartbeats, with the goal to regulate subjective wellbeing during stressful situations. We conducted in-depth one-on-one interviews with 30 participants, who watched three horror movie clips alone, with PARO, and with a PARO that displayed a vibrotactile heartbeat. Our findings show that PARO's presence and its interactive capabilities can help users regulate emotions through attentional redeployment from a stressor toward the robot. The vibrotactile heartbeat further reinforced PARO's physical and social presence, enhancing the socio-emotional support provided by the robot and its perceived life-likeness. We discuss the impact of individual differences in user experience and implications for the future design of life-like vibrotactile stimulation for SARs.
comment: 2024 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
Constrained Skill Discovery: Quadruped Locomotion with Unsupervised Reinforcement Learning
Representation learning and unsupervised skill discovery can allow robots to acquire diverse and reusable behaviors without the need for task-specific rewards. In this work, we use unsupervised reinforcement learning to learn a latent representation by maximizing the mutual information between skills and states subject to a distance constraint. Our method improves upon prior constrained skill discovery methods by replacing the latent transition maximization with a norm-matching objective. This not only results in a much richer state space coverage compared to baseline methods, but also allows the robot to learn more stable and easily controllable locomotive behaviors. We successfully deploy the learned policy on a real ANYmal quadruped robot and demonstrate that the robot can accurately reach arbitrary points of the Cartesian state space in a zero-shot manner, using only an intrinsic skill discovery and standard regularization rewards.
L-VITeX: Light-weight Visual Intuition for Terrain Exploration
This paper presents L-VITeX, a lightweight visual intuition system for terrain exploration designed for resource-constrained robots and swarms. L-VITeX aims to provide a hint of Regions of Interest (RoIs) without computationally expensive processing. By utilizing the Faster Objects, More Objects (FOMO) tinyML architecture, the system achieves high accuracy (>99%) in RoI detection while operating on minimal hardware resources (Peak RAM usage < 50 KB) with near real-time inference (<200 ms). The paper evaluates L-VITeX's performance across various terrains, including mountainous areas, underwater shipwreck debris regions, and Martian rocky surfaces. Additionally, it demonstrates the system's application in 3D mapping using a small mobile robot run by ESP32-Cam and Gaussian Splats (GS), showcasing its potential to enhance exploration efficiency and decision-making.
Synergizing Morphological Computation and Generative Design: Automatic Synthesis of Tendon-Driven Grippers
Robots' behavior and performance are determined both by hardware and software. The design process of robotic systems is a complex journey that involves multiple phases. Throughout this process, the aim is to tackle various criteria simultaneously, even though they often contradict each other. The ultimate goal is to uncover the optimal solution that resolves these conflicting factors. Generative, computation or automatic designs are the paradigms aimed at accelerating the whole design process. Within this paper we propose a design methodology to generate linkage mechanisms for robots with morphological computation. We use a graph grammar and a heuristic search algorithm to create robot mechanism graphs that are converted into simulation models for testing the design output. To verify the design methodology we have applied it to a relatively simple quasi-static problem of object grasping. We found a way to automatically design an underactuated tendon-driven gripper that can grasp a wide range of objects. This is possible because of its structure, not because of sophisticated planning or learning.
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Bimanual manipulation is essential in robotics, yet developing foundation models is extremely challenging due to the inherent complexity of coordinating two robot arms (leading to multi-modal action distributions) and the scarcity of training data. In this paper, we present the Robotics Diffusion Transformer (RDT), a pioneering diffusion foundation model for bimanual manipulation. RDT builds on diffusion models to effectively represent multi-modality, with innovative designs of a scalable Transformer to deal with the heterogeneity of multi-modal inputs and to capture the nonlinearity and high frequency of robotic data. To address data scarcity, we further introduce a Physically Interpretable Unified Action Space, which can unify the action representations of various robots while preserving the physical meanings of original actions, facilitating the learning of transferable physical knowledge. With these designs, we managed to pre-train RDT on the largest collection of multi-robot datasets to date and scaled it up to 1.2B parameters, which is the largest diffusion-based foundation model for robotic manipulation. We finally fine-tuned RDT on a self-created multi-task bimanual dataset with more than 6K episodes to refine its manipulation capabilities. Experiments on real robots demonstrate that RDT significantly outperforms existing methods. It exhibits zero-shot generalization to unseen objects and scenes, understands and follows language instructions, learns new skills with just 1 to 5 demonstrations, and effectively handles complex, dexterous tasks. We refer to https://rdt-robotics.github.io/rdt-robotics/ for the code and videos.
comment: 10 pages, conference
Online DNN-driven Nonlinear MPC for Stylistic Humanoid Robot Walking with Step Adjustment
This paper presents a three-layered architecture that enables stylistic locomotion with online contact location adjustment. Our method combines an autoregressive Deep Neural Network (DNN) acting as a trajectory generation layer with model-based trajectory adjustment and trajectory control layers. The DNN produces centroidal and postural references serving as an initial guess and regularizer for the other layers. Because the DNN is trained on human motion capture data, the resulting robot motion exhibits locomotion patterns resembling a human walking style. The trajectory adjustment layer utilizes non-linear optimization to ensure dynamically feasible center of mass (CoM) motion while addressing step adjustments. We compare two implementations of the trajectory adjustment layer: one as a receding horizon planner (RHP) and the other as a model predictive controller (MPC). To enhance MPC performance, we introduce a Kalman filter to reduce measurement noise. The filter parameters are automatically tuned with a Genetic Algorithm. Experimental results on the ergoCub humanoid robot demonstrate the system's ability to prevent falls, replicate human walking styles, and withstand disturbances up to 68 N. Website: https://sites.google.com/view/dnn-mpc-walking Youtube video: https://www.youtube.com/watch?v=x3tzEfxO-xQ
comment: This paper has been accepted for publication at the 2024 IEEE-RAS International Conference on Humanoid Robots (Humanoids), Nancy, France, 2024
SwarmPath: Drone Swarm Navigation through Cluttered Environments Leveraging Artificial Potential Field and Impedance Control
In the area of multi-drone systems, navigating through dynamic environments from start to goal while providing collision-free trajectories and efficient path planning is a significant challenge. To solve this problem, we propose a novel SwarmPath technology that integrates an Artificial Potential Field (APF) with an impedance controller. The proposed approach provides a solution based on collision-free leader-follower behaviour in which drones are able to adapt themselves to the environment. Moreover, the leader is virtual while the drones are physical followers that leverage the APF path planning approach to find the shortest possible path to the target. Simultaneously, the drones dynamically adjust impedance links, allowing them to create virtual links with obstacles in order to avoid them. Compared to conventional APF, the proposed SwarmPath system not only provides smooth collision avoidance but also enables agents to efficiently pass through narrow passages, reducing the total travel time by 30% while ensuring safety in terms of drone connectivity. Lastly, the results also show that the discrepancy between the simulated and real environments corresponds to an average absolute percentage error (APE) of 6% in the drone trajectories. This underscores the reliability of our solution in real-world scenarios.
comment: Manuscript accepted in IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024)
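For orientation, the conventional APF baseline mentioned in the abstract can be sketched in its textbook form (linear attraction to the goal plus a Khatib-style repulsion that acts within an influence radius). The gains `k_att`, `k_rep` and the radius `d0` are illustrative assumptions, and the impedance links of SwarmPath are not modeled here.

```python
import numpy as np

def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0):
    """Classic artificial potential field force at position `pos`.

    Attractive term: k_att * (goal - pos).
    Repulsive term (only for obstacles closer than d0):
        k_rep * (1/d - 1/d0) / d^2 * unit vector pointing away from the obstacle.
    All gains are illustrative values, not taken from the paper.
    """
    pos = np.asarray(pos, dtype=float)
    goal = np.asarray(goal, dtype=float)
    force = k_att * (goal - pos)
    for obs in obstacles:
        diff = pos - np.asarray(obs, dtype=float)
        d = np.linalg.norm(diff)
        if 1e-9 < d < d0:
            force += k_rep * (1.0 / d - 1.0 / d0) / d**2 * (diff / d)
    return force
```

A known weakness of this plain formulation is that attraction and repulsion can cancel, leaving the agent stuck in a local minimum, for example when an obstacle lies directly between the agent and the goal.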
Autonomous Vehicles Path Planning under Temporal Logic Specifications
Path planning is an essential component of autonomous driving. A global planner is responsible for the high-level planning. It basically performs a shortest-path search on a known map, thereby defining waypoints used to control the local (low-level) planner. Local planning is a runtime verification method which is repeatedly run on the vehicle itself in real-time, so as to find the optimal short-horizon path which leads to the desired waypoint in a way which is both efficient and safe. The challenge is that the local planner has to take into account repeatedly incoming updates about the information available of the environment. In addition, it performs a complex task, as it has to take into account a large variety of requirements, originating from the necessity of collision avoidance with obstacles, respecting traffic rules, sticking to regulatory requirements, and lastly to reach the next waypoint efficiently. In this paper, we describe a logic-based specification mechanism which fulfills all these requirements.
comment: 10 pages, 5 Figures, 1 Table, Accepted as a short paper at 27th Brazilian Symposium on Formal Methods (SBMF 2024)
LaB-CL: Localized and Balanced Contrastive Learning for improving parking slot detection
Parking slot detection is an essential technology in autonomous parking systems. In general, the classification problem of parking slot detection consists of two tasks, a task determining whether localized candidates are junctions of parking slots or not, and the other that identifies a shape of detected junctions. Both classification tasks can easily face biased learning toward the majority class, degrading classification performances. Yet, the data imbalance issue has been overlooked in parking slot detection. We propose the first supervised contrastive learning framework for parking slot detection, Localized and Balanced Contrastive Learning for improving parking slot detection (LaB-CL). The proposed LaB-CL framework uses two main approaches. First, we propose to include class prototypes to consider representations from all classes in every mini batch, from the local perspective. Second, we propose a new hard negative sampling scheme that selects local representations with high prediction error. Experiments with the benchmark dataset demonstrate that the proposed LaB-CL framework can outperform existing parking slot detection methods.
comment: 7 pages, 6 figures
Robotic framework for autonomous manipulation of laboratory equipment with different degrees of transparency via 6D pose estimation
Many modern robotic systems operate autonomously, but they often lack the ability to accurately analyze the environment and adapt to changing external conditions, while teleoperation systems often require special operator skills. In the field of laboratory automation, the number of automated processes is growing, but such systems are usually developed to perform specific tasks. In addition, many of the objects used in this field are transparent, making it difficult to analyze them using visual channels. The contributions of this work include the development of a robotic framework with an autonomous mode for manipulating liquid-filled objects with different degrees of transparency in complex pose combinations. The conducted experiments demonstrated the robustness of the designed visual perception system in accurately estimating object poses for autonomous manipulation, and confirmed the performance of the algorithms in dexterous operations such as liquid dispensing. The proposed robotic framework can be applied to laboratory automation, since it addresses non-trivial manipulation tasks that require high accuracy and repeatability and involve estimating the poses of objects with varying degrees of transparency and liquid levels.
comment: Accepted to the 2024 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024), 8 pages, 11 figures
Mastering Contact-rich Tasks by Combining Soft and Rigid Robotics with Imitation Learning
Soft robots have the potential to revolutionize the use of robotic systems with their capability of establishing safe, robust, and adaptable interactions with their environment, but their precise control remains challenging. In contrast, traditional rigid robots offer high accuracy and repeatability but lack the flexibility of soft robots. We argue that combining these characteristics in a hybrid robotic platform can significantly enhance overall capabilities. This work presents a novel hybrid robotic platform that integrates a rigid manipulator with a fully developed soft arm. This system is equipped with the intelligence necessary to perform flexible and generalizable tasks through imitation learning autonomously. The physical softness and machine learning enable our platform to achieve highly generalizable skills, while the rigid components ensure precision and repeatability.
Neural Semantic Map-Learning for Autonomous Vehicles IROS 2024
Autonomous vehicles demand detailed maps to maneuver reliably through traffic, which need to be kept up-to-date to ensure a safe operation. A promising way to adapt the maps to the ever-changing road-network is to use crowd-sourced data from a fleet of vehicles. In this work, we present a mapping system that fuses local submaps gathered from a fleet of vehicles at a central instance to produce a coherent map of the road environment including drivable area, lane markings, poles, obstacles and more as a 3D mesh. Each vehicle contributes locally reconstructed submaps as lightweight meshes, making our method applicable to a wide range of reconstruction methods and sensor modalities. Our method jointly aligns and merges the noisy and incomplete local submaps using a scene-specific Neural Signed Distance Field, which is supervised using the submap meshes to predict a fused environment representation. We leverage memory-efficient sparse feature-grids to scale to large areas and introduce a confidence score to model uncertainty in scene reconstruction. Our approach is evaluated on two datasets with different local mapping methods, showing improved pose alignment and reconstruction over existing methods. Additionally, we demonstrate the benefit of multi-session mapping and examine the required amount of data to enable high-fidelity map learning for autonomous vehicles.
comment: Accepted at 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Learning Low-Level Causal Relations using a Simulated Robotic Arm ICANN
Causal learning allows humans to predict the effect of their actions on the known environment and use this knowledge to plan the execution of more complex actions. Such knowledge also captures the behaviour of the environment and can be used for its analysis and the reasoning behind the behaviour. This type of knowledge is also crucial in the design of intelligent robotic systems with common sense. In this paper, we study causal relations by learning the forward and inverse models based on data generated by a simulated robotic arm involved in two sensorimotor tasks. As a next step, we investigate feature attribution methods for the analysis of the forward model, which reveals the low-level causal effects corresponding to individual features of the state vector related to both the arm joints and the environment features. This type of analysis provides solid ground for dimensionality reduction of the state representations, as well as for the aggregation of knowledge towards the explainability of causal effects at higher levels.
comment: 14 pages, 5 figures, 3 tables. Appeared in 2024 International Conference on Artificial Neural Networks (ICANN) proceedings. Published version copyrighted by Springer. This work was funded by the Horizon Europe Twinning project TERAIS, G.A. number 101079338 and in part by the Slovak Grant Agency for Science (VEGA), project 1/0373/23
PHODCOS: Pythagorean Hodograph-based Differentiable Coordinate System
This paper presents PHODCOS, an algorithm that assigns a moving coordinate system to a given curve. The parametric functions underlying the coordinate system, i.e., the path function, the moving frame and its angular velocity, are exact -- approximation free -- differentiable, and sufficiently continuous. This allows for computing a coordinate system for highly nonlinear curves, while remaining compliant with autonomous navigation algorithms that require first and second order gradient information. In addition, the coordinate system obtained by PHODCOS is fully defined by a finite number of coefficients, which may then be used to compute additional geometric properties of the curve, such as arc-length, curvature, torsion, etc. Therefore, PHODCOS presents an appealing paradigm to enhance the geometrical awareness of existing guidance and navigation on-orbit spacecraft maneuvers. The PHODCOS algorithm is presented alongside an analysis of its error and approximation order, and thus, it is guaranteed that the obtained coordinate system matches the given curve within a desired tolerance. To demonstrate the applicability of the coordinate system resulting from PHODCOS, we present numerical examples in the Near Rectilinear Halo Orbit (NRHO) for the Lunar Gateway.
comment: Code: https://github.com/jonarriza96/phodcos
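As background for the term in the title, the defining property of a Pythagorean hodograph (PH) curve can be written down in a few lines. The planar construction below is the standard textbook example and is given only as an illustration; the abstract does not specify which spatial PH representation PHODCOS uses.

```latex
% A polynomial curve r(\xi) is a PH curve if the norm of its hodograph
% (derivative) is itself a polynomial \sigma(\xi):
\| \mathbf{r}'(\xi) \| = \sigma(\xi)

% Consequently the arc length is a polynomial in \xi, with no quadrature error:
s(\xi) = \int_0^{\xi} \sigma(\tau)\, d\tau

% Planar example: for any polynomials u(\xi), v(\xi), choosing
x'(\xi) = u^2(\xi) - v^2(\xi), \qquad y'(\xi) = 2\,u(\xi)\,v(\xi)
% yields
\sigma(\xi) = \sqrt{x'^2(\xi) + y'^2(\xi)} = u^2(\xi) + v^2(\xi)
```

This closed-form parametric speed is the reason quantities such as arc length and curvature can be evaluated exactly from a finite set of coefficients, which matches the abstract's claim.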
Design Method of a Kangaroo Robot with High Power Legs and an Articulated Soft Tail IROS2023
In this paper, we focus on the kangaroo, which has powerful legs capable of jumping and a soft and strong tail. To incorporate these unique structures into a robot, we propose a design method that takes into account both the feasibility as a robot and the kangaroo-mimetic structure. Based on the kangaroo's musculoskeletal structure, we determine the structure of the robot that enables it to jump by analyzing the muscle arrangement and verifying it beforehand in simulation. Also, to realize a tail capable of body support, we use an articulated, elastic structure as a tail. In order to achieve both softness and high power output, the robot is driven by a direct-drive, high-power wire-winding mechanism, and the weight of the legs and the tail is reduced by placing motors in the torso. The developed kangaroo robot can jump with its hind legs, move its tail, and support its body using its hind legs and tail.
comment: accepted at IROS2023
Lean Methodology for Garment Modernization
This article presents the lean methodology for modernizing garment manufacturing, focusing on lean thinking, lean practices, automation development, VSM, and CRP, and how to integrate them effectively. While isolated automation of specific operations can improve efficiency and reduce cycle time, it does not necessarily enhance overall garment output and efficiency. To achieve these broader improvements, it is essential to consider the entire production line and process using VSM and CRP to optimize production and center balance. This approach can increase efficiency and reduce manufacturing costs, labor time, and lead time, ultimately adding value to the company and factory.
comment: 11 pages, 7 figures
Autonomous Driving in Unstructured Environments: How Far Have We Come?
Research on autonomous driving in unstructured outdoor environments is less advanced than in structured urban settings due to challenges like environmental diversities and scene complexity. These environments, such as rural areas and rugged terrains, pose unique obstacles that are not common in structured urban areas. Despite these difficulties, autonomous driving in unstructured outdoor environments is crucial for applications in agriculture, mining, and military operations. Our survey reviews over 250 papers for autonomous driving in unstructured outdoor environments, covering offline mapping, pose estimation, environmental perception, path planning, end-to-end autonomous driving, datasets, and relevant challenges. We also discuss emerging trends and future research directions. This review aims to consolidate knowledge and encourage further research for autonomous driving in unstructured environments. To support ongoing work, we maintain an active repository with up-to-date literature and open-source projects at: https://github.com/chaytonmin/Survey-Autonomous-Driving-in-Unstructured-Environments.
comment: Survey paper; 38 pages
A Visual Cooperative Localization Method for Airborne Magnetic Surveying Based on a Manifold Sensor Fusion Algorithm Using Lie Groups
Recent advancements in UAV technology have spurred interest in developing multi-UAV aerial surveying systems for use in confined environments where GNSS signals are blocked or jammed. This paper focuses on airborne magnetic surveying scenarios. To obtain clean magnetic measurements reflecting the Earth's magnetic field, the magnetic sensor must be isolated from other electronic devices, creating a significant localization challenge. We propose a visual cooperative localization solution. The solution incorporates a visual processing module and an improved manifold-based sensor fusion algorithm, delivering reliable and accurate positioning information. Real flight experiments validate the approach, demonstrating single-axis centimeter-level accuracy and decimeter-level overall 3D positioning accuracy.
comment: 12 pages
Toward a Better Understanding of Robot Energy Consumption in Agroecological Applications
In this paper, we present a comprehensive analysis and discussion of energy consumption in agricultural robots. Robots are emerging as a promising solution to address food production and agroecological challenges, offering potential reductions in chemical use and the ability to perform strenuous tasks beyond human capabilities. The automation of agricultural tasks introduces a previously unattainable level of complexity, enabling robots to optimize trajectories, control laws, and overall task planning. Consequently, automation can lead to higher levels of energy optimization in agricultural tasks. However, the energy consumption of robotic platforms is not fully understood, and a deeper analysis of contributing factors is essential to optimize energy use. We analyze the energy data of an automated agricultural tractor performing tasks throughout the year, revealing nontrivial correlations between the robot's velocity, the type of task performed, and energy consumption. This suggests a tradeoff between task efficiency, time to completion, and energy expenditure that can be harnessed to improve the energy efficiency of robotic agricultural operations.
comment: 6 pages, 6 figures
PokeFlex: A Real-World Dataset of Deformable Objects for Robotics
Data-driven methods have shown great potential in solving challenging manipulation tasks; however, their application in the domain of deformable objects has been constrained, in part, by the lack of data. To address this, we propose PokeFlex, a dataset featuring real-world paired and annotated multimodal data that includes 3D textured meshes, point clouds, RGB images, and depth maps. Such data can be leveraged for several downstream tasks such as online 3D mesh reconstruction, and it can potentially enable underexplored applications such as the real-world deployment of traditional control methods based on mesh simulations. To deal with the challenges posed by real-world 3D mesh reconstruction, we leverage a professional volumetric capture system that allows complete 360° reconstruction. PokeFlex consists of 18 deformable objects with varying stiffness and shapes. Deformations are generated by dropping objects onto a flat surface or by poking the objects with a robot arm. Interaction forces and torques are also reported for the latter case. Using different data modalities, we demonstrated a use case for the PokeFlex dataset in online 3D mesh reconstruction. We refer the reader to our website ( https://pokeflex-dataset.github.io/ ) for demos and examples of our dataset.
The Power of Input: Benchmarking Zero-Shot Sim-To-Real Transfer of Reinforcement Learning Control Policies for Quadrotor Control
In the last decade, data-driven approaches have become popular choices for quadrotor control, thanks to their ability to facilitate the adaptation to unknown or uncertain flight conditions. Among the different data-driven paradigms, Deep Reinforcement Learning (DRL) is currently one of the most explored. However, the design of DRL agents for Micro Aerial Vehicles (MAVs) remains an open challenge. While some works have studied the output configuration of these agents (i.e., what kind of control to compute), there is no general consensus on the type of input data these approaches should employ. Multiple works simply provide the DRL agent with full state information, without questioning if this might be redundant and unnecessarily complicate the learning process, or pose superfluous constraints on the availability of such information in real platforms. In this work, we provide an in-depth benchmark analysis of different configurations of the observation space. We optimize multiple DRL agents in simulated environments with different input choices and study their robustness and their sim-to-real transfer capabilities with zero-shot adaptation. We believe that the outcomes and discussions presented in this work supported by extensive experimental results could be an important milestone in guiding future research on the development of DRL agents for aerial robot tasks.
Patterned Structure Muscle : Arbitrary Shaped Wire-driven Artificial Muscle Utilizing Anisotropic Flexible Structure for Musculoskeletal Robots IROS2024
Muscles of the human body are composed of tiny actuators made up of myosin and actin filaments. They can exert force in various shapes such as curved or flat, under contact forces and deformations from the environment. On the other hand, muscles in musculoskeletal robots so far have faced challenges in generating force in such shapes and environments. To address this issue, we propose Patterned Structure Muscle (PSM), artificial muscles for musculoskeletal robots. PSM utilizes patterned structures with anisotropic characteristics and wire-driven mechanisms, and is made of the flexible material thermoplastic polyurethane (TPU) using FDM 3D printing. This method enables the creation of various shapes of muscles, such as simple 1 degree-of-freedom (DOF) muscles, multi-DOF wide-area muscles, joint-covering muscles, and branched muscles. We created an upper arm structure using these muscles to demonstrate a wide range of motion, lifting of heavy objects, and movements through environmental contact. These experiments show that the proposed PSM is capable of operating in various shapes and environments, and is suitable for the muscles of musculoskeletal robots.
comment: accepted at IROS2024
Simplified POMDP Planning with an Alternative Observation Space and Formal Performance Guarantees
Online planning under uncertainty in partially observable domains is an essential capability in robotics and AI. The partially observable Markov decision process (POMDP) is a mathematically principled framework for addressing decision-making problems in this challenging setting. However, finding an optimal solution for POMDPs is computationally expensive and is feasible only for small problems. In this work, we contribute a novel method to simplify POMDPs by switching to an alternative, more compact, observation space and simplified model to speed up planning with formal performance guarantees. We introduce the notion of belief tree topology, which encodes the levels and branches in the tree that use the original and alternative observation space and models. Each belief tree topology comes with its own policy space and planning performance. Our key contribution is to derive bounds between the optimal Q-function of the original POMDP and the simplified tree defined by a given topology with a corresponding simplified policy space. These bounds are then used as an adaptation mechanism between different tree topologies until the optimal action of the original POMDP can be determined. Further, we consider a specific instantiation of our framework, where the alternative observation space and model correspond to a setting where the state is fully observable. We evaluate our approach in simulation, considering exact and approximate POMDP solvers and demonstrating a significant speedup while preserving solution quality. We believe this work opens new exciting avenues for online POMDP planning with formal performance guarantees.
comment: Accepted to ISRR 2024
Stop-N-Go: Search-based Conflict Resolution for Motion Planning of Multiple Robotic Manipulators
We address the motion planning problem for multiple robotic manipulators in packed environments where shared workspace can result in goal positions occupied or blocked by other robots unless those other robots move away to make the goal positions free. While planning in a coupled configuration space (C-space) is straightforward, it struggles to scale with the number of robots and often fails to find solutions. Decoupled planning is faster but frequently leads to conflicts between trajectories. We propose a conflict resolution approach that inserts pauses into individually planned trajectories using an A* search strategy to minimize the makespan--the total time until all robots complete their tasks. This method allows some robots to stop, enabling others to move without collisions, and maintains short distances in the C-space. It also effectively handles cases where goal positions are initially blocked by other robots. Experimental results show that our method successfully solves challenging instances where baseline methods fail to find feasible solutions.
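The pause-insertion idea described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each robot's individually planned trajectory is already discretized into waypoints, that at every time step a robot either advances one waypoint or waits, and that a user-supplied `in_collision` predicate checks a joint configuration. The names `resolve_conflicts`, `in_collision`, and the grid-cell demo are hypothetical. Only waypoints are checked, not the motion between them.

```python
import heapq
from itertools import product


def resolve_conflicts(paths, in_collision):
    """A* over per-robot progress indices.

    paths: list of waypoint lists, one per robot (planned independently).
    in_collision: callable taking a list of waypoints (one per robot)
        and returning True if that joint configuration collides.
    Returns a list of index tuples, one per time step, or None.
    """
    goal = tuple(len(p) - 1 for p in paths)
    start = tuple(0 for _ in paths)

    def heuristic(state):
        # The slowest robot still needs at least this many steps.
        return max(g - s for g, s in zip(goal, state))

    frontier = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}

    while frontier:
        _, steps, state = heapq.heappop(frontier)
        if state == goal:
            schedule = []
            while state is not None:
                schedule.append(state)
                state = parent[state]
            return schedule[::-1]
        if steps > best_cost[state]:
            continue  # stale queue entry
        # Each robot either waits (0) or advances one waypoint (1).
        for moves in product((0, 1), repeat=len(paths)):
            nxt = tuple(min(s + m, g) for s, m, g in zip(state, moves, goal))
            if nxt == state:
                continue  # nobody made progress
            if in_collision([p[i] for p, i in zip(paths, nxt)]):
                continue
            if steps + 1 < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = steps + 1
                parent[nxt] = state
                heapq.heappush(
                    frontier, (steps + 1 + heuristic(nxt), steps + 1, nxt)
                )
    return None


# Toy demo: two robots on grid cells whose paths cross at cell (1, 0).
paths_demo = [
    [(0, 0), (1, 0), (2, 0)],
    [(1, 1), (1, 0), (1, -1)],
]
same_cell = lambda configs: len(set(configs)) < len(configs)
plan_demo = resolve_conflicts(paths_demo, same_cell)
```

In the demo, both robots would occupy cell (1, 0) if they advanced together, so the returned schedule makes one robot wait for a step. The search cost is the number of time steps, which corresponds to the makespan under these assumptions. The joint search space still grows exponentially with the number of robots, so this is only meant to show the structure of the problem.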
Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers
Recent advances in diffusion-based robot policies have demonstrated significant potential in imitating multi-modal behaviors. However, these approaches typically require large quantities of demonstration data paired with corresponding robot action labels, creating a substantial data collection burden. In this work, we propose a plan-then-control framework aimed at improving the action-data efficiency of inverse dynamics controllers by leveraging observational demonstration data. Specifically, we adopt a Deep Koopman Operator framework to model the dynamical system and utilize observation-only trajectories to learn a latent action representation. This latent representation can then be effectively mapped to real high-dimensional continuous actions using a linear action decoder, requiring minimal action-labeled data. Through experiments on simulated robot manipulation tasks and a real robot experiment with multi-modal expert demonstrations, we demonstrate that our approach significantly enhances action-data efficiency and achieves high task success rates with limited action data.
Self-Supervised Meta-Learning for All-Layer DNN-Based Adaptive Control with Stability Guarantees
A critical goal of adaptive control is enabling robots to rapidly adapt in dynamic environments. Recent studies have developed a meta-learning-based adaptive control scheme, which uses meta-learning to extract nonlinear features (represented by Deep Neural Networks (DNNs)) from offline data, and uses adaptive control to update linear coefficients online. However, such a scheme is fundamentally limited by the linear parameterization of uncertainties and does not fully unleash the capability of DNNs. This paper introduces a novel learning-based adaptive control framework that pretrains a DNN via self-supervised meta-learning (SSML) from offline trajectories and online adapts the full DNN via composite adaptation. In particular, the offline SSML stage leverages the time consistency in trajectory data to train the DNN to predict future disturbances from history, in a self-supervised manner without environment condition labels. The online stage carefully designs a control law and an adaptation law to update the full DNN with stability guarantees. Empirically, the proposed framework significantly outperforms (19-39%) various classic and learning-based adaptive control baselines, in challenging real-world quadrotor tracking problems under large dynamic wind disturbance.
Streamlined shape of cyborg cockroach promotes traversability in confined environments by gap negotiation
Centimeter-scale cyborg insects have a potential advantage for applications in narrow environments where humans cannot operate. To realize such tasks, researchers have developed small printed circuit boards (PCBs) that an insect can carry and that control it. The electronic components usually remain bare on the board, and the whole board is mounted on the platform animal, resulting in an uneven morphology of the whole cyborg with sharp edges. It is well known that a streamlined body shape in artificial vehicles or robots contributes to effective locomotion by reducing drag force in media. However, little is known about how the entire body shape impacts the locomotor performance of cyborg insects. Here, we developed a 10 mm by 10 mm board that provides electrical stimulation via Sub-GHz communication and investigated the impact of the physical arrangement of the board using the Madagascar hissing cockroach. We compared the success rate of gap negotiation between cyborgs with a mounted board and cyborgs with an implanted board and found that the latter outperformed the former. We demonstrated that our cyborg cockroach with an implanted board could faithfully follow locomotion commands delivered via antennal or cercal stimulation and traverse a narrow gap such as an air vent cover. In contrast to the conventional arrangement, our cyborg insects are suitable for application in concealed environments.
Force-Centric Imitation Learning with Force-Motion Capture System for Contact-Rich Manipulation ICRA 2025
In most contact-rich manipulation tasks, humans apply time-varying forces to the target object, compensating for inaccuracies in the vision-guided hand trajectory. However, current robot learning algorithms primarily focus on trajectory-based policy, with limited attention given to learning force-related skills. To address this limitation, we introduce ForceMimic, a force-centric robot learning system, providing a natural, force-aware and robot-free robotic demonstration collection system, along with a hybrid force-motion imitation learning algorithm for robust contact-rich manipulation. Using the proposed ForceCapture system, an operator can peel a zucchini in 5 minutes, while force-feedback teleoperation takes over 13 minutes and struggles with task completion. With the collected data, we propose HybridIL to train a force-centric imitation learning model, equipped with a hybrid force-position control primitive to fit the predicted wrench-position parameters during robot execution. Experiments demonstrate that our approach enables the model to learn a more robust policy for the contact-rich task of vegetable peeling, increasing the success rate by a relative 54.5% compared to state-of-the-art pure-vision-based imitation learning. Hardware, code, data and more results will be open-sourced on the project website at https://forcemimic.github.io.
comment: 8 pages, 7 figures, submitted to ICRA 2025, project website at https://forcemimic.github.io
G$^{2}$TR: Generalized Grounded Temporal Reasoning for Robot Instruction Following by Combining Large Pre-trained Models
Consider the scenario where a human cleans a table and a robot observing the scene is instructed with the task "Remove the cloth using which I wiped the table". Instruction following with temporal reasoning requires the robot to identify the relevant past object interaction, ground the object of interest in the present scene, and execute the task according to the human's instruction. Directly grounding utterances referencing past interactions to grounded objects is challenging due to the multi-hop nature of references to past interactions and the large space of object groundings in a video stream observing the robot's workspace. Our key insight is to factor the temporal reasoning task as (i) estimating the video interval associated with the event reference, (ii) performing spatial reasoning over the interaction frames to infer the intended object, and (iii) semantically tracking the object's location up to the current scene to enable future robot interactions. Our approach leverages existing large pre-trained models (which possess inherent generalization capabilities) and combines them appropriately for temporal grounding tasks. Evaluation on a video-language corpus acquired with a robot manipulator displaying rich temporal interactions in spatially-complex scenes displays an average accuracy of 70.10%. The dataset, code, and videos are available at https://reail-iitdelhi.github.io/temporalreasoning.github.io/ .
Autonomous Robotic System with Optical Coherence Tomography Guidance for Vascular Anastomosis
Vascular anastomosis, the surgical connection of blood vessels, is essential in procedures such as organ transplants and reconstructive surgeries. The precision required limits accessibility due to the extensive training needed, with manual suturing leading to variable outcomes and revision rates up to 7.9%. Existing robotic systems, while promising, are either fully teleoperated or lack the capabilities necessary for autonomous vascular anastomosis. We present the Micro Smart Tissue Autonomous Robot (micro-STAR), an autonomous robotic system designed to perform vascular anastomosis on small-diameter vessels. The micro-STAR system integrates a novel suturing tool equipped with Optical Coherence Tomography (OCT) fiber-optic sensor and a microcamera, enabling real-time tissue detection and classification. Our system autonomously places sutures and manipulates tissue with minimal human intervention. In an ex vivo study, micro-STAR achieved outcomes competitive with experienced surgeons in terms of leak pressure, lumen reduction, and suture placement variation, completing 90% of sutures without human intervention. This represents the first instance of a robotic system autonomously performing vascular anastomosis on real tissue, offering significant potential for improving surgical precision and expanding access to high-quality care.
comment: This paper was submitted to IEEE TMRB and is currently under review. There are 9 pages, 9 figures, and 2 tables
CE-MRS: Contrastive Explanations for Multi-Robot Systems
As the complexity of multi-robot systems grows to incorporate a greater number of robots, more complex tasks, and longer time horizons, the solutions to such problems often become too complex to be fully intelligible to human users. In this work, we introduce an approach for generating natural language explanations that justify the validity of the system's solution to the user, or else aid the user in correcting any errors that led to a suboptimal system solution. Toward this goal, we first contribute a generalizable formalism of contrastive explanations for multi-robot systems, and then introduce a holistic approach to generating contrastive explanations for multi-robot scenarios that selectively incorporates data from multi-robot task allocation, scheduling, and motion-planning to explain system behavior. Through user studies with human operators we demonstrate that our integrated contrastive explanation approach leads to significant improvements in user ability to identify and solve system errors, leading to significant improvements in overall multi-robot team performance.
comment: Accepted to IEEE Robotics and Automation Letters
Flying in air ducts
Air ducts are integral to modern buildings but are challenging to access for inspection. Small quadrotor drones offer a potential solution, as they can navigate both horizontal and vertical sections and smoothly fly over debris. However, hovering inside air ducts is problematic due to the airflow generated by the rotors, which recirculates inside the duct and destabilizes the drone, whereas hovering is a key feature for many inspection missions. In this article, we map the aerodynamic forces that affect a hovering drone in a duct using a robotic setup and a force/torque sensor. Based on the collected aerodynamic data, we identify a recommended position for stable flight, which corresponds to the bottom third for a circular duct. We then develop a neural network-based positioning system that leverages low-cost time-of-flight sensors. By combining these aerodynamic insights and the data-driven positioning system, we show that a small quadrotor drone (here, 180 mm) can hover and fly inside small air ducts, starting with a diameter of 350 mm. These results open a new and promising application domain for drones.
comment: Video: https://youtu.be/BLQqoa7Zolw
Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving? IROS 2024
Within a perception framework for autonomous mobile and robotic systems, semantic analysis of 3D point clouds typically generated by LiDARs is key to numerous applications, such as object detection and recognition, and scene reconstruction. Scene semantic segmentation can be achieved by directly integrating 3D spatial data with specialized deep neural networks. Although this type of data provides rich geometric information regarding the surrounding environment, it also presents numerous challenges: its unstructured and sparse nature, its unpredictable size, and its demanding computational requirements. These characteristics hinder the real-time semantic analysis, particularly on resource-constrained hardware architectures that constitute the main computational components of numerous robotic applications. Therefore, in this paper, we investigate various 3D semantic segmentation methodologies and analyze their performance and capabilities for resource-constrained inference on embedded NVIDIA Jetson platforms. We evaluate them for a fair comparison through a standardized training protocol and data augmentations, providing benchmark results on the Jetson AGX Orin and AGX Xavier series for two large-scale outdoor datasets: SemanticKITTI and nuScenes.
comment: Accepted to IROS 2024 PPNIV Workshop
Safe and Dynamically-Feasible Motion Planning using Control Lyapunov and Barrier Functions
This paper considers the problem of designing motion planning algorithms for control-affine systems that generate collision-free paths from an initial to a final destination and can be executed using safe and dynamically-feasible controllers. We introduce the C-CLF-CBF-RRT algorithm, which produces paths with such properties and leverages rapidly exploring random trees (RRTs), control Lyapunov functions (CLFs) and control barrier functions (CBFs). We show that C-CLF-CBF-RRT is computationally efficient for a variety of different dynamics and obstacles, and establish its probabilistic completeness. We showcase the performance of C-CLF-CBF-RRT in different simulation and hardware experiments.
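For readers unfamiliar with the ingredients named above, the following is the standard CLF-CBF quadratic program from the control barrier function literature, written for a control-affine system $\dot{x} = f(x) + g(x)u$. It is given only as background under stated assumptions: the abstract does not specify the exact formulation used inside C-CLF-CBF-RRT, and the symbols $V$ (a CLF), $h$ (a CBF whose safe set is $\{x : h(x) \ge 0\}$), the gains $\gamma, \alpha > 0$, the slack variable $\delta$, and the penalty weight $p > 0$ are all introduced here for illustration.

```latex
\begin{aligned}
\min_{u,\,\delta}\quad & \tfrac{1}{2}\,\|u\|^{2} + p\,\delta^{2} \\
\text{s.t.}\quad
 & L_f V(x) + L_g V(x)\,u + \gamma\,V(x) \le \delta, \\
 & L_f h(x) + L_g h(x)\,u + \alpha\,h(x) \ge 0 .
\end{aligned}
```

Here $L_f$ and $L_g$ denote Lie derivatives along $f$ and $g$. The first constraint drives the state toward the target and is relaxed by $\delta$ so that it never conflicts with the second, which keeps the state inside the safe set.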
DTactive: A Vision-Based Tactile Sensor with Active Surface ICRA 2025
The development of vision-based tactile sensors has significantly enhanced robots' perception and manipulation capabilities, especially for tasks requiring contact-rich interactions with objects. In this work, we present DTactive, a novel vision-based tactile sensor with active surfaces. DTactive inherits and modifies the tactile 3D shape reconstruction method of DTact while integrating a mechanical transmission mechanism that facilitates the mobility of its surface. Thanks to this design, the sensor is capable of simultaneously performing tactile perception and in-hand manipulation with surface movement. Leveraging the high-resolution tactile images from the sensor and the magnetic encoder data from the transmission mechanism, we propose a learning-based method to enable precise angular trajectory control during in-hand manipulation. In our experiments, we successfully achieved accurate rolling manipulation within the range of [-180°, 180°] on various objects, with the root mean square error between the desired and actual angular trajectories being less than 12° on nine trained objects and less than 19° on three novel objects. The results demonstrate the potential of DTactive for in-hand object manipulation in terms of effectiveness, robustness and precision.
comment: Submitted to ICRA 2025
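The metric quoted in the DTactive abstract, root mean square error between desired and actual angular trajectories, could be computed along the following lines. This is a minimal sketch, not the authors' evaluation code. It assumes both trajectories are sampled at the same time steps and expressed in degrees, and it wraps each difference into [-180, 180) so that angles near the ends of the range compare sensibly. The wrapping step and the function names are assumptions.

```python
import math


def wrap_deg(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def angular_rmse(desired, actual):
    """RMSE in degrees between two equally sampled angular trajectories."""
    if len(desired) != len(actual) or len(desired) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [wrap_deg(d - a) ** 2 for d, a in zip(desired, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical sample values, for illustration only.
demo_error = angular_rmse([0.0, 90.0], [3.0, 86.0])
```

Without the wrap, a desired angle of 179 degrees and an actual angle of -179 degrees would count as a 358 degree error rather than 2 degrees.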
Guiding Collision-Free Humanoid Multi-Contact Locomotion using Convex Kinematic Relaxations and Dynamic Optimization
Humanoid robots rely on multi-contact planners to navigate a diverse set of environments, including those that are unstructured and highly constrained. To synthesize stable multi-contact plans within a reasonable time frame, most planners assume statically stable motions or rely on reduced order models. However, these approaches can also render the problem infeasible in the presence of large obstacles or when operating near kinematic and dynamic limits. To that end, we propose a new multi-contact framework that leverages recent advancements in relaxing collision-free path planning into a convex optimization problem, extending it to be applicable to humanoid multi-contact navigation. Our approach generates near-feasible trajectories used as guides in a dynamic trajectory optimizer, altogether addressing the aforementioned limitations. We evaluate our computational approach showcasing three different-sized humanoid robots traversing a high-raised naval knee-knocker door using our proposed framework in simulation. Our approach can generate motion plans within a few seconds consisting of several multi-contact states, including dynamic feasibility in joint space.
comment: Accepted for publication in IEEE-RAS International Conference of Humanoid Robots (Humanoids 2024)
Modular Adaptive Aerial Manipulation under Unknown Dynamic Coupling Forces
Successful aerial manipulation largely depends on how effectively a controller can tackle the coupling dynamic forces between the aerial vehicle and the manipulator. However, this control problem has remained largely unsolved as the existing control approaches either require precise knowledge of the aerial vehicle/manipulator inertial couplings, or neglect the state-dependent uncertainties especially arising during the interaction phase. This work proposes an adaptive control solution to overcome this long standing control challenge without any a priori knowledge of the coupling dynamic terms. Additionally, in contrast to the existing adaptive control solutions, the proposed control framework is modular, that is, it allows independent tuning of the adaptive gains for the vehicle position sub-dynamics, the vehicle attitude sub-dynamics, and the manipulator sub-dynamics. Stability of the closed loop under the proposed scheme is derived analytically, and real-time experiments validate the effectiveness of the proposed scheme over the state-of-the-art approaches.
FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction
Humans effortlessly integrate common-sense knowledge with sensory input from vision and touch to understand their surroundings. Emulating this capability, we introduce FusionSense, a novel 3D reconstruction framework that enables robots to fuse priors from foundation models with highly sparse observations from vision and tactile sensors. FusionSense addresses three key challenges: (i) How can robots efficiently acquire robust global shape information about the surrounding scene and objects? (ii) How can robots strategically select touch points on the object using geometric and common-sense priors? (iii) How can partial observations such as tactile signals improve the overall representation of the object? Our framework employs 3D Gaussian Splatting as a core representation and incorporates a hierarchical optimization strategy involving global structure construction, object visual hull pruning and local geometric constraints. This advancement results in fast and robust perception in environments with traditionally challenging objects that are transparent, reflective, or dark, enabling more downstream manipulation or navigation tasks. Experiments on real-world data suggest that our framework outperforms previously state-of-the-art sparse-view methods. All code and data are open-sourced on the project website.
ROMAN: Open-Set Object Map Alignment for Robust View-Invariant Global Localization
Global localization is a fundamental capability required for long-term and drift-free robot navigation. However, current methods fail to relocalize when faced with significantly different viewpoints. We present ROMAN (Robust Object Map Alignment Anywhere), a robust global localization method capable of localizing in challenging and diverse environments based on creating and aligning maps of open-set and view-invariant objects. To address localization difficulties caused by feature-sparse or perceptually aliased environments, ROMAN formulates and solves a registration problem between object submaps using a unified graph-theoretic global data association approach that simultaneously accounts for object shape and semantic similarities and a prior on gravity direction. Through a set of challenging large-scale multi-robot or multi-session SLAM experiments in indoor, urban and unstructured/forested environments, we demonstrate that ROMAN achieves a maximum recall 36% higher than other object-based map alignment methods and an absolute trajectory error that is 37% lower than using visual features for loop closures. Our project page can be found at https://acl.mit.edu/ROMAN/.
comment: 8 pages, 7 figures
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
Developing agents capable of navigating to a target location based on language instructions and visual information, known as vision-language navigation (VLN), has attracted widespread interest. Most research has focused on ground-based agents, while UAV-based VLN remains relatively underexplored. Recent efforts in UAV vision-language navigation predominantly adopt ground-based VLN settings, relying on predefined discrete action spaces and neglecting the inherent disparities in agent movement dynamics and the complexity of navigation tasks between ground and aerial environments. To address these disparities and challenges, we propose solutions from three perspectives: platform, benchmark, and methodology. To enable realistic UAV trajectory simulation in VLN tasks, we propose the OpenUAV platform, which features diverse environments, realistic flight control, and extensive algorithmic support. We further construct a target-oriented VLN dataset consisting of approximately 12k trajectories on this platform, serving as the first dataset specifically designed for realistic UAV VLN tasks. To tackle the challenges posed by complex aerial environments, we propose an assistant-guided UAV object search benchmark called UAV-Need-Help, which provides varying levels of guidance information to help UAVs better accomplish realistic VLN tasks. We also propose a UAV navigation LLM that, given multi-view images, task descriptions, and assistant instructions, leverages the multimodal understanding capabilities of the MLLM to jointly process visual and textual information, and performs hierarchical trajectory generation. The evaluation results of our method significantly outperform the baseline models, while there remains a considerable gap between our results and those achieved by human operators, underscoring the challenge presented by the UAV-Need-Help task.
HBTP: Heuristic Behavior Tree Planning with Large Language Model Reasoning
Behavior Trees (BTs) are becoming an increasingly popular control structure in robotics due to their modularity, reactivity, and robustness. In terms of BT generation methods, BT planning shows promise for generating reliable BTs. However, the scalability of BT planning is often constrained by prolonged planning times in complex scenarios, largely due to a lack of domain knowledge. In contrast, pre-trained Large Language Models (LLMs) have demonstrated task reasoning capabilities across various domains, though the correctness and safety of their planning remain uncertain. This paper proposes integrating BT planning with LLM reasoning, introducing Heuristic Behavior Tree Planning (HBTP), a reliable and efficient framework for BT generation. The key idea in HBTP is to leverage LLMs for task-specific reasoning to generate a heuristic path, which BT planning can then follow to expand efficiently. We first introduce the heuristic BT expansion process, along with two heuristic variants designed for optimal planning and satisficing planning, respectively. Then, we propose methods to address the inaccuracies of LLM reasoning, including action space pruning and reflective feedback, to further enhance both reasoning accuracy and planning efficiency. Experiments demonstrate the theoretical bounds of HBTP, and results from four datasets confirm its practical effectiveness in everyday service robot applications.
Grounding Robot Policies with Visuomotor Language Guidance
Recent advances in the fields of natural language processing and computer vision have shown great potential in understanding the underlying dynamics of the world from large-scale internet data. However, translating this knowledge into robotic systems remains an open challenge, given the scarcity of human-robot interactions and the lack of large-scale datasets of real-world robotic data. Previous robot learning approaches such as behavior cloning and reinforcement learning have shown great capabilities in learning robotic skills from human demonstrations or from scratch in specific environments. However, these approaches often require task-specific demonstrations or designing complex simulation environments, which limits the development of generalizable and robust policies for new settings. Aiming to address these limitations, we propose an agent-based framework for grounding robot policies to the current context, considering the constraints of a current robot and its environment using visuomotor-grounded language guidance. The proposed framework is composed of a set of conversational agents designed for specific roles -- namely, high-level advisor, visual grounding, monitoring, and robotic agents. Given a base policy, the agents collectively generate guidance at run time to shift the action distribution of the base policy towards more desirable future states. We demonstrate that our approach can effectively guide manipulation policies to achieve significantly higher success rates both in simulation and in real-world experiments without the need for additional human demonstrations or extensive exploration. Project videos at https://sites.google.com/view/motorcortex/home.
comment: 19 pages, 6 figures, 1 table
Context-Aware Command Understanding for Tabletop Scenarios
This paper presents a novel hybrid algorithm designed to interpret natural human commands in tabletop scenarios. By integrating multiple sources of information, including speech, gestures, and scene context, the system extracts actionable instructions for a robot, identifying relevant objects and actions. The system operates in a zero-shot fashion, without reliance on predefined object models, enabling flexible and adaptive use in various environments. We assess the integration of multiple deep learning models, evaluating their suitability for deployment in real-world robotic setups. Our algorithm performs robustly across different tasks, combining language processing with visual grounding. In addition, we release a small dataset of video recordings used to evaluate the system. This dataset captures real-world interactions in which a human provides instructions in natural language to a robot, a contribution to future research on human-robot interaction. We discuss the strengths and limitations of the system, with particular focus on how it handles multimodal command interpretation, and its ability to be integrated into symbolic robotic frameworks for safe and explainable decision-making.
An Algorithm for Distributed Computation of Reachable Sets for Multi-Agent Systems
In this paper, we consider the problem of distributed reachable set computation for multi-agent systems (MASs) interacting over an undirected, stationary graph. A full state-feedback control input for such MASs depends not only on the current agent's state, but also on the states of its neighbors. However, in most MAS applications, the dynamics are obscured by individual agents. This makes reachable set computation, in a fully distributed manner, a challenging problem. We utilize the ideas of polytopic reachable set approximation and generalize them to a MAS setup. We formulate the resulting sub-problems in a fully distributed manner and provide convergence guarantees for the associated computations. The proposed algorithm's convergence is proved for two cases: static MAS graphs, and time-varying graphs under certain restrictions.
comment: 10 pages, 4 figures, 1 algorithm float. Preprint submitted to ACC 2025 for review
Demonstration Based Explainable AI for Learning from Demonstration Methods
Learning from Demonstration (LfD) is a powerful type of machine learning that can allow novices to teach and program robots to complete various tasks. However, the learning process for these systems may still be difficult for novices to interpret and understand, making effective teaching challenging. Explainable artificial intelligence (XAI) aims to address this challenge by explaining a system to the user. In this work, we investigate XAI within LfD by implementing an adaptive explanatory feedback system on an inverse reinforcement learning (IRL) algorithm. The feedback is implemented by demonstrating selected learnt trajectories to users. The system adapts to user teaching by categorizing and then selectively sampling trajectories shown to a user, to show a representative sample of both successful and unsuccessful trajectories. The system was evaluated through a user study with 26 participants teaching a robot a navigation task. The results of the user study demonstrated that the proposed explanatory feedback system can improve robot performance, teaching efficiency and user understanding of the robot.
comment: 8 Pages, 9 Figures, 2 Tables
A Planar-Symmetric SO(3) Representation for Learning Grasp Detection
Planar-symmetric hands, such as parallel grippers, are widely adopted in both research and industrial fields. Their symmetry, however, introduces ambiguity and discontinuity in the SO(3) representation, which hinders both the training and inference of neural-network-based grasp detectors. We propose a novel SO(3) representation that can parametrize a pair of planar-symmetric poses with a single parameter set by leveraging the 2D Bingham distribution. We also detail a grasp detector based on our representation, which provides a more consistent rotation output. An intensive evaluation with multiple grippers and objects in both the simulation and the real world quantitatively shows our approach's contribution.
comment: Accepted by CoRL2024
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
Vision-based robot policy learning, which maps visual inputs to actions, necessitates a holistic understanding of diverse visual tasks beyond single-task needs like classification or segmentation. Inspired by this, we introduce Theia, a vision foundation model for robot learning that distills multiple off-the-shelf vision foundation models trained on varied vision tasks. Theia's rich visual representations encode diverse visual knowledge, enhancing downstream robot learning. Extensive experiments demonstrate that Theia outperforms its teacher models and prior robot learning models using less training data and smaller model sizes. Additionally, we quantify the quality of pre-trained visual representations and hypothesize that higher entropy in feature norm distributions leads to improved robot learning performance. Code, models, and demo are available at https://theia.theaiinstitute.com.
comment: CoRL 2024
Unpacking Failure Modes of Generative Policies: Runtime Monitoring of Consistency and Progress
Robot behavior policies trained via imitation learning are prone to failure under conditions that deviate from their training data. Thus, algorithms that monitor learned policies at test time and provide early warnings of failure are necessary to facilitate scalable deployment. We propose Sentinel, a runtime monitoring framework that splits the detection of failures into two complementary categories: 1) Erratic failures, which we detect using statistical measures of temporal action consistency, and 2) task progression failures, where we use Vision Language Models (VLMs) to detect when the policy confidently and consistently takes actions that do not solve the task. Our approach has two key strengths. First, because learned policies exhibit diverse failure modes, combining complementary detectors leads to significantly higher accuracy at failure detection. Second, using a statistical temporal action consistency measure ensures that we quickly detect when multimodal, generative policies exhibit erratic behavior at negligible computational cost. In contrast, we only use VLMs to detect failure modes that are less time-sensitive. We demonstrate our approach in the context of diffusion policies trained on robotic mobile manipulation domains in both simulation and the real world. By unifying temporal consistency detection and VLM runtime monitoring, Sentinel detects 18% more failures than using either of the two detectors alone and significantly outperforms baselines, thus highlighting the importance of assigning specialized detectors to complementary categories of failure. Qualitative results are made available at https://sites.google.com/stanford.edu/sentinel.
comment: Project page: https://sites.google.com/stanford.edu/sentinel. 35 pages, 9 figures. Accepted to the Conference on Robot Learning (CoRL) 2024
Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models
In the field of robotics and computer vision, efficient and accurate semantic mapping remains a significant challenge due to the growing demand for intelligent machines that can comprehend and interact with complex environments. Conventional panoptic mapping methods, however, are limited by predefined semantic classes, thus making them ineffective for handling novel or unforeseen objects. In response to this limitation, we introduce the Unified Promptable Panoptic Mapping (UPPM) method. UPPM utilizes recent advances in foundation models to enable real-time, on-demand label generation using natural language prompts. By incorporating a dynamic labeling strategy into traditional panoptic mapping techniques, UPPM provides significant improvements in adaptability and versatility while maintaining high performance levels in map reconstruction. We demonstrate our approach on real-world and simulated datasets. Results show that UPPM can accurately reconstruct scenes and segment objects while generating rich semantic labels through natural language interactions. A series of ablation experiments validated the advantages of foundation model-based labeling over fixed label sets.
comment: This paper is under consideration at Pattern Recognition Letters
Improving Robotic Arms through Natural Language Processing, Computer Vision, and Edge Computing
This paper introduces a prototype for a new approach to assistive robotics, integrating edge computing with Natural Language Processing (NLP) and computer vision to enhance the interaction between humans and robotic systems. Our proof of concept demonstrates the feasibility of using large language models (LLMs) and vision systems in tandem for interpreting and executing complex commands conveyed through natural language. This integration aims to improve the intuitiveness and accessibility of assistive robotic systems, making them more adaptable to the nuanced needs of users with disabilities. By leveraging the capabilities of edge computing, our system has the potential to minimize latency and support offline capability, enhancing the autonomy and responsiveness of assistive robots. Experimental results from our implementation on a robotic arm show promising outcomes in terms of accurate intent interpretation and object manipulation based on verbal commands. This research lays the groundwork for future developments in assistive robotics, focusing on creating highly responsive, user-centric systems that can significantly improve the quality of life for individuals with disabilities. For video demonstrations and source code, please refer to: https://tinyurl.com/EnhancedArmEdgeNLP.
Deployment of Large Language Models to Control Mobile Robots at the Edge
This paper investigates the possibility of intuitive human-robot interaction through the application of Natural Language Processing (NLP) and Large Language Models (LLMs) in mobile robotics. This work aims to explore the feasibility of using these technologies for edge-based deployment, where traditional cloud dependencies are eliminated. The study specifically contrasts the performance of GPT-4-Turbo, which requires cloud connectivity, with an offline-capable, quantized version of LLaMA 2 (LLaMA 2-7B.Q5_K_M). The results show that GPT-4-Turbo delivers superior performance in interpreting and executing complex commands accurately, whereas LLaMA 2 exhibits significant limitations in consistency and reliability of command execution. Communication between the control computer and the mobile robot is established via a Raspberry Pi Pico W, which wirelessly receives commands from the computer without internet dependency and transmits them through a wired connection to the robot's Arduino controller. This study highlights the potential and challenges of implementing LLMs and NLP at the edge, providing groundwork for future research into fully autonomous and network-independent robotic systems. For video demonstrations and source code, please refer to: https://tinyurl.com/MobileRobotGPT4LLaMA2024.
AO-Grasp: Articulated Object Grasp Generation
We introduce AO-Grasp, a grasp proposal method that generates 6 DoF grasps that enable robots to interact with articulated objects, such as opening and closing cabinets and appliances. AO-Grasp consists of two main contributions: the AO-Grasp Model and the AO-Grasp Dataset. Given a segmented partial point cloud of a single articulated object, the AO-Grasp Model predicts the best grasp points on the object with an Actionable Grasp Point Predictor. Then, it finds corresponding grasp orientations for each of these points, resulting in stable and actionable grasp proposals. We train the AO-Grasp Model on our new AO-Grasp Dataset, which contains 78K actionable parallel-jaw grasps on synthetic articulated objects. In simulation, AO-Grasp achieves a 45.0% grasp success rate, whereas the highest performing baseline achieves a 35.0% success rate. Additionally, we evaluate AO-Grasp on 120 real-world scenes of objects with varied geometries, articulation axes, and joint states, where AO-Grasp produces successful grasps on 67.5% of scenes, while the baseline only produces successful grasps on 33.3% of scenes. To the best of our knowledge, AO-Grasp is the first method for generating 6 DoF grasps on articulated objects directly from partial point clouds without requiring part detection or hand-designed grasp heuristics. Project website: https://stanford-iprl-lab.github.io/ao-grasp
comment: Project website: https://stanford-iprl-lab.github.io/ao-grasp
Large Language Models for Orchestrating Bimanual Robots
Although there has been rapid progress in endowing robots with the ability to solve complex manipulation tasks, generating control policies for bimanual robots to solve tasks involving two hands is still challenging because of the difficulties in effective temporal and spatial coordination. With emergent abilities in terms of step-by-step reasoning and in-context learning, Large Language Models (LLMs) have demonstrated promising potential in a variety of robotic tasks. However, the nature of language communication via a single sequence of discrete symbols makes LLM-based coordination in continuous space a particular challenge for bimanual tasks. To tackle this challenge, we present LAnguage-model-based Bimanual ORchestration (LABOR), an agent utilizing an LLM to analyze task configurations and devise coordination control policies for addressing long-horizon bimanual tasks. We evaluate our method through simulated experiments involving two classes of long-horizon tasks using the NICOL humanoid robot. Our results demonstrate that our method outperforms the baseline in terms of success rate. Additionally, we thoroughly analyze failure cases, offering insights into LLM-based approaches in bimanual robotic control and revealing future research trends. The project website can be found at http://labor-agent.github.io.
comment: Accepted in Humanoids 2024. The project website can be found at http://labor-agent.github.io
Hybrid Gripper with Passive Pneumatic Soft Joints for Grasping Deformable Thin Objects
Grasping a variety of objects remains a key challenge in the development of versatile robotic systems. The human hand is remarkably dexterous, capable of grasping and manipulating objects with diverse shapes, mechanical properties, and textures. Inspired by how humans use two fingers to pick up thin and large objects such as fabric or sheets of paper, we aim to develop a gripper optimized for grasping such deformable objects. Observing how the soft and flexible fingertip joints of the hand approach and grasp thin materials, a hybrid gripper design that incorporates both soft and rigid components was proposed. The gripper utilizes a soft pneumatic ring wrapped around a rigid revolute joint to create a flexible two-fingered gripper. Experiments were conducted to characterize and evaluate the gripper performance in handling sheets of paper and other objects. Compared to rigid grippers, the proposed design improves grasping efficiency and reduces the gripping distance by up to eightfold.
DragTraffic: Interactive and Controllable Traffic Scene Generation for Autonomous Driving
Evaluating and training autonomous driving systems require diverse and scalable corner cases. However, most existing scene generation methods lack controllability, accuracy, and versatility, resulting in unsatisfactory generation results. Inspired by DragGAN in image generation, we propose DragTraffic, a generalized, interactive, and controllable traffic scene generation framework based on conditional diffusion. DragTraffic enables non-experts to generate a variety of realistic driving scenarios for different types of traffic agents through an adaptive mixture expert architecture. We employ a regression model to provide a general initial solution and a refinement process based on the conditional diffusion model to ensure diversity. User-customized context is introduced through cross-attention to ensure high controllability. Experiments on a real-world driving dataset show that DragTraffic outperforms existing methods in terms of authenticity, diversity, and freedom. Demo videos and code are available at https://chantsss.github.io/Dragtraffic/.
Safe Task Planning for Language-Instructed Multi-Robot Systems using Conformal Prediction
This paper addresses task planning problems for language-instructed robot teams. Tasks are expressed in natural language (NL), requiring the robots to apply their capabilities at various locations and semantic objects. Several recent works have addressed similar planning problems by leveraging pre-trained Large Language Models (LLMs) to design effective multi-robot plans. However, these approaches lack mission completion guarantees. To address this challenge, we introduce a new distributed LLM-based planner, called S-ATLAS for Safe plAnning for Teams of Language-instructed AgentS, that is capable of achieving user-defined mission success rates. This is accomplished by leveraging conformal prediction (CP), a distribution-free uncertainty quantification tool in black-box models. CP allows the proposed multi-robot planner to reason about its inherent uncertainty in a distributed fashion, enabling robots to make individual decisions when they are sufficiently certain and seek help otherwise. We show, both theoretically and empirically, that the proposed planner can achieve user-specified task success rates while minimizing the overall number of help requests. We provide comparative experiments against related works showing that our method is significantly more computationally efficient and achieves lower help rates. The advantage of our algorithm over baselines becomes more pronounced with increasing robot team size.
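As a rough illustration of the "act when certain, seek help otherwise" rule described above, the following is a minimal sketch of generic split conformal prediction over a discrete set of candidate actions. It is not the S-ATLAS planner: the function names, the nonconformity score (one minus the model's probability), and the example action labels are all assumptions made for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration samples for this alpha
    return sorted(cal_scores)[k - 1]


def prediction_set(probs, qhat):
    """Keep every candidate whose score 1 - p does not exceed the threshold."""
    return [label for label, p in probs.items() if 1.0 - p <= qhat]


def decide(probs, qhat):
    """Act when the prediction set is a singleton; otherwise request help."""
    candidates = prediction_set(probs, qhat)
    if len(candidates) == 1:
        return ("act", candidates[0])
    return ("ask_for_help", candidates)


# Hypothetical calibration scores (1 - probability assigned to the correct action).
qhat = conformal_threshold([0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.9], alpha=0.25)
print(decide({"pick": 0.7, "place": 0.2, "push": 0.1}, qhat))
print(decide({"pick": 0.5, "place": 0.5}, qhat))
```

Under the usual exchangeability assumption, a threshold chosen this way yields prediction sets that contain the correct action with probability at least 1 - alpha, which is what allows a target success rate to be specified in advance.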
Semantic Region Aware Autonomous Exploration for Multi-Type Map Construction in Unknown Indoor Environments
Mainstream autonomous exploration methods usually explore the same region repeatedly, leading to long exploration times and trajectories in complex scenes. To handle this issue, we propose a novel semantic region aware autonomous exploration method, the core idea of which is to use information about semantic regions to optimize the autonomous navigation strategy. Our method enables the mobile robot to fully explore the current semantic region before moving to the next region, which avoids excessively repeated exploration and accelerates exploration. In addition, compared with existing autonomous exploration methods that usually construct a single type of map, our method can construct four types of maps: point cloud map, occupancy grid map, topological map, and semantic map. The experimental results demonstrate that our method achieves up to a 50.7% reduction in exploration time and a 48.1% reduction in exploration trajectory length while maintaining a >98% exploration rate compared with the classical RRT (Rapidly-exploring Random Tree) based autonomous exploration method.
Co-Design Optimisation of Morphing Topology and Control of Winged Drones
The design and control of winged aircraft and drones is an iterative process aimed at identifying a compromise of mission-specific costs and constraints. When agility is required, shape-shifting (morphing) drones represent an efficient solution. However, morphing drones require the addition of actuated joints that increase the topology and control coupling, making the design process more complex. We propose a co-design optimisation method that assists the engineers by proposing a morphing drone's conceptual design that includes topology, actuation, morphing strategy, and controller parameters. The method consists of applying multi-objective constraint-based optimisation to a multi-body winged drone with trajectory optimisation to solve the motion intelligence problem under diverse flight mission requirements, such as energy consumption and mission completion time. We show that co-designed morphing drones outperform fixed-winged drones in terms of energy efficiency and mission time, suggesting that the proposed co-design method could be a useful addition to the aircraft engineering toolbox.
C$^3$P-VoxelMap: Compact, Cumulative and Coalescible Probabilistic Voxel Mapping
This work presents a compact, cumulative and coalescible probabilistic voxel mapping method to enhance performance, accuracy and memory efficiency in LiDAR odometry. Probabilistic voxel mapping requires storing past point clouds and re-iterating on them to update the uncertainty every iteration, which consumes large memory space and CPU cycles. To solve this problem, we propose a two-fold strategy. First, we introduce a compact point-free representation for probabilistic voxels and derive a cumulative update of the planar uncertainty without caching original point clouds. Our voxel structure only keeps track of a predetermined set of statistics for points that lie inside it. This method reduces the runtime complexity from $O(MN)$ to $O(N)$ and the space complexity from $O(N)$ to $O(1)$ where $M$ is the number of iterations and $N$ is the number of points. Second, to further minimize memory usage and enhance mapping accuracy, we provide a strategy to dynamically merge voxels associated with the same physical planes by taking advantage of the geometric features in the real world. Rather than scanning for these coalescible voxels constantly at every iteration, our merging strategy accumulates voxels in a locality-sensitive hash and triggers merging lazily. On-demand merging not only reduces memory footprint with minimal computational overhead but also improves localization accuracy thanks to cross-voxel denoising. Experiments show 20% higher accuracy, 20% faster performance and 70% lower memory consumption than the state-of-the-art.
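To make the idea of a point-free, cumulative and mergeable voxel concrete, here is a minimal sketch that keeps only a point count, coordinate sums, and sums of coordinate products, from which the mean and covariance can be recovered at any time. The abstract does not list the exact statistics the authors store, so the class `VoxelStats` and its fields are assumptions chosen for illustration, and the planar uncertainty itself is not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class VoxelStats:
    """Point-free voxel summary: count, coordinate sums, sums of products."""
    n: int = 0
    s: list = field(default_factory=lambda: [0.0] * 3)
    ss: list = field(default_factory=lambda: [[0.0] * 3 for _ in range(3)])

    def add(self, p):
        """Fold one point into the summary in O(1); the point is not stored."""
        self.n += 1
        for i in range(3):
            self.s[i] += p[i]
            for j in range(3):
                self.ss[i][j] += p[i] * p[j]

    def merge(self, other):
        """Coalesce two voxels by adding their statistics."""
        out = VoxelStats(self.n + other.n)
        out.s = [a + b for a, b in zip(self.s, other.s)]
        out.ss = [[a + b for a, b in zip(ra, rb)]
                  for ra, rb in zip(self.ss, other.ss)]
        return out

    def mean(self):
        return [v / self.n for v in self.s]

    def covariance(self):
        """Population covariance recovered from the running sums."""
        m = self.mean()
        return [[self.ss[i][j] / self.n - m[i] * m[j] for j in range(3)]
                for i in range(3)]


voxel = VoxelStats()
for point in [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]:
    voxel.add(point)
print(voxel.mean(), voxel.covariance()[0][0])
```

Storage per voxel is constant regardless of how many points have been folded in, and merging is a plain addition of statistics. A production version would likely use a numerically safer update, since subtracting products of means from accumulated sums can lose precision for points far from the origin.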
Learning to Plan Maneuverable and Agile Flight Trajectory with Optimization Embedded Networks
In recent times, an increasing number of researchers have been devoted to utilizing deep neural networks for end-to-end flight navigation. This approach has gained traction due to its ability to bridge the gap between perception and planning that exists in traditional methods, thereby eliminating delays between modules. However, the practice of replacing original modules with neural networks in a black-box manner diminishes the overall system's robustness and stability. It lacks principled explanations and often fails to consistently generate high-quality motion trajectories. Furthermore, such methods often struggle to rigorously account for the robot's kinematic constraints, resulting in the generation of trajectories that cannot be executed satisfactorily. In this work, we combine the advantages of traditional methods and neural networks by proposing an optimization-embedded neural network. This network can learn high-quality trajectories directly from visual inputs without the need for mapping, while ensuring dynamic feasibility. Here, the deep neural network is employed to directly extract environment safety regions from depth images. Subsequently, we employ a model-based approach to represent these regions as safety constraints in trajectory optimization. Leveraging the availability of highly efficient optimization algorithms, our method robustly converges to feasible and optimal solutions that satisfy various user-defined constraints. Moreover, we differentiate the optimization process, allowing it to be trained as a layer within the neural network. This approach facilitates the direct interaction between perception and planning, enabling the network to focus more on the spatial regions where optimal solutions exist. As a result, it further enhances the quality and stability of the generated trajectories.
comment: Some statements in the introduction may be controversial
Open-Vocabulary Action Localization with Iterative Visual Prompting
Video action localization aims to find the timings of specific actions from a long video. Although existing learning-based approaches have been successful, they require annotating videos, which comes with a considerable labor cost. This paper proposes a learning-free, open-vocabulary approach based on emerging off-the-shelf vision-language models (VLMs). The challenge stems from the fact that VLMs are neither designed to process long videos nor tailored for finding actions. We overcome these problems by extending an iterative visual prompting technique. Specifically, we sample video frames and create a concatenated image with frame index labels, then ask a VLM to pick the frame that is closest to the start or end of the action. Iterating this process while narrowing the sampling time window yields the specific frames corresponding to the start and end of an action. We demonstrate that this technique yields reasonable performance, achieving results comparable to state-of-the-art zero-shot action localization. These results illustrate a practical extension of VLMs for understanding videos. Sample code is available at https://microsoft.github.io/VLM-Video-Action-Localization/.
comment: 9 pages, 5 figures, 6 tables. Last updated on October 9th, 2024
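The window-narrowing loop described in the abstract above can be sketched as follows. This is only an illustration of the search structure, not the authors' code: `pick_frame` is a hypothetical callback standing in for the VLM query on the concatenated, index-labelled image, and the parameters `n_samples`, `shrink`, and `min_window` are assumed names and defaults.

```python
def localize_frame(pick_frame, t_start, t_end,
                   n_samples=5, shrink=0.5, min_window=1.0):
    """Narrow [t_start, t_end] around the sample pick_frame selects each round.

    pick_frame(times) is a hypothetical stand-in for the VLM query: it gets
    the sampled timestamps and returns the index of the one it judges
    closest to the event of interest (for example the start of an action).
    """
    lo, hi = t_start, t_end
    center = (lo + hi) / 2
    while hi - lo > min_window:
        step = (hi - lo) / (n_samples - 1)
        times = [lo + i * step for i in range(n_samples)]
        center = times[pick_frame(times)]
        # Shrink the window around the chosen sample, clamped to the video.
        half = (hi - lo) * shrink / 2
        lo = max(t_start, center - half)
        hi = min(t_end, center + half)
    return center


# Toy oracle replacing the VLM: picks the sample nearest a known start time.
true_start = 37.2
oracle = lambda times: min(range(len(times)),
                           key=lambda i: abs(times[i] - true_start))
print(localize_frame(oracle, 0.0, 100.0))
```

Running the same loop twice, once with a picker for the start and once for the end, gives both boundaries. Because the window shrinks by a fixed factor each round, the number of queries grows only logarithmically with the video length.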
SE(3) Linear Parameter Varying Dynamical Systems for Globally Asymptotically Stable End-Effector Control
Linear Parameter Varying Dynamical Systems (LPV-DS) encode trajectories into an autonomous first-order DS that enables reactive responses to perturbations, while ensuring globally asymptotic stability at the target. However, the current LPV-DS framework is established on Euclidean data only and has not been applicable to broader robotic applications requiring pose control. In this paper we present an extension to the current LPV-DS framework, named Quaternion-DS, which efficiently learns a DS-based motion policy for orientation. Leveraging techniques from differential geometry and Riemannian statistics, our approach properly handles the non-Euclidean orientation data in quaternion space, enabling the integration with positional control, namely SE(3) LPV-DS, so that the synergistic behaviour within the full SE(3) pose is preserved. Through simulation and real robot experiments, we validate our method, demonstrating its ability to efficiently and accurately reproduce the original SE(3) trajectory while exhibiting strong robustness to perturbations in task space.
Cross-Embodied Affordance Transfer through Learning Affordance Equivalences
Affordances represent the inherent effect and action possibilities that objects offer to the agents within a given context. From a theoretical viewpoint, affordances bridge the gap between effect and action, providing a functional understanding of the connections between the actions of an agent and its environment in terms of the effects it can cause. In this study, we propose a deep neural network model that unifies objects, actions, and effects into a single latent vector in a common latent space that we call the affordance space. Using the affordance space, our system can generate effect trajectories when action and object are given and can generate action trajectories when effect trajectories and objects are given. Rather than learning the behavior of individual objects acted upon by a single agent, our model forms a `shared affordance representation' spanning multiple agents and objects, which we call Affordance Equivalence. Affordance Equivalence facilitates not only action generalization over objects but also Cross Embodiment transfer linking actions of different robots. In addition to the simulation experiments that demonstrate the proposed model's range of capabilities, we also showcase that our model can be used for direct imitation in real-world settings.
comment: 10 pages, 9 figures, Submitted to IEEE Transactions on Cognitive and Developmental Systems
CitDet: A Benchmark Dataset for Citrus Fruit Detection
In this letter, we present a new dataset to advance the state of the art in detecting citrus fruit and accurately estimating yield on trees affected by the Huanglongbing (HLB) disease in orchard environments via imaging. Despite the fact that significant progress has been made in solving the fruit detection problem, the lack of publicly available datasets has complicated direct comparison of results. For instance, citrus detection has long been of interest to the agricultural research community, yet there is an absence of work, particularly involving public datasets of citrus affected by HLB. To address this issue, we enhance state-of-the-art object detection methods for use in typical orchard settings. Concretely, we provide high-resolution images of citrus trees located in an area known to be highly affected by HLB, along with high-quality bounding box annotations of citrus fruit. Fruit on both the trees and the ground are labeled to allow for identification of fruit location, which contributes to advancements in yield estimation and potential measure of HLB impact via fruit drop. The dataset consists of over 32,000 bounding box annotations for fruit instances contained in 579 high-resolution images. In summary, our contributions are the following: (i) we introduce a novel dataset along with baseline performance benchmarks on multiple contemporary object detection algorithms, (ii) we show the ability to accurately capture fruit location on tree or on ground, and finally (iii) we present a correlation of our results with yield estimations.
comment: To be published in IEEE Robotics and Automation Letters (RA-L)
On the Feedback Law in Stochastic Optimal Nonlinear Control
We consider the problem of nonlinear stochastic optimal control. This problem is thought to be fundamentally intractable owing to Bellman's "curse of dimensionality". We present a result that shows that repeatedly solving an open-loop deterministic problem from the current state with progressively shorter horizons, similar to Model Predictive Control (MPC), results in a feedback policy that is $O(\epsilon^4)$ near to the true global stochastic optimal policy, where $\epsilon$ is a perturbation parameter modulating the noise. We also show that the optimal deterministic feedback problem has a perturbation structure such that higher-order terms of the feedback law do not affect lower-order terms and that this structure is lost in the optimal stochastic feedback problem. Consequently, solving the Stochastic Dynamic Programming problem is highly susceptible to noise, even in low dimensional problems, and in practice, the MPC-type feedback law offers superior performance even for high noise levels.
comment: arXiv admin note: substantial text overlap with arXiv:2002.10505, arXiv:2002.09478
NAS: N-step computation of All Solutions to the footstep planning problem
How many ways are there to climb a staircase in a given number of steps? Infinitely many, if we focus on the continuous aspect of the problem. A finite, possibly large number if we consider the discrete aspect, \emph{i.e.} on which surface which effectors are going to step and in what order. We introduce NAS, an algorithm that considers both aspects simultaneously and computes \emph{all} the possible solutions to such a contact planning problem, under standard assumptions. To our knowledge NAS is the first algorithm to produce a globally optimal policy, efficiently queried in real time for planning the next footsteps of a humanoid robot. Our empirical results (in simulation and on the Talos platform) demonstrate that, despite the theoretical exponential complexity, optimisations reduce the practical complexity of NAS to a manageable bilinear form, maintaining completeness guarantees and enabling efficient GPU parallelisation. NAS is demonstrated in a variety of scenarios for the Talos robot, both in simulation and on the hardware platform. Future work will focus on further reducing computation times and extending the algorithm's applicability beyond gaited locomotion. Our video is available at \url{https://youtu.be/I5yFe0ez0sI}
comment: Accepted in Humanoids 2024
EqNIO: Subequivariant Neural Inertial Odometry
Neural networks are seeing rapid adoption in purely inertial odometry, where accelerometer and gyroscope measurements from commodity inertial measurement units (IMU) are used to regress displacements and associated uncertainties. They can learn informative displacement priors, which can be directly fused with the raw data with off-the-shelf non-linear filters. Nevertheless, these networks do not consider the physical roto-reflective symmetries inherent in IMU data, leading to the need to memorize the same priors for every possible motion direction, which hinders generalization. In this work, we characterize these symmetries and show that the IMU data and the resulting displacement and covariance transform equivariantly, when rotated around the gravity vector and reflected with respect to arbitrary planes parallel to gravity. We design a neural network that respects these symmetries by design through equivariant processing in three steps: First, it estimates an equivariant gravity-aligned frame from equivariant vectors and invariant scalars derived from IMU data, leveraging expressive linear and non-linear layers tailored to commute with the underlying symmetry transformation. We then map the IMU data into this frame, thereby achieving an invariant canonicalization that can be directly used with off-the-shelf inertial odometry networks. Finally, we map these network outputs back into the original frame, thereby obtaining equivariant covariances and displacements. We demonstrate the generality of our framework by applying it to the filter-based approach based on TLIO, and the end-to-end RONIN architecture, and show better performance on the TLIO, Aria, RIDI and OxIOD datasets than existing methods.
comment: 27 pages
Are Doppler Velocity Measurements Useful for Spinning Radar Odometry?
Spinning, frequency-modulated continuous-wave (FMCW) radars with 360 degree coverage have been gaining popularity for autonomous-vehicle navigation. However, unlike 'fixed' automotive radar, commercially available spinning radar systems typically do not produce radial velocities due to the lack of repeated measurements in the same direction and the fundamental hardware setup. To make these radial velocities observable, we modified the firmware of a commercial spinning radar to use triangular frequency modulation. In this paper, we develop a novel way to use this modulation to extract radial Doppler velocity measurements from consecutive azimuths of a radar intensity scan, without any data association. We show that these noisy, error-prone measurements contain enough information to provide good ego-velocity estimates, and incorporate these estimates into different modern odometry pipelines. We extensively evaluate the pipelines on over 110 km of driving data in progressively more geometrically challenging autonomous-driving environments. We show that Doppler velocity measurements improve odometry in well-defined geometric conditions and enable it to continue functioning even in severely geometrically degenerate environments, such as long tunnels.
comment: 8 pages, 7 figures, 2 tables, submitted to Robotics and Automation Letters (RA-L)
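The step from radial Doppler measurements to an ego-velocity estimate, described in the abstract above, can be illustrated with an ordinary least-squares fit. The following is a minimal sketch and not the paper's pipeline: it assumes a planar sensor, static targets, and the sign convention v_r = -(v_x cos a + v_y sin a) for a return at azimuth a. The function name `estimate_ego_velocity` is hypothetical.

```python
import math

def estimate_ego_velocity(azimuths, radial_velocities):
    """Least-squares fit of planar ego-velocity (vx, vy) from Doppler returns.

    Assumes static targets and the convention
    v_r = -(vx * cos(a) + vy * sin(a)) for azimuth a (radians).
    """
    sxx = sxy = syy = bx = by = 0.0
    for a, vr in zip(azimuths, radial_velocities):
        c, s = math.cos(a), math.sin(a)
        # Accumulate the 2x2 normal equations A^T A v = A^T b.
        sxx += c * c
        sxy += c * s
        syy += s * s
        bx += -c * vr
        by += -s * vr
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("azimuths do not span two independent directions")
    vx = (syy * bx - sxy * by) / det
    vy = (sxx * by - sxy * bx) / det
    return vx, vy
```

With noisy measurements, a robust variant (for example, one that rejects outliers before the fit) would likely be needed; the sketch only shows why no data association is required, since each azimuth contributes one linear constraint.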
Admissibility Over Winning: A New Approach to Reactive Synthesis in Robotics
Reactive synthesis is a framework for modeling and automatically synthesizing strategies in robotics, typically through computing a \emph{winning} strategy in a 2-player game between the robot and the environment. Winning strategies, however, do not always exist, even in some simple cases. In such situations, it is still desirable for the robot to attempt its task rather than "giving up". In this work, we explore the notion of admissibility to define strategies beyond winning, tailored specifically for robotic systems. We introduce an ordering of admissible strategies and define \emph{admissibly rational strategies}, which aim to be winning and cooperative when possible, and non-violating and hopeful when necessary. We present an efficient synthesis algorithm and demonstrate that admissibly rational strategies produce desirable behaviors through case studies.
comment: Incorrect claims were made in the paper's results section for the Tic-Tac-Toe case study. We are working on corrections, but this will take time
Systems and Control (CS)
Comparing Mass-Preserving Numerical Methods for the Lithium-Ion Battery Single Particle Model
The single particle model (SPM) is a reduced electrochemical model that holds promise for applications in battery management systems due to its ability to accurately capture battery dynamics; however, the numerical discretization of the SPM requires careful consideration to ensure numerical stability and accuracy. In this paper, we present a comparative study of two mass-preserving numerical schemes for the SPM: the finite volume method and the control volume method. Using numerical simulations, we systematically evaluate the performance of these schemes, after independently calibrating the SPM discretized with each scheme to experimental data, and find a tradeoff between accuracy (quantified by voltage root-mean-square error) and computational time. Our findings provide insights into the selection of numerical schemes for the SPM, contributing to the advancement of battery modeling and simulation techniques.
comment: 6 pages, 4 figures
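To make "mass-preserving" concrete, here is a minimal sketch of an explicit finite-volume step for spherical diffusion in a single particle. It is a generic textbook scheme, not necessarily the one evaluated in the paper. It assumes equal-width shells, a constant diffusivity `D`, and a prescribed outward surface flux `j`; the names `spm_fvm_step` and `total_amount` are hypothetical.

```python
import math

def spm_fvm_step(c, D, R, j, dt):
    """One explicit finite-volume step of spherical diffusion.

    c  : cell-average concentrations on len(c) equal-width shells
    D  : diffusivity (assumed constant)
    R  : particle radius
    j  : prescribed outward flux density at the surface r = R
    dt : time step
    """
    n = len(c)
    dr = R / n
    edges = [i * dr for i in range(n + 1)]
    # Flow rate through each face (amount per unit time, positive outward).
    flow = [0.0] * (n + 1)  # flow[0] stays 0: the face at r = 0 has zero area
    for i in range(1, n):
        area = 4.0 * math.pi * edges[i] ** 2
        flow[i] = -D * (c[i] - c[i - 1]) / dr * area
    flow[n] = j * 4.0 * math.pi * R ** 2
    new_c = []
    for i in range(n):
        vol = 4.0 * math.pi / 3.0 * (edges[i + 1] ** 3 - edges[i] ** 3)
        new_c.append(c[i] + dt * (flow[i] - flow[i + 1]) / vol)
    return new_c

def total_amount(c, R):
    """Total amount of species in the particle (sum of c_i * V_i)."""
    n = len(c)
    dr = R / n
    return sum(
        ci * 4.0 * math.pi / 3.0 * (((i + 1) * dr) ** 3 - (i * dr) ** 3)
        for i, ci in enumerate(c)
    )
```

Because every interior face flow is subtracted from one cell and added to its neighbour, the total amount changes only through the surface term, by `-dt * j * 4 * pi * R**2` per step.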
Probabilistically Input-to-State Stable Stochastic Model Predictive Control
Applying model predictive control to systems with unbounded, stochastic disturbances poses the challenge of guaranteeing safety, i.e., repeated feasibility and stability of the closed-loop system. In particular, there are no strict repeated feasibility guarantees for standard stochastic MPC formulations. Thus, traditional stability proofs are not straightforwardly applicable. We exploit the concept of input-to-state stability in probability and outline how it can be used to provide stability guarantees, circumventing the requirement for strict repeated feasibility guarantees. Loss of feasibility is captured by a back-up controller, which is explicitly taken into account in the stability analysis. We illustrate our findings using a numerical example.
comment: Extended version of a manuscript accepted for presentation at CDC 2024
The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies
We study mathematical models of binary direct collinear collisions of convex viscoplastic bodies based on two incremental collision laws that employ the Bouc-Wen differential model of hysteresis to represent the elastoplastic behavior of the materials of the colliding bodies. These collision laws are the Bouc-Wen-Simon-Hunt-Crossley collision law (BWSHCCL) and the Bouc-Wen-Maxwell collision law (BWMCL). The BWSHCCL comprises the Bouc-Wen model amended with the nonlinear Hertzian elastic spring element and connected in parallel to a nonlinear displacement-dependent and rate-dependent energy dissipation element. The BWMCL comprises the Bouc-Wen model amended with the nonlinear Hertzian elastic spring element and connected in series to a linear rate-dependent energy dissipation element. The mathematical models of the collision process are presented in the form of finite-dimensional initial value problems. We show that the models possess favorable analytical properties (e.g., global existence, uniqueness and boundedness of the solutions) under suitable restrictions on the ranges of their parameters. Furthermore, we show that excellent agreement can be achieved between the experimental data and the data from the numerical simulation of the mathematical models across a wide range of initial relative velocities and material properties of the colliding bodies while using parameterizations that are independent of the initial relative velocity.
comment: 15 pages; 4 figures; the associated code/data will be available at https://gitlab.com/user9716869/BWBCL from 11 October 2024
State Feedback System Level Synthesis in Continuous Time
System level synthesis (SLS) is a controller parameterization technique that facilitates distributed structured control via convex techniques. Results on SLS are primarily in the discrete-time setting; this paper extends SLS to the continuous-time setting. We translate the parametrization and associated constraints to continuous time, and propose a controller design procedure consisting of two steps: (1) pole selection and (2) optimization over closed-loops. We provide SLS reformulations of H2 and Hinf control, and show that the proposed procedure allows for convex design of structured H2 and Hinf controllers. We verify our methods in simulation on a grid of linearized swing equations. The resulting structured (i.e. sparse) controllers perform similarly to the centralized (i.e. dense) controllers (in some cases within 1\% cost). The proposed procedure preserves the scalability and disturbance-rejection features of the original discrete-time SLS framework.
comment: 8 pages, 6 figures, conference
Sensor-Based Safety-Critical Control using an Incremental Control Barrier Function Formulation via Reduced-Order Approximate Models
The existing control barrier function literature generally relies on precise mathematical models to guarantee system safety, limiting their applicability in scenarios with parametric uncertainties. While incremental control techniques have shown promise in addressing model uncertainties in flight control applications, translating these approaches to safety-critical control presents significant challenges. This paper bridges this gap by introducing measurement robust incremental control barrier functions (MRICBFs), which leverage sensor-based reduced-order models to provide formal safety guarantees for uncertain systems. By carefully addressing the challenges of sensor accuracy and approximation errors in the incremental formulation, our approach enables substituting specific model components with real-time sensor measurements while maintaining rigorous safety guarantees. This formulation overcomes the limitations of traditional adaptive control methods that adjust system parameters over time, enabling immediate and reliable safety measures for a particular class of model uncertainties. The efficacy of MRICBFs is demonstrated in two simulation case studies: a simple first-order system with time-varying sensor biases and a more complex overactuated hypersonic glide vehicle with multiple state constraints.
comment: 8 pages, 8 figures, submitted to the American Control Conference 2025
Second-Order Optimization via Quiescence
Second-order optimization methods exhibit fast convergence to critical points; however, in nonconvex optimization, these methods often require restrictive step-sizes to ensure a monotonically decreasing objective function. In the presence of highly nonlinear objective functions with large Lipschitz constants, increasingly small step-sizes become a bottleneck to fast convergence. We propose a second-order optimization method that utilizes a dynamic system model to represent the trajectory of optimization variables as an ODE. We then follow the quasi-steady state trajectory by forcing variables with the fastest rise time into a state known as quiescence. This optimization via quiescence allows us to adaptively select large step-sizes that sequentially follow each optimization variable to a quasi-steady state until all state variables reach the actual steady state, coinciding with the optimum. The result is a second-order method that utilizes large step-sizes and does not require a monotonically decreasing objective function to reach a critical point. Experimentally, we demonstrate the fast convergence of this approach for optimizing nonconvex problems in power systems and compare it to existing state-of-the-art second-order methods, including damped Newton-Raphson, BFGS, and SR1.
Probabilistic Satisfaction of Temporal Logic Constraints in Reinforcement Learning via Adaptive Policy-Switching
Constrained Reinforcement Learning (CRL) is a subset of machine learning that introduces constraints into the traditional reinforcement learning (RL) framework. Unlike conventional RL which aims solely to maximize cumulative rewards, CRL incorporates additional constraints that represent specific mission requirements or limitations that the agent must comply with during the learning process. In this paper, we address a type of CRL problem where an agent aims to learn the optimal policy to maximize reward while ensuring a desired level of temporal logic constraint satisfaction throughout the learning process. We propose a novel framework that relies on switching between pure learning (reward maximization) and constraint satisfaction. This framework estimates the probability of constraint satisfaction based on earlier trials and properly adjusts the probability of switching between learning and constraint satisfaction policies. We theoretically validate the correctness of the proposed algorithm and demonstrate its performance and scalability through comprehensive simulations.
A four-bodies motorcycle dynamic model for observer design
Motivated by the need to predict dangerous scenarios, this article introduces a non-linear dynamic model for motorcycles consisting of four rigid bodies. Using Jourdain's principle, the model incorporates both longitudinal and lateral dynamics, targeting a balance between numerical complexity and accuracy of representation. The paper further employs the model to design a Luenberger observer based on linear quadratic regulator theory, for estimating physical states based on sensor measurements. In turn, the state estimates are useful for predicting dangerous scenarios (lowside, highside, fall). The relevance of the approach is demonstrated through simulations of various rectilinear trajectories and a lane-changing scenario using BikeSim simulator.
comment: Keywords: motorcycle, modeling, observer, estimation, Jourdain's principle
Eco-driving Incentive Mechanisms for Mitigating Emissions in Urban Transportation
This paper proposes incentive mechanisms that promote eco-driving in transportation networks with the over-arching objective of minimizing emissions. The transportation system operator provides the drivers with energy-efficient driving guidance throughout their trips, and their eco-driving levels are measured by how closely they follow this guidance via vehicle telematics. Drivers choose their eco-driving levels to optimize a combination of their travel times and their emissions. To obtain optimal budget allocation and recommendations for the incentive mechanism, the system operator gathers drivers' preferences, or types, to assess each driver's trip urgency and natural willingness to eco-drive. In a setting where drivers truthfully report their types, we introduce the first-best incentive mechanism and show that the obedience condition holds (i.e., drivers find it optimal to comply with the system operator's recommendations) when the recommended eco-driving profile constitutes a Nash equilibrium. Moreover, in a setting where drivers can strategically report their types, we introduce the second-best incentive mechanism and show that the proposed mechanism is incentive-compatible (i.e., drivers find it optimal to be truthful). Under this mechanism, we also show that all equilibrium outcomes are at least as good as the recommended eco-driving profile in terms of the system operator's objective. Overall, this work offers a framework for designing eco-driving incentive mechanisms while considering both the strategic behavior of individual drivers and the network effects of collective decision-making.
Offline Hierarchical Reinforcement Learning via Inverse Optimization
Hierarchical policies enable strong performance in many sequential decision-making problems, such as those with high-dimensional action spaces, those requiring long-horizon planning, and settings with sparse rewards. However, learning hierarchical policies from static offline datasets presents a significant challenge. Crucially, actions taken by higher-level policies may not be directly observable within hierarchical controllers, and the offline dataset might have been generated using a different policy structure, hindering the use of standard offline learning algorithms. In this work, we propose OHIO: a framework for offline reinforcement learning (RL) of hierarchical policies. Our framework leverages knowledge of the policy structure to solve the inverse problem, recovering the unobservable high-level actions that likely generated the observed data under our hierarchical policy. This approach constructs a dataset suitable for off-the-shelf offline training. We demonstrate our framework on robotic and network optimization problems and show that it substantially outperforms end-to-end RL methods and improves robustness. We investigate a variety of instantiations of our framework, both in direct deployment of policies trained offline and when online fine-tuning is performed.
Robotic framework for autonomous manipulation of laboratory equipment with different degrees of transparency via 6D pose estimation
Many modern robotic systems operate autonomously; however, they often lack the ability to accurately analyze the environment and adapt to changing external conditions, while teleoperation systems often require special operator skills. In the field of laboratory automation, the number of automated processes is growing; however, such systems are usually developed to perform specific tasks. In addition, many of the objects used in this field are transparent, making it difficult to analyze them using visual channels. The contributions of this work include the development of a robotic framework with an autonomous mode for manipulating liquid-filled objects with different degrees of transparency in complex pose combinations. The conducted experiments demonstrated the robustness of the designed visual perception system in accurately estimating object poses for autonomous manipulation, and confirmed the performance of the algorithms in dexterous operations such as liquid dispensing. The proposed robotic framework can be applied to laboratory automation, since it allows solving the problem of performing non-trivial manipulation tasks with the analysis of object poses of varying degrees of transparency and liquid levels, requiring high accuracy and repeatability.
comment: Accepted to the 2024 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024), 8 pages, 11 figures
Reachability Analysis for Black-Box Dynamical Systems
Hamilton-Jacobi (HJ) reachability analysis is a powerful framework for ensuring safety and performance in autonomous systems. However, existing methods typically rely on a white-box dynamics model of the system, limiting their applicability in many practical robotics scenarios where only a black-box model of the system is available. In this work, we propose a novel reachability method to compute reachable sets and safe controllers for black-box dynamical systems. Our approach efficiently approximates the Hamiltonian function using samples from the black-box dynamics. This Hamiltonian is then used to solve the HJ Partial Differential Equation (PDE), providing the reachable set of the system. The proposed method can be applied to general nonlinear systems and can be seamlessly integrated with existing reachability toolboxes for white-box systems to extend their use to black-box systems. Through simulation studies on a black-box slip-wheel car and a quadruped robot, we demonstrate the effectiveness of our approach in accurately obtaining the reachable sets for black-box dynamical systems.
PHODCOS: Pythagorean Hodograph-based Differentiable Coordinate System
This paper presents PHODCOS, an algorithm that assigns a moving coordinate system to a given curve. The parametric functions underlying the coordinate system, i.e., the path function, the moving frame and its angular velocity, are exact -- approximation free -- differentiable, and sufficiently continuous. This allows for computing a coordinate system for highly nonlinear curves, while remaining compliant with autonomous navigation algorithms that require first and second order gradient information. In addition, the coordinate system obtained by PHODCOS is fully defined by a finite number of coefficients, which may then be used to compute additional geometric properties of the curve, such as arc-length, curvature, torsion, etc. Therefore, PHODCOS presents an appealing paradigm to enhance the geometrical awareness of existing guidance and navigation on-orbit spacecraft maneuvers. The PHODCOS algorithm is presented alongside an analysis of its error and approximation order, and thus, it is guaranteed that the obtained coordinate system matches the given curve within a desired tolerance. To demonstrate the applicability of the coordinate system resulting from PHODCOS, we present numerical examples in the Near Rectilinear Halo Orbit (NRHO) for the Lunar Gateway.
comment: Code: https://github.com/jonarriza96/phodcos
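The defining property of a Pythagorean hodograph (PH) curve, which is what makes quantities such as arc length exactly computable, can be shown in the simpler planar case. The sketch below is an illustration under assumptions and is not the PHODCOS algorithm (which concerns spatial curves and moving frames): it builds a planar hodograph x'(t) = u(t)^2 - v(t)^2, y'(t) = 2 u(t) v(t) from linear polynomials u and v, so that the parametric speed equals the polynomial u(t)^2 + v(t)^2. All function names are hypothetical.

```python
import math

def ph_hodograph(u, v, t):
    """Hodograph (x', y') of a planar PH curve at parameter t.

    u and v are (a0, a1) coefficient pairs of linear polynomials a0 + a1 * t.
    """
    ut = u[0] + u[1] * t
    vt = v[0] + v[1] * t
    return ut * ut - vt * vt, 2.0 * ut * vt

def ph_speed(u, v, t):
    """Parametric speed, which is a polynomial for a PH curve: u(t)^2 + v(t)^2."""
    ut = u[0] + u[1] * t
    vt = v[0] + v[1] * t
    return ut * ut + vt * vt

def ph_arc_length(u, v):
    """Exact arc length over t in [0, 1], from integrating the speed polynomial."""
    return (u[0] ** 2 + u[0] * u[1] + u[1] ** 2 / 3.0
            + v[0] ** 2 + v[0] * v[1] + v[1] ** 2 / 3.0)
```

Since the speed is a polynomial rather than the square root of one, the arc length needs no numerical quadrature; the closed form above follows from integrating (a0 + a1 t)^2 over [0, 1] for each of u and v.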
The Impact of Grid Storage on Balancing Costs and Carbon Emissions in Great Britain
Grid energy storage can help to balance supply and demand, but its financial viability and operational carbon emissions impact are poorly understood because of the complexity of grid constraints and market outcomes. We analyse the impact of several technologies (Li-ion and flow batteries, pumped hydro, hydrogen) on the Great Britain balancing mechanism, the main market for supply-demand balancing and congestion management. We find that, for many locations and technologies, financially optimal operation of storage for balancing can result in higher carbon emissions. For example, the extra emissions associated with a 1 MW 2-hour duration Li-ion battery in winter vary between +230 and -71 kgCO2/h. Although storage enables higher usage of renewables, it can also unlock additional demand leading to greater use of gas. In addition, balancing services alone are presently insufficient for financial viability of storage projects. This work highlights the need for market reform aligning financial incentives with environmental impacts.
A Visual Cooperative Localization Method for Airborne Magnetic Surveying Based on a Manifold Sensor Fusion Algorithm Using Lie Groups
Recent advancements in UAV technology have spurred interest in developing multi-UAV aerial surveying systems for use in confined environments where GNSS signals are blocked or jammed. This paper focuses on airborne magnetic surveying scenarios. To obtain clean magnetic measurements reflecting the Earth's magnetic field, the magnetic sensor must be isolated from other electronic devices, creating a significant localization challenge. We propose a visual cooperative localization solution. The solution incorporates a visual processing module and an improved manifold-based sensor fusion algorithm, delivering reliable and accurate positioning information. Real flight experiments validate the approach, demonstrating single-axis centimeter-level accuracy and decimeter-level overall 3D positioning accuracy.
comment: 12 pages
Parallel Digital Twin-driven Deep Reinforcement Learning for User Association and Load Balancing in Dynamic Wireless Networks
Optimization of user association in a densely deployed heterogeneous cellular network is usually challenging and even more complicated due to the dynamic nature of user mobility and fluctuation in user counts. While deep reinforcement learning (DRL) emerges as a promising solution, its application in practice is hindered by high trial-and-error costs in the real world and unsatisfactory physical network performance during training. In addition, existing DRL-based user association methods are usually only applicable to scenarios with a fixed number of users due to convergence and compatibility challenges. In this paper, we propose a parallel digital twin (DT)-driven DRL method for user association and load balancing in networks with dynamic user counts, distributions, and mobility patterns. Our method employs a distributed DRL strategy to handle varying user numbers and exploits a refined neural network structure for faster convergence. To address these DRL training-related challenges, we devise a high-fidelity DT construction technique, featuring a zero-shot generative user mobility model, named Map2Traj, based on a diffusion model. Map2Traj estimates user trajectory patterns and spatial distributions solely from street maps. Armed with this DT environment, DRL agents can be trained without the need for interactions with the physical network. To enhance the generalization ability of DRL models for dynamic scenarios, a parallel DT framework is further established to alleviate strong correlation and non-stationarity in single-environment training and improve the training efficiency. Numerical results show that the proposed parallel DT-driven DRL method achieves closely comparable performance to real environment training, and even outperforms models trained in a single real-world environment with nearly 20% gain in terms of cell-edge user performance.
comment: arXiv admin note: text overlap with arXiv:2407.19765
Design and Characterization of High Efficiency Single-stage Electromagnetic Coil Guns
This study presents several novel approaches to improve the efficiency of a single-stage coil gun. Conventional designs typically feature a uniformly wound solenoid and a ferrite projectile. For our research, we constructed a microcontroller-based prototype to test several new enhancements, including the use of a bipolar current pulse, a stepped multilayer coil with non-uniform winding densities, and the replacement of conventional ferrite projectiles with a neodymium permanent magnet. These modifications were designed to reduce energy loss and improve projectile acceleration by changing magnetic field strength and effectively controlling the magnetic flux. The experimental results show that the proposed methods resulted in significant efficiency improvements, with the varying current pulse and stepped coil design providing enhanced magnetic force at key points in the projectile's path, and the permanent magnet projectile contributing to higher velocities and efficiencies by leveraging the current pulses. Our findings suggest that combining these enhancements significantly improves coil gun performance, achieving higher velocities and efficiencies. These findings can be applied to future coil gun developments, such as multi-stage coil gun systems.
comment: 10 pages, 23 figures
Enhanced physics-informed neural networks (PINNs) for high-order power grid dynamics NeurIPS 2024
We develop improved physics-informed neural networks (PINNs) for high-order and high-dimensional power system models described by nonlinear ordinary differential equations. We propose some novel enhancements to improve PINN training and accuracy and also implement several other recently proposed ideas from the literature. We successfully apply these to study the transient dynamics of synchronous generators. We also make progress towards applying PINNs to advanced inverter models. Such enhanced PINNs can allow us to accelerate high-fidelity simulations needed to ensure a stable and reliable renewables-rich future grid.
comment: Accepted to the Tackling Climate Change with Machine Learning workshop at NeurIPS 2024
Autonomous Robotic System with Optical Coherence Tomography Guidance for Vascular Anastomosis
Vascular anastomosis, the surgical connection of blood vessels, is essential in procedures such as organ transplants and reconstructive surgeries. The precision required limits accessibility due to the extensive training needed, with manual suturing leading to variable outcomes and revision rates up to 7.9%. Existing robotic systems, while promising, are either fully teleoperated or lack the capabilities necessary for autonomous vascular anastomosis. We present the Micro Smart Tissue Autonomous Robot (micro-STAR), an autonomous robotic system designed to perform vascular anastomosis on small-diameter vessels. The micro-STAR system integrates a novel suturing tool equipped with Optical Coherence Tomography (OCT) fiber-optic sensor and a microcamera, enabling real-time tissue detection and classification. Our system autonomously places sutures and manipulates tissue with minimal human intervention. In an ex vivo study, micro-STAR achieved outcomes competitive with experienced surgeons in terms of leak pressure, lumen reduction, and suture placement variation, completing 90% of sutures without human intervention. This represents the first instance of a robotic system autonomously performing vascular anastomosis on real tissue, offering significant potential for improving surgical precision and expanding access to high-quality care.
comment: This paper was submitted to IEEE TMRB and is currently under review. There are 9 pages, 9 figures, and 2 tables
Safe and Dynamically-Feasible Motion Planning using Control Lyapunov and Barrier Functions
This paper considers the problem of designing motion planning algorithms for control-affine systems that generate collision-free paths from an initial to a final destination and can be executed using safe and dynamically-feasible controllers. We introduce the C-CLF-CBF-RRT algorithm, which produces paths with such properties and leverages rapidly exploring random trees (RRTs), control Lyapunov functions (CLFs) and control barrier functions (CBFs). We show that C-CLF-CBF-RRT is computationally efficient for a variety of different dynamics and obstacles, and establish its probabilistic completeness. We showcase the performance of C-CLF-CBF-RRT in different simulation and hardware experiments.
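The role a control barrier function plays in keeping a controller safe can be shown on a toy problem. This is a minimal sketch under stated assumptions and not the C-CLF-CBF-RRT algorithm: it uses a scalar single integrator x' = u, the safe set h(x) = x_max - x >= 0, and a linear class-K gain `alpha`. For this system the usual CBF quadratic program has a closed-form solution. The names `cbf_filter` and `simulate` are hypothetical.

```python
def cbf_filter(x, u_nom, x_max, alpha):
    """Minimally modify u_nom so that the CBF condition holds.

    For x' = u and h(x) = x_max - x, the condition h' >= -alpha * h
    reduces to u <= alpha * (x_max - x). The QP
    min (u - u_nom)^2 subject to that bound has this closed-form solution.
    """
    return min(u_nom, alpha * (x_max - x))

def simulate(x0, u_nom, x_max, alpha, dt, steps):
    """Forward-Euler rollout of the filtered closed loop; returns the trajectory."""
    x = x0
    traj = [x]
    for _ in range(steps):
        x += dt * cbf_filter(x, u_nom, x_max, alpha)
        traj.append(x)
    return traj
```

In this discretized rollout the state stays inside the safe set as long as `alpha * dt <= 1`, because each step shrinks the margin h by at most the factor (1 - alpha * dt).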
From Uncertainty to Innovation: Wearable Prototyping with ProtoBot
Despite AI advancements, individuals without software or hardware expertise still face barriers in designing wearable electronic devices due to the lack of code-free prototyping tools. To eliminate these barriers, we designed ProtoBot, leveraging large language models, and conducted a case study with four professionals from different disciplines through playful interaction. The study resulted in four unique wearable device concepts, with participants using ProtoBot to prototype selected components. From this experience, we learned that (1) uncertainty can be turned into a positive experience, (2) the ProtoBot should transform to reliably act as a guide, and (3) users need to adjust design parameters when interacting with the prototypes. Our work demonstrates, for the first time, the use of large language models in rapid prototyping of wearable electronics. We believe this approach will pioneer rapid prototyping without fear of uncertainties for people who want to develop both wearable prototypes and other products.
comment: 12 pages, 2 figures
IC3M: In-Car Multimodal Multi-object Monitoring for Abnormal Status of Both Driver and Passengers
Recently, in-car monitoring has emerged as a promising technology for detecting early-stage abnormal status of the driver and providing timely alerts to prevent traffic accidents. Although training models with multimodal data enhances the reliability of abnormal status detection, the scarcity of labeled data and the imbalance of class distribution impede the extraction of critical abnormal state features, significantly deteriorating training performance. Furthermore, missing modalities due to environment and hardware limitations further exacerbate the challenge of abnormal status identification. More importantly, monitoring abnormal health conditions of passengers, particularly in elderly care, is of paramount importance but remains underexplored. To address these challenges, we introduce our IC3M, an efficient camera-rotation-based multimodal framework for monitoring both driver and passengers in a car. Our IC3M comprises two key modules: an adaptive threshold pseudo-labeling strategy and a missing modality reconstruction. The former customizes pseudo-labeling thresholds for different classes based on the class distribution, generating class-balanced pseudo labels to guide model training effectively, while the latter leverages crossmodality relationships learned from limited labels to accurately recover missing modalities by distribution transferring from available modalities. Extensive experimental results demonstrate that IC3M outperforms state-of-the-art benchmarks in accuracy, precision, and recall while exhibiting superior robustness under limited labeled data and severe missing modality.
comment: 16 pages, 17 figures
An Algorithm for Distributed Computation of Reachable Sets for Multi-Agent Systems
In this paper, we consider the problem of distributed reachable set computation for multi-agent systems (MASs) interacting over an undirected, stationary graph. A full state-feedback control input for such MASs depends not only on the current agent's state, but also on the states of its neighbors. However, in most MAS applications, the dynamics are obscured by individual agents. This makes reachable set computation, in a fully distributed manner, a challenging problem. We utilize the ideas of polytopic reachable set approximation and generalize it to a MAS setup. We formulate the resulting sub-problems in a fully distributed manner and provide convergence guarantees for the associated computations. The proposed algorithm's convergence is proved for two cases: static MAS graphs, and time-varying graphs under certain restrictions.
comment: 10 pages, 4 figures, 1 algorithm float. Preprint submitted to ACC 2025 for review
Special Orthogonal Group SO(3), Euler Angles, Angle-axis, Rodriguez Vector and Unit-Quaternion: Overview, Mapping and Challenges
The attitude of a rigid body in three-dimensional space has a unique and global definition on the Special Orthogonal Group SO(3). This paper gives an overview of the rotation matrix, attitude kinematics and parameterization. The four most frequently used methods of attitude representations are discussed with detailed derivations, namely Euler angles, angle-axis parameterization, Rodriguez vector, and unit-quaternion. The mapping from one representation to others including SO(3) is given. Also, important results which could be useful for the process of filter and/or control design are given. The main weaknesses of attitude parameterization using Euler angles, angle-axis parameterization, Rodriguez vector, and unit-quaternion are illustrated. Keywords: Special Orthogonal Group 3, Euler angles, Angle-axis, Rodriguez Vector, Unit-quaternion, SO(3), Mapping, Parameterization, Attitude, Control, Filter, Observer, Estimator, Rotation, Rotational matrix, Transformation matrix, Orientation, Transformation, Roll, Pitch, Yaw, Quad-rotor, Unmanned aerial vehicle, Robot, spacecraft, satellite, UAV, Underwater vehicle, autonomous, system, Pose, literature review, survey, overview, comparison, comparative study, body frame, identity, origin, dynamics, kinematics, Lie group, inertial frame, zero, filter, control, estimate, observation, measurement, 3D, three dimensional space, advantage, disadvantage.
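Two of the mappings named in the abstract above, angle-axis to SO(3) and unit-quaternion to SO(3), can be sketched with the standard textbook formulas. The function names and the scalar-first quaternion convention `(w, x, y, z)` are assumptions for this sketch and are not taken from the paper.

```python
import math


def _matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def angle_axis_to_matrix(angle, axis):
    """Rodrigues' rotation formula: R = I + sin(a) K + (1 - cos(a)) K^2,
    where K is the skew-symmetric matrix of the normalized axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    x, y, z = (c / norm for c in axis)
    K = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    K2 = _matmul3(K, K)
    s, c = math.sin(angle), math.cos(angle)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]


def angle_axis_to_quaternion(angle, axis):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` about `axis`."""
    norm = math.sqrt(sum(c * c for c in axis))
    half = 0.5 * angle
    s = math.sin(half) / norm
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)


def quaternion_to_matrix(q):
    """Rotation matrix of a unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


# A 90-degree rotation about the z-axis, built through both routes.
R_axis = angle_axis_to_matrix(math.pi / 2, (0.0, 0.0, 1.0))
R_quat = quaternion_to_matrix(angle_axis_to_quaternion(math.pi / 2, (0.0, 0.0, 1.0)))
```

Both routes should produce the same matrix up to floating-point error, which is a convenient consistency check when implementing conversions between representations.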
The computation of approximate feedback Stackelberg equilibria in multi-player nonlinear constrained dynamic games
Solving feedback Stackelberg games with nonlinear dynamics and coupled constraints, a common scenario in practice, presents significant challenges. This work introduces an efficient method for computing approximate local feedback Stackelberg equilibria in multi-player general-sum dynamic games, with continuous state and action spaces. Different from existing (approximate) dynamic programming solutions that are primarily designed for unconstrained problems, our approach involves reformulating a feedback Stackelberg dynamic game into a sequence of nested optimization problems, enabling the derivation of Karush-Kuhn-Tucker (KKT) conditions and the establishment of a second-order sufficient condition for local feedback Stackelberg equilibria. We propose a Newton-style primal-dual interior point method for solving constrained linear quadratic (LQ) feedback Stackelberg games, offering provable convergence guarantees. Our method is further extended to compute local feedback Stackelberg equilibria for more general nonlinear games by iteratively approximating them using LQ games, ensuring that their KKT conditions are locally aligned with those of the original nonlinear games. We prove the exponential convergence of our algorithm in constrained nonlinear games. In a feedback Stackelberg game with nonlinear dynamics and (nonconvex) coupled costs and constraints, our experimental results reveal the algorithm's ability to handle infeasible initial conditions and achieve exponential convergence towards an approximate local feedback Stackelberg equilibrium.
comment: This manuscript has been accepted by SIAM Journal on Optimization
Coordinated Planning for Stability Enhancement in High IBR-Penetrated Systems
Security and stability challenges in future power systems with a high penetration of Inverter-Based Resources (IBRs) have been anticipated as one of the main barriers to decarbonization. Grid-following IBRs may become unstable under small disturbances in weak grids, while during transient processes, system stability and protection may be jeopardized due to the lack of sufficient Short-Circuit Current (SCC). To solve these challenges and achieve decarbonization, the future system has to be carefully planned. However, it remains unclear how both small-signal and transient stabilities can be considered during the system planning stage. In this context, this paper proposes a coordinated planning model of different resources in the transmission system, namely synchronous condensers and grid-forming (GFM) IBRs, to enhance system stability. The system strength and SCC constraints are analytically derived by considering the different characteristics of synchronous units and IBRs, and are then linearized through a novel data-driven approach, where an active sampling method is proposed to generate a representative data set. The significant economic value of the proposed coordinated planning framework in both system asset investment and system operation is demonstrated through detailed case studies.
VREM-FL: Mobility-Aware Computation-Scheduling Co-Design for Vehicular Federated Learning
Assisted and autonomous driving are rapidly gaining momentum and will soon become a reality. Artificial intelligence and machine learning are regarded as key enablers thanks to the massive amount of data that smart vehicles will collect from onboard sensors. Federated learning is one of the most promising techniques for training global machine learning models while preserving data privacy of vehicles and optimizing communications resource usage. In this article, we propose vehicular radio environment map federated learning (VREM-FL), a computation-scheduling co-design for vehicular federated learning that combines mobility of vehicles with 5G radio environment maps. VREM-FL jointly optimizes learning performance of the global model and wisely allocates communication and computation resources. This is achieved by orchestrating local computations at the vehicles in conjunction with transmission of their local models in an adaptive and predictive fashion, by exploiting radio channel maps. The proposed algorithm can be tuned to trade training time for radio resource usage. Experimental results demonstrate that VREM-FL outperforms literature benchmarks for both a linear regression model (learning time reduced by 28%) and a deep neural network for semantic image segmentation (doubling the number of model updates within the same time window).
comment: Copyright (c) 2024 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org
Semantic Region Aware Autonomous Exploration for Multi-Type Map Construction in Unknown Indoor Environments
Mainstream autonomous exploration methods usually explore the same region repeatedly, leading to long exploration times and trajectories in complex scenes. To handle this issue, we propose a novel semantic region aware autonomous exploration method, the core idea of which is to use information about semantic regions to optimize the autonomous navigation strategy. Our method enables the mobile robot to fully explore the current semantic region before moving to the next region, which helps avoid excessively repeated exploration and accelerates exploration. In addition, compared with existing autonomous exploration methods that usually construct a single type of map, our method constructs four types of maps: point cloud map, occupancy grid map, topological map, and semantic map. The experimental results demonstrate that our method achieves up to 50.7% exploration time reduction and 48.1% exploration trajectory length reduction while maintaining a >98% exploration rate compared with the classical RRT (Rapidly-exploring Random Tree) based autonomous exploration method.
Balancing Application Relevant and Sparsity Revealing Excitation in Input Design
The maximum absolute correlation between regressors, which is called mutual coherence, plays an essential role in sparse estimation. A regressor matrix whose columns are highly correlated may result from optimal input design, since there is no constraint on the mutual coherence, making it difficult to handle sparse estimation. This paper aims to tackle this issue for fixed denominator models, which include Laguerre, Kautz, and generalized orthonormal basis function expansion models, for example. The paper proposes an optimal input design method where the achieved Fisher information matrix is fitted to the desired Fisher matrix, together with a coordinate transformation designed to make the regressors in the transformed coordinates have low mutual coherence. The method can be used together with any sparse estimation method and any desired Fisher matrix. A numerical study shows its potential for alleviating the problem of model order selection when used in conjunction with, for example, classical methods such as the Akaike Information Criterion.
comment: Accepted to the IEEE Transactions on Automatic Control
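The mutual coherence mentioned in the abstract above can be computed directly from a regressor matrix. The sketch below assumes the common definition, the largest absolute normalized inner product between two distinct columns (without mean-centering); the function name and the column-list data layout are illustrative choices, not the paper's notation.

```python
import math


def mutual_coherence(columns):
    """Largest absolute normalized inner product between any two distinct
    columns; `columns` is a list of equal-length lists of floats."""
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            dot = sum(a * b for a, b in zip(columns[i], columns[j]))
            best = max(best, abs(dot) / (norms[i] * norms[j]))
    return best


# Orthogonal regressors have coherence 0; nearly parallel ones approach 1.
mu_orthogonal = mutual_coherence([[1.0, 0.0], [0.0, 1.0]])
mu_correlated = mutual_coherence([[1.0, 0.0], [1.0, 1.0]])
```

A value close to 1 means two regressors are almost indistinguishable, which is the situation the abstract identifies as problematic for sparse estimation.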
A Course in Dynamic Optimization
These lecture notes are derived from a graduate-level course in dynamic optimization, offering an introduction to techniques and models extensively used in management science, economics, operations research, engineering, and computer science. The course emphasizes the theoretical underpinnings of discrete-time dynamic programming models and advanced algorithmic strategies for solving these models. Unlike typical treatments, it provides a proof for the principle of optimality for upper semi-continuous dynamic programming, a middle ground between the simpler countable state space case \cite{bertsekas2012dynamic}, and the involved universally measurable case \cite{bertsekas1996stochastic}. This approach is sufficiently rigorous to include important examples such as dynamic pricing, consumption-savings, and inventory management models. The course also delves into the properties of value and policy functions, leveraging classical results \cite{topkis1998supermodularity} and recent developments. Additionally, it offers an introduction to reinforcement learning, including a formal proof of the convergence of Q-learning algorithms. Furthermore, the notes delve into policy gradient methods for the average reward case, presenting a convergence result for the tabular case in this context. This result is simple and similar to the discounted case but appears to be new.
Networked Communication for Decentralised Agents in Mean-Field Games
We introduce networked communication to the mean-field game framework, in particular to oracle-free settings where $N$ decentralised agents learn along a single, non-episodic run of the empirical system. We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases. We provide the order of the difference in these bounds in terms of network structure and number of communication rounds, and also contribute a policy-update stability guarantee. We discuss how the sample guarantees of the three theoretical algorithms do not actually result in practical convergence. We therefore show that in practical settings where the theoretical parameters are not observed (leading to poor estimation of the Q-function), our communication scheme significantly accelerates convergence over the independent case (and sometimes even the centralised case), without relying on the assumption of a centralised learner. We contribute further practical enhancements to all three theoretical algorithms, allowing us to present their first empirical demonstrations. Our experiments confirm that we can remove several of the theoretical assumptions of the algorithms, and display the empirical convergence benefits brought by our new networked communication. We additionally show that the networked approach has significant advantages, over both the centralised and independent alternatives, in terms of robustness to unexpected learning failures and to changes in population size.
A Family of Switching Pursuit Strategies for a Multi-Pursuer Single-Evader Game
This paper introduces a new family of pursuit strategies for multi-pursuer single-evader games in a planar environment. They leverage conditions under which the minimum-time solution of the game becomes equivalent to that of a suitable two-pursuer single-evader game. This enables the design of strategies in which the pursuers first aim to meet such conditions, and then transition to a two-pursuer game once they are satisfied. As a consequence, naive strategies that are in general unsuccessful, can be turned into winning strategies by switching to the appropriate two-pursuer game. Moreover, it is shown via numerical simulations that the switching mechanism significantly enhances the performance of existing pursuit algorithms, like those based on Voronoi partitions.
MPC using mixed-integer programming for aquifer thermal energy storages
Aquifer thermal energy storages (ATES) are used to temporarily store thermal energy in groundwater-saturated aquifers. Typically, two storages are combined, one for heat and one for cold, to support heating and cooling of buildings. This way, the use of classical fossil fuel-based heating, ventilation, and air conditioning can be significantly reduced. Exploiting the benefits of ATES beyond "seasonal" heating in winter and cooling in summer as well as meeting legislative restrictions requires sophisticated control. We propose a tailored model predictive control (MPC) scheme for the sustainable operation of ATES systems, which mainly builds on a novel model and objective function. The new approach leads to a mixed-integer quadratic program. Its performance is evaluated on real data from an ATES system in Belgium.
On the Feedback Law in Stochastic Optimal Nonlinear Control
We consider the problem of nonlinear stochastic optimal control. This problem is thought to be fundamentally intractable owing to Bellman's "curse of dimensionality". We present a result that shows that repeatedly solving an open-loop deterministic problem from the current state with progressively shorter horizons, similar to Model Predictive Control (MPC), results in a feedback policy that is $O(\epsilon^4)$ near to the true global stochastic optimal policy, where $\epsilon$ is a perturbation parameter modulating the noise. We also show that the optimal deterministic feedback problem has a perturbation structure such that higher-order terms of the feedback law do not affect lower-order terms and that this structure is lost in the optimal stochastic feedback problem. Consequently, solving the Stochastic Dynamic Programming problem is highly susceptible to noise, even in low dimensional problems, and in practice, the MPC-type feedback law offers superior performance even for high noise levels.
comment: arXiv admin note: substantial text overlap with arXiv:2002.10505, arXiv:2002.09478
Nationally Scalable Hydrogen Fueling Infrastructure Deployment: A Megaregion Analysis and Optimization Approach
Decarbonizing regional and long-haul freight faces challenges due to the limitations of battery-electric vehicles and infrastructure. Hydrogen fuel cell medium- and heavy-duty vehicles (MHDVs) present a promising alternative, aligning with the Department of Energy's decarbonization goals. Historically, alternative fuels like compressed natural gas and propane gas have seen slow adoption due to infrastructure barriers. To prevent similar setbacks, planning for zero-emission hydrogen fueling infrastructure is critical. This research develops plans for affordable and accessible hydrogen refueling stations, supporting the decarbonized freight system and benefiting underserved and rural communities by improving air quality, reducing noise pollution, and enhancing energy resilience. It provides a blueprint for replacing diesel in Class 8 trucks with hydrogen fueling solutions, focusing on the Texas Triangle Megaregion (I-45, I-35, I-10), the I-10 corridor between San Antonio, TX, and Los Angeles, CA, and the I-5/CA-99 corridors between Los Angeles and San Francisco. This area accounts for ~8.5% of U.S. heavy-duty freight volume. Using the OR-AGENT (Optimal Regional Architecture Generation for Electrified National Transport) framework, the study analyzes vehicles, freight networks, and energy systems. The framework integrates data on freight mobility, traffic, weather, and energy pathways to deliver optimized powertrain architectures and hydrogen fueling infrastructure deployment. It assesses all vehicle origin-destination pairs and feasible fueling station locations, using a genetic algorithm to identify the minimum number and optimal locations of hydrogen stations. It also determines fuel schedules and quantities, ensuring no vehicle is stranded. A deployment roadmap outlines strategic hydrogen refueling infrastructure rollout across multiple adoption scenarios.
Symbolic Regression on Sparse and Noisy Data with Gaussian Processes
In this paper, we address the challenge of deriving dynamical models from sparse and noisy data. High-quality data is crucial for symbolic regression algorithms; limited and noisy data can present modeling challenges. To overcome this, we combine Gaussian process regression with a sparse identification of nonlinear dynamics (SINDy) method to denoise the data and identify nonlinear dynamical equations. Our approach GPSINDy offers improved robustness with sparse, noisy data compared to SINDy alone. We demonstrate its effectiveness on simulation data from Lotka-Volterra and unicycle models and hardware data from an NVIDIA JetRacer system. We show superior performance over baselines including more than 50% improvement over SINDy and other baselines in predicting future trajectories from noise-corrupted and sparse 5 Hz data.
comment: Submitted to ACC 2025
Hybrid System Stability Analysis of Multi-Lane Mixed-Autonomy Traffic
Autonomous vehicles (AVs) hold vast potential to enhance transportation systems by reducing congestion, improving safety, and lowering emissions. AV controls lead to emergent traffic phenomena; one such intriguing phenomenon is traffic breaks (rolling roadblocks), where a single AV efficiently stabilizes multiple lanes through frequent lane switching, similar to the highway patrolling officers weaving across multiple lanes during difficult traffic conditions. While previous theoretical studies focus on single-lane mixed-autonomy systems, this work proposes a stability analysis framework for multi-lane systems under AV controls. Casting this problem into the hybrid system paradigm, the proposed analysis integrates continuous vehicle dynamics and discrete jumps from AV lane-switches. Through examining the influence of the lane-switch frequency on the system's stability, the analysis offers a principled explanation to the traffic break phenomena, and further discovers opportunities for less-intrusive traffic smoothing by employing less frequent lane-switching. The analysis further facilitates the design of traffic-aware AV lane-switch strategies to enhance system stability. Numerical analysis reveals a strong alignment between the theory and simulation, validating the effectiveness of the proposed stability framework in analyzing multi-lane mixed-autonomy traffic systems.
Practical identification approach for the actuation dynamics of autonomous surface vehicles with minimal instrumentation: extended version
A practical method for identifying the propeller model and inertia matrix of a marine Autonomous Surface Vehicle (ASV) is proposed in this work. Special attention is paid to limiting the instrumentation requirements. Based on a generic grey-box dynamic modelling addressing the considered catamaran-shaped ASV architecture, the static/dynamic behaviour of both propellers and the vessel dynamic are jointly estimated using the sole measurements of position, heading, and propellers pulse width modulation (PWM) signals. No accelerometer is required. Two distinct grey-box configurations involving either a static polynomial or a dynamic modelling of each propeller are proposed and compared. The resulting ASV identification methodology is shown to provide insight into the whole vessel inertial characteristics, which are key enablers in the development of autonomous navigation and control systems. Model validation was performed using data collected from the reported experiments. Model prediction errors related to both linear velocities and yaw rate are evaluated and compared based on given metrics. The results underscore the robustness and accuracy of the identified models in capturing the essential dynamics of the ASV, with a determination coefficient that consistently exceeds 0.94 for all estimated velocities.
Systems and Control (EESS)
Comparing Mass-Preserving Numerical Methods for the Lithium-Ion Battery Single Particle Model
The single particle model (SPM) is a reduced electrochemical model that holds promise for applications in battery management systems due to its ability to accurately capture battery dynamics; however, the numerical discretization of the SPM requires careful consideration to ensure numerical stability and accuracy. In this paper, we present a comparative study of two mass-preserving numerical schemes for the SPM: the finite volume method and the control volume method. Using numerical simulations, we systematically evaluate the performance of these schemes, after independently calibrating the SPM discretized with each scheme to experimental data, and find a tradeoff between accuracy (quantified by voltage root-mean-square error) and computational time. Our findings provide insights into the selection of numerical schemes for the SPM, contributing to the advancement of battery modeling and simulation techniques.
comment: 6 pages, 4 figures
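The mass-preservation property discussed in the abstract above can be illustrated with a generic explicit finite volume step for radial diffusion in a sphere. This is a textbook-style sketch under assumed names, units, and parameter values; it is not the discretization or calibration used in the paper.

```python
import math


def fv_step(c, dr, dt, diffusivity, surface_flux):
    """One explicit finite volume step for spherical diffusion.
    c[i] is the mean concentration in the shell between radii i*dr and
    (i+1)*dr; surface_flux is the prescribed outward flux at the surface."""
    n = len(c)
    faces = [i * dr for i in range(n + 1)]                 # shell boundaries
    area = [4.0 * math.pi * r * r for r in faces]          # face areas
    vol = [4.0 / 3.0 * math.pi * (faces[i + 1] ** 3 - faces[i] ** 3)
           for i in range(n)]                              # shell volumes
    flux = [0.0] * (n + 1)                                 # zero flux at the center
    for i in range(1, n):
        flux[i] = -diffusivity * (c[i] - c[i - 1]) / dr    # Fick's law at interior faces
    flux[n] = surface_flux
    return [c[i] + dt * (area[i] * flux[i] - area[i + 1] * flux[i + 1]) / vol[i]
            for i in range(n)]


def total_mass(c, dr):
    """Total amount of species stored in the particle."""
    return sum(ci * 4.0 / 3.0 * math.pi * (((i + 1) * dr) ** 3 - (i * dr) ** 3)
               for i, ci in enumerate(c))


# Hypothetical setup: unit-radius particle, uniform start, constant outflow.
n_cells, dr, dt, steps, j_out = 10, 0.1, 1e-4, 100, 0.5
conc = [1.0] * n_cells
mass_start = total_mass(conc, dr)
for _ in range(steps):
    conc = fv_step(conc, dr, dt, 1.0, j_out)
mass_end = total_mass(conc, dr)
```

Because every interior flux leaves one shell and enters its neighbor, the interior contributions cancel in the sum, and the total mass changes only by the amount that crosses the particle surface.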
Probabilistically Input-to-State Stable Stochastic Model Predictive Control
Applying model predictive control to systems with unbounded, stochastic disturbances poses the challenge of guaranteeing safety, i.e., repeated feasibility and stability of the closed-loop system. In particular, there are no strict repeated feasibility guarantees for standard stochastic MPC formulations. Thus, traditional stability proofs are not straightforwardly applicable. We exploit the concept of input-to-state stability in probability and outline how it can be used to provide stability guarantees, circumventing the requirement for strict repeated feasibility guarantees. Loss of feasibility is captured by a back-up controller, which is explicitly taken into account in the stability analysis. We illustrate our findings using a numerical example.
comment: Extended version of a manuscript accepted for presentation at CDC 2024
The Bouc-Wen Model for Binary Direct Collinear Collisions of Convex Viscoplastic Bodies
We study mathematical models of binary direct collinear collisions of convex viscoplastic bodies based on two incremental collision laws that employ the Bouc-Wen differential model of hysteresis to represent the elastoplastic behavior of the materials of the colliding bodies. These collision laws are the Bouc-Wen-Simon-Hunt-Crossley collision law (BWSHCCL) and the Bouc-Wen-Maxwell collision law (BWMCL). The BWSHCCL consists of the Bouc-Wen model amended with the nonlinear Hertzian elastic spring element and connected in parallel to a nonlinear displacement-dependent and rate-dependent energy dissipation element. The BWMCL consists of the Bouc-Wen model amended with the nonlinear Hertzian elastic spring element and connected in series to a linear rate-dependent energy dissipation element. The mathematical models of the collision process are presented in the form of finite-dimensional initial value problems. We show that the models possess favorable analytical properties (e.g., global existence, uniqueness and boundedness of the solutions) under suitable restrictions on the ranges of their parameters. Furthermore, we show that excellent agreement can be achieved between the experimental data and the data from the numerical simulation of the mathematical models across a wide range of initial relative velocities and material properties of the colliding bodies while using parameterizations that are independent of the initial relative velocity.
comment: 15 pages; 4 figures; the associated code/data will be available at https://gitlab.com/user9716869/BWBCL from 11 October 2024
State Feedback System Level Synthesis in Continuous Time
System level synthesis (SLS) is a controller parameterization technique that facilitates distributed structured control via convex techniques. Results on SLS are primarily in the discrete-time setting; this paper extends SLS to the continuous-time setting. We translate the parametrization and associated constraints to continuous time, and propose a controller design procedure consisting of two steps: (1) pole selection and (2) optimization over closed-loops. We provide SLS reformulations of H2 and Hinf control, and show that the proposed procedure allows for convex design of structured H2 and Hinf controllers. We verify our methods in simulation on a grid of linearized swing equations. The resulting structured (i.e. sparse) controllers perform similarly to the centralized (i.e. dense) controllers, in some cases within 1% of their cost. The proposed procedure preserves the scalability and disturbance-rejection features of the original discrete-time SLS framework.
comment: 8 pages, 6 figures, conference
Sensor-Based Safety-Critical Control using an Incremental Control Barrier Function Formulation via Reduced-Order Approximate Models
The existing control barrier function literature generally relies on precise mathematical models to guarantee system safety, limiting their applicability in scenarios with parametric uncertainties. While incremental control techniques have shown promise in addressing model uncertainties in flight control applications, translating these approaches to safety-critical control presents significant challenges. This paper bridges this gap by introducing measurement robust incremental control barrier functions (MRICBFs), which leverage sensor-based reduced-order models to provide formal safety guarantees for uncertain systems. By carefully addressing the challenges of sensor accuracy and approximation errors in the incremental formulation, our approach enables substituting specific model components with real-time sensor measurements while maintaining rigorous safety guarantees. This formulation overcomes the limitations of traditional adaptive control methods that adjust system parameters over time, enabling immediate and reliable safety measures for a particular class of model uncertainties. The efficacy of MRICBFs is demonstrated in two simulation case studies: a simple first-order system with time-varying sensor biases and a more complex overactuated hypersonic glide vehicle with multiple state constraints.
comment: 8 pages, 8 figures, submitted to the American Control Conference 2025
Second-Order Optimization via Quiescence
Second-order optimization methods exhibit fast convergence to critical points; however, in nonconvex optimization, these methods often require restrictive step-sizes to ensure a monotonically decreasing objective function. In the presence of highly nonlinear objective functions with large Lipschitz constants, increasingly small step-sizes become a bottleneck to fast convergence. We propose a second-order optimization method that utilizes a dynamic system model to represent the trajectory of optimization variables as an ODE. We then follow the quasi-steady state trajectory by forcing variables with the fastest rise time into a state known as quiescence. This optimization via quiescence allows us to adaptively select large step-sizes that sequentially follow each optimization variable to a quasi-steady state until all state variables reach the actual steady state, coinciding with the optimum. The result is a second-order method that utilizes large step-sizes and does not require a monotonically decreasing objective function to reach a critical point. Experimentally, we demonstrate the fast convergence of this approach on nonconvex problems in power systems and compare it to existing state-of-the-art second-order methods, including damped Newton-Raphson, BFGS, and SR1.
Probabilistic Satisfaction of Temporal Logic Constraints in Reinforcement Learning via Adaptive Policy-Switching
Constrained Reinforcement Learning (CRL) is a subset of machine learning that introduces constraints into the traditional reinforcement learning (RL) framework. Unlike conventional RL which aims solely to maximize cumulative rewards, CRL incorporates additional constraints that represent specific mission requirements or limitations that the agent must comply with during the learning process. In this paper, we address a type of CRL problem where an agent aims to learn the optimal policy to maximize reward while ensuring a desired level of temporal logic constraint satisfaction throughout the learning process. We propose a novel framework that relies on switching between pure learning (reward maximization) and constraint satisfaction. This framework estimates the probability of constraint satisfaction based on earlier trials and properly adjusts the probability of switching between learning and constraint satisfaction policies. We theoretically validate the correctness of the proposed algorithm and demonstrate its performance and scalability through comprehensive simulations.
A four-bodies motorcycle dynamic model for observer design
Motivated by the need to predict dangerous scenarios, this article introduces a non-linear dynamic model for motorcycles consisting of four rigid bodies. Using Jourdain's principle, the model incorporates both longitudinal and lateral dynamics, targeting a balance between numerical complexity and accuracy of representation. The paper further employs the model to design a Luenberger observer based on linear quadratic regulator theory, for estimating physical states based on sensor measurements. In turn, the state estimates are useful for predicting dangerous scenarios (lowside, highside, fall). The relevance of the approach is demonstrated through simulations of various rectilinear trajectories and a lane-changing scenario using BikeSim simulator.
comment: Keywords: motorcycle, modeling, observer, estimation, Jourdain's principle
Eco-driving Incentive Mechanisms for Mitigating Emissions in Urban Transportation
This paper proposes incentive mechanisms that promote eco-driving in transportation networks with the over-arching objective of minimizing emissions. The transportation system operator provides the drivers with energy-efficient driving guidance throughout their trips, and their eco-driving levels are measured by how closely they follow this guidance via vehicle telematics. Drivers choose their eco-driving levels to optimize a combination of their travel times and their emissions. To obtain optimal budget allocation and recommendations for the incentive mechanism, the system operator gathers drivers' preferences, or types, to assess each driver's trip urgency and natural willingness to eco-drive. In a setting where drivers truthfully report their types, we introduce the first-best incentive mechanism and show that the obedience condition holds (i.e., drivers find it optimal to comply with the system operator's recommendations) when the recommended eco-driving profile constitutes a Nash equilibrium. Moreover, in a setting where drivers can strategically report their types, we introduce the second-best incentive mechanism and show that the proposed mechanism is incentive-compatible (i.e., drivers find it optimal to be truthful). Under this mechanism, we also show that all equilibrium outcomes are at least as good as the recommended eco-driving profile in terms of the system operator's objective. Overall, this work offers a framework for designing eco-driving incentive mechanisms while considering both the strategic behavior of individual drivers and the network effects of collective decision-making.
Offline Hierarchical Reinforcement Learning via Inverse Optimization
Hierarchical policies enable strong performance in many sequential decision-making problems, such as those with high-dimensional action spaces, those requiring long-horizon planning, and settings with sparse rewards. However, learning hierarchical policies from static offline datasets presents a significant challenge. Crucially, actions taken by higher-level policies may not be directly observable within hierarchical controllers, and the offline dataset might have been generated using a different policy structure, hindering the use of standard offline learning algorithms. In this work, we propose OHIO: a framework for offline reinforcement learning (RL) of hierarchical policies. Our framework leverages knowledge of the policy structure to solve the inverse problem, recovering the unobservable high-level actions that likely generated the observed data under our hierarchical policy. This approach constructs a dataset suitable for off-the-shelf offline training. We demonstrate our framework on robotic and network optimization problems and show that it substantially outperforms end-to-end RL methods and improves robustness. We investigate a variety of instantiations of our framework, both in direct deployment of policies trained offline and when online fine-tuning is performed.
Robotic framework for autonomous manipulation of laboratory equipment with different degrees of transparency via 6D pose estimation
Many modern robotic systems operate autonomously; however, they often lack the ability to accurately analyze the environment and adapt to changing external conditions, while teleoperation systems often require special operator skills. In the field of laboratory automation, the number of automated processes is growing; however, such systems are usually developed to perform specific tasks. In addition, many of the objects used in this field are transparent, making it difficult to analyze them using visual channels. The contributions of this work include the development of a robotic framework with an autonomous mode for manipulating liquid-filled objects with different degrees of transparency in complex pose combinations. The conducted experiments demonstrated the robustness of the designed visual perception system in accurately estimating object poses for autonomous manipulation, and confirmed the performance of the algorithms in dexterous operations such as liquid dispensing. The proposed robotic framework can be applied to laboratory automation, since it makes it possible to perform non-trivial manipulation tasks that involve analyzing the poses of objects with varying degrees of transparency and liquid levels, and that require high accuracy and repeatability.
comment: Accepted to the 2024 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024), 8 pages, 11 figures
Reachability Analysis for Black-Box Dynamical Systems
Hamilton-Jacobi (HJ) reachability analysis is a powerful framework for ensuring safety and performance in autonomous systems. However, existing methods typically rely on a white-box dynamics model of the system, limiting their applicability in many practical robotics scenarios where only a black-box model of the system is available. In this work, we propose a novel reachability method to compute reachable sets and safe controllers for black-box dynamical systems. Our approach efficiently approximates the Hamiltonian function using samples from the black-box dynamics. This Hamiltonian is then used to solve the HJ Partial Differential Equation (PDE), providing the reachable set of the system. The proposed method can be applied to general nonlinear systems and can be seamlessly integrated with existing reachability toolboxes for white-box systems to extend their use to black-box systems. Through simulation studies on a black-box slip-wheel car and a quadruped robot, we demonstrate the effectiveness of our approach in accurately obtaining the reachable sets for black-box dynamical systems.
PHODCOS: Pythagorean Hodograph-based Differentiable Coordinate System
This paper presents PHODCOS, an algorithm that assigns a moving coordinate system to a given curve. The parametric functions underlying the coordinate system, i.e., the path function, the moving frame and its angular velocity, are exact -- approximation free -- differentiable, and sufficiently continuous. This allows for computing a coordinate system for highly nonlinear curves, while remaining compliant with autonomous navigation algorithms that require first and second order gradient information. In addition, the coordinate system obtained by PHODCOS is fully defined by a finite number of coefficients, which may then be used to compute additional geometric properties of the curve, such as arc-length, curvature, torsion, etc. Therefore, PHODCOS presents an appealing paradigm to enhance the geometrical awareness of existing guidance and navigation on-orbit spacecraft maneuvers. The PHODCOS algorithm is presented alongside an analysis of its error and approximation order, and thus, it is guaranteed that the obtained coordinate system matches the given curve within a desired tolerance. To demonstrate the applicability of the coordinate system resulting from PHODCOS, we present numerical examples in the Near Rectilinear Halo Orbit (NRHO) for the Lunar Gateway.
comment: Code: https://github.com/jonarriza96/phodcos
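As background for the entry above, the following is a minimal sketch of the textbook planar Pythagorean hodograph construction, not the PHODCOS algorithm itself (which targets spatial curves). It assumes linear polynomials u(t) and v(t); the function names and coefficients are hypothetical. The point it illustrates is why quantities such as arc length become exact: with x' = u^2 - v^2 and y' = 2uv, the speed sqrt(x'^2 + y'^2) equals the polynomial u^2 + v^2.

```python
import math


def ph_hodograph(t, u0, u1, v0, v1):
    """Return (x', y', speed) of a planar PH curve at parameter t.

    Assumes linear preimage polynomials u(t) = u0 + u1*t, v(t) = v0 + v1*t.
    """
    u = u0 + u1 * t
    v = v0 + v1 * t
    dx = u * u - v * v      # x'(t)
    dy = 2.0 * u * v        # y'(t)
    speed = u * u + v * v   # polynomial, equal to sqrt(dx^2 + dy^2)
    return dx, dy, speed


def ph_arc_length(u0, u1, v0, v1):
    """Exact arc length on [0, 1]: the integral of u(t)^2 + v(t)^2."""
    return (u0 * u0 + u0 * u1 + u1 * u1 / 3.0
            + v0 * v0 + v0 * v1 + v1 * v1 / 3.0)
```

Because the speed is a polynomial, the arc length needs no numerical quadrature, which is the kind of closed-form geometric property the abstract alludes to.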
The Impact of Grid Storage on Balancing Costs and Carbon Emissions in Great Britain
Grid energy storage can help to balance supply and demand, but its financial viability and its impact on operational carbon emissions are poorly understood because of the complexity of grid constraints and market outcomes. We analyse the impact of several technologies (Li-ion and flow batteries, pumped hydro, hydrogen) on the Great Britain balancing mechanism, the main market for supply-demand balancing and congestion management. We find that, for many locations and technologies, financially optimal operation of storage for balancing can result in higher carbon emissions. For example, the extra emissions associated with a 1 MW 2-hour duration Li-ion battery in winter vary between +230 and -71 kgCO2/h. Although storage enables higher usage of renewables, it can also unlock additional demand leading to greater use of gas. In addition, balancing services alone are presently insufficient for financial viability of storage projects. This work highlights the need for market reform aligning financial incentives with environmental impacts.
A Visual Cooperative Localization Method for Airborne Magnetic Surveying Based on a Manifold Sensor Fusion Algorithm Using Lie Groups
Recent advancements in UAV technology have spurred interest in developing multi-UAV aerial surveying systems for use in confined environments where GNSS signals are blocked or jammed. This paper focuses on airborne magnetic surveying scenarios. To obtain clean magnetic measurements reflecting the Earth's magnetic field, the magnetic sensor must be isolated from other electronic devices, creating a significant localization challenge. We propose a visual cooperative localization solution. The solution incorporates a visual processing module and an improved manifold-based sensor fusion algorithm, delivering reliable and accurate positioning information. Real flight experiments validate the approach, demonstrating single-axis centimeter-level accuracy and decimeter-level overall 3D positioning accuracy.
comment: 12 pages
Parallel Digital Twin-driven Deep Reinforcement Learning for User Association and Load Balancing in Dynamic Wireless Networks
Optimization of user association in a densely deployed heterogeneous cellular network is usually challenging and even more complicated due to the dynamic nature of user mobility and fluctuation in user counts. While deep reinforcement learning (DRL) emerges as a promising solution, its application in practice is hindered by high trial-and-error costs in the real world and unsatisfactory physical network performance during training. In addition, existing DRL-based user association methods are usually only applicable to scenarios with a fixed number of users due to convergence and compatibility challenges. In this paper, we propose a parallel digital twin (DT)-driven DRL method for user association and load balancing in networks with dynamic user counts, distributions, and mobility patterns. Our method employs a distributed DRL strategy to handle varying user numbers and exploits a refined neural network structure for faster convergence. To address these DRL training-related challenges, we devise a high-fidelity DT construction technique, featuring a zero-shot generative user mobility model, named Map2Traj, based on a diffusion model. Map2Traj estimates user trajectory patterns and spatial distributions solely from street maps. Armed with this DT environment, DRL agents can be trained without interacting with the physical network. To enhance the generalization ability of DRL models for dynamic scenarios, a parallel DT framework is further established to alleviate strong correlation and non-stationarity in single-environment training and improve the training efficiency. Numerical results show that the proposed parallel DT-driven DRL method achieves closely comparable performance to real environment training, and even outperforms those trained in a single real-world environment with nearly 20% gain in terms of cell-edge user performance.
comment: arXiv admin note: text overlap with arXiv:2407.19765
Design and Characterization of High Efficiency Single-stage Electromagnetic Coil Guns
This study presents several novel approaches to improve the efficiency of a single-stage coil gun. Conventional designs typically feature a uniformly wound solenoid and a ferrite projectile. For our research, we constructed a microcontroller-based prototype to test several new enhancements, including the use of a bipolar current pulse, a stepped multilayer coil with non-uniform winding densities, and the replacement of conventional ferrite projectiles with a neodymium permanent magnet. These modifications were designed to reduce energy loss and improve projectile acceleration by changing magnetic field strength and effectively controlling the magnetic flux. The experimental results show that the proposed methods resulted in significant efficiency improvements, with the varying current pulse and stepped coil design providing enhanced magnetic force at key points in the projectile's path, and the permanent magnet projectile contributing to higher velocities and efficiencies by leveraging the current pulses. Our findings suggest that combining these enhancements significantly improves coil gun performance, achieving higher velocities and efficiencies. These findings can be applied to future coil gun developments, such as multi-stage coil gun systems.
comment: 10 pages, 23 figures
Enhanced physics-informed neural networks (PINNs) for high-order power grid dynamics NeurIPS 2024
We develop improved physics-informed neural networks (PINNs) for high-order and high-dimensional power system models described by nonlinear ordinary differential equations. We propose some novel enhancements to improve PINN training and accuracy and also implement several other recently proposed ideas from the literature. We successfully apply these to study the transient dynamics of synchronous generators. We also make progress towards applying PINNs to advanced inverter models. Such enhanced PINNs can allow us to accelerate high-fidelity simulations needed to ensure a stable and reliable renewables-rich future grid.
comment: Accepted to the Tackling Climate Change with Machine Learning workshop at NeurIPS 2024
Autonomous Robotic System with Optical Coherence Tomography Guidance for Vascular Anastomosis
Vascular anastomosis, the surgical connection of blood vessels, is essential in procedures such as organ transplants and reconstructive surgeries. The precision required limits accessibility due to the extensive training needed, with manual suturing leading to variable outcomes and revision rates up to 7.9%. Existing robotic systems, while promising, are either fully teleoperated or lack the capabilities necessary for autonomous vascular anastomosis. We present the Micro Smart Tissue Autonomous Robot (micro-STAR), an autonomous robotic system designed to perform vascular anastomosis on small-diameter vessels. The micro-STAR system integrates a novel suturing tool equipped with an Optical Coherence Tomography (OCT) fiber-optic sensor and a microcamera, enabling real-time tissue detection and classification. Our system autonomously places sutures and manipulates tissue with minimal human intervention. In an ex vivo study, micro-STAR achieved outcomes competitive with experienced surgeons in terms of leak pressure, lumen reduction, and suture placement variation, completing 90% of sutures without human intervention. This represents the first instance of a robotic system autonomously performing vascular anastomosis on real tissue, offering significant potential for improving surgical precision and expanding access to high-quality care.
comment: This paper was submitted to IEEE TMRB and is currently under review. There are 9 pages, 9 figures, and 2 tables
Safe and Dynamically-Feasible Motion Planning using Control Lyapunov and Barrier Functions
This paper considers the problem of designing motion planning algorithms for control-affine systems that generate collision-free paths from an initial to a final destination and can be executed using safe and dynamically-feasible controllers. We introduce the C-CLF-CBF-RRT algorithm, which produces paths with such properties and leverages rapidly exploring random trees (RRTs), control Lyapunov functions (CLFs) and control barrier functions (CBFs). We show that C-CLF-CBF-RRT is computationally efficient for a variety of different dynamics and obstacles, and establish its probabilistic completeness. We showcase the performance of C-CLF-CBF-RRT in different simulation and hardware experiments.
From Uncertainty to Innovation: Wearable Prototyping with ProtoBot
Despite AI advancements, individuals without software or hardware expertise still face barriers in designing wearable electronic devices due to the lack of code-free prototyping tools. To eliminate these barriers, we designed ProtoBot, leveraging large language models, and conducted a case study with four professionals from different disciplines through playful interaction. The study resulted in four unique wearable device concepts, with participants using ProtoBot to prototype selected components. From this experience, we learned that (1) uncertainty can be turned into a positive experience, (2) ProtoBot should transform to reliably act as a guide, and (3) users need to adjust design parameters when interacting with the prototypes. Our work demonstrates, for the first time, the use of large language models in rapid prototyping of wearable electronics. We believe this approach will pioneer rapid prototyping without fear of uncertainties for people who want to develop both wearable prototypes and other products.
comment: 12 pages, 2 figures
IC3M: In-Car Multimodal Multi-object Monitoring for Abnormal Status of Both Driver and Passengers
Recently, in-car monitoring has emerged as a promising technology for detecting early-stage abnormal status of the driver and providing timely alerts to prevent traffic accidents. Although training models with multimodal data enhances the reliability of abnormal status detection, the scarcity of labeled data and the imbalance of class distribution impede the extraction of critical abnormal state features, significantly deteriorating training performance. Furthermore, missing modalities due to environment and hardware limitations further exacerbate the challenge of abnormal status identification. More importantly, monitoring abnormal health conditions of passengers, particularly in elderly care, is of paramount importance but remains underexplored. To address these challenges, we introduce IC3M, an efficient camera-rotation-based multimodal framework for monitoring both driver and passengers in a car. IC3M comprises two key modules: an adaptive threshold pseudo-labeling strategy and a missing modality reconstruction module. The former customizes pseudo-labeling thresholds for different classes based on the class distribution, generating class-balanced pseudo labels to guide model training effectively, while the latter leverages cross-modality relationships learned from limited labels to accurately recover missing modalities by distribution transferring from available modalities. Extensive experimental results demonstrate that IC3M outperforms state-of-the-art benchmarks in accuracy, precision, and recall while exhibiting superior robustness under limited labeled data and severely missing modalities.
comment: 16 pages, 17 figures
An Algorithm for Distributed Computation of Reachable Sets for Multi-Agent Systems
In this paper, we consider the problem of distributed reachable set computation for multi-agent systems (MASs) interacting over an undirected, stationary graph. A full state-feedback control input for such MASs depends not only on the current agent's state, but also on the states of its neighbors. However, in most MAS applications, the dynamics are obscured by individual agents. This makes reachable set computation, in a fully distributed manner, a challenging problem. We utilize the ideas of polytopic reachable set approximation and generalize them to a MAS setup. We formulate the resulting sub-problems in a fully distributed manner and provide convergence guarantees for the associated computations. The proposed algorithm's convergence is proved for two cases: static MAS graphs, and time-varying graphs under certain restrictions.
comment: 10 pages, 4 figures, 1 algorithm float. Preprint submitted to ACC 2025 for review
Special Orthogonal Group SO(3), Euler Angles, Angle-axis, Rodriguez Vector and Unit-Quaternion: Overview, Mapping and Challenges
The attitude of a rigid-body in the three dimensional space has a unique and global definition on the Special Orthogonal Group SO(3). This paper gives an overview of the rotation matrix, attitude kinematics and parameterization. The four most frequently used methods of attitude representations are discussed with detailed derivations, namely Euler angles, angle-axis parameterization, Rodriguez vector, and unit-quaternion. The mapping from one representation to others including SO(3) is given. Also, important results which could be useful for the process of filter and/or control design are given. The main weaknesses of attitude parameterization using Euler angles, angle-axis parameterization, Rodriguez vector, and unit-quaternion are illustrated. Keywords: Special Orthogonal Group 3, Euler angles, Angle-axis, Rodriguez Vector, Unit-quaternion, SO(3), Mapping, Parameterization, Attitude, Control, Filter, Observer, Estimator, Rotation, Rotational matrix, Transformation matrix, Orientation, Transformation, Roll, Pitch, Yaw, Quad-rotor, Unmanned aerial vehicle, Robot, spacecraft, satellite, UAV, Underwater vehicle, autonomous, system, Pose, literature review, survey, overview, comparison, comparative study, body frame, identity, origin, dynamics, kinematics, Lie group, inertial frame, zero, filter, control, estimate, observation, measurement, 3D, three dimensional space, advantage, disadvantage.
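The mappings named in the entry above can be illustrated with a minimal sketch using the standard textbook formulas; it is not taken from the paper. The function names are hypothetical, and the scalar-first quaternion ordering (w, x, y, z) is an assumed convention. The last function also shows one of the weaknesses the abstract mentions: the Rodriguez vector is undefined for a rotation of pi.

```python
import math


def _unit(axis):
    """Normalise a 3-vector so it can serve as a rotation axis."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)
    return x / n, y / n, z / n


def angle_axis_to_matrix(angle, axis):
    """Rotation matrix from angle-axis: R = I + sin(a) K + (1 - cos(a)) K^2."""
    x, y, z = _unit(axis)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]


def angle_axis_to_quaternion(angle, axis):
    """Unit quaternion (w, x, y, z); scalar-first ordering is an assumption."""
    x, y, z = _unit(axis)
    h = 0.5 * angle
    s = math.sin(h)
    return math.cos(h), s * x, s * y, s * z


def quaternion_to_matrix(q):
    """Rotation matrix from a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def quaternion_to_rodriguez(q):
    """Rodriguez vector axis * tan(angle / 2); singular at angle = pi."""
    w, x, y, z = q
    if abs(w) < 1e-12:
        raise ValueError("Rodriguez vector is undefined for a rotation of pi")
    return x / w, y / w, z / w
```

Going from angle-axis to a matrix directly, or via the quaternion, should give the same result up to rounding, which is a quick consistency check on any implementation of these mappings.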
The computation of approximate feedback Stackelberg equilibria in multi-player nonlinear constrained dynamic games
Solving feedback Stackelberg games with nonlinear dynamics and coupled constraints, a common scenario in practice, presents significant challenges. This work introduces an efficient method for computing approximate local feedback Stackelberg equilibria in multi-player general-sum dynamic games, with continuous state and action spaces. Different from existing (approximate) dynamic programming solutions that are primarily designed for unconstrained problems, our approach involves reformulating a feedback Stackelberg dynamic game into a sequence of nested optimization problems, enabling the derivation of Karush-Kuhn-Tucker (KKT) conditions and the establishment of a second-order sufficient condition for local feedback Stackelberg equilibria. We propose a Newton-style primal-dual interior point method for solving constrained linear quadratic (LQ) feedback Stackelberg games, offering provable convergence guarantees. Our method is further extended to compute local feedback Stackelberg equilibria for more general nonlinear games by iteratively approximating them using LQ games, ensuring that their KKT conditions are locally aligned with those of the original nonlinear games. We prove the exponential convergence of our algorithm in constrained nonlinear games. In a feedback Stackelberg game with nonlinear dynamics and (nonconvex) coupled costs and constraints, our experimental results reveal the algorithm's ability to handle infeasible initial conditions and achieve exponential convergence towards an approximate local feedback Stackelberg equilibrium.
comment: This manuscript has been accepted by SIAM Journal on Optimization
Coordinated Planning for Stability Enhancement in High IBR-Penetrated Systems
Security and stability challenges in future power systems with high penetration of Inverter-Based Resources (IBRs) have been anticipated as one of the main barriers to decarbonization. Grid-following IBRs may become unstable under small disturbances in weak grids, while during transient processes, system stability and protection may be jeopardized due to the lack of sufficient Short-Circuit Current (SCC). To solve these challenges and achieve decarbonization, the future system has to be carefully planned. However, it remains unclear how both small-signal and transient stabilities can be considered during the system planning stage. In this context, this paper proposes a coordinated planning model of different resources in the transmission system, namely synchronous condensers and grid-forming (GFM) IBRs, to enhance system stability. The system strength and SCC constraints are analytically derived by considering the different characteristics of synchronous units and IBRs, and are then effectively linearized through a novel data-driven approach, where an active sampling method is proposed to generate a representative data set. The significant economic value of the proposed coordinated planning framework in both system asset investment and system operation is demonstrated through detailed case studies.
VREM-FL: Mobility-Aware Computation-Scheduling Co-Design for Vehicular Federated Learning
Assisted and autonomous driving are rapidly gaining momentum and will soon become a reality. Artificial intelligence and machine learning are regarded as key enablers thanks to the massive amount of data that smart vehicles will collect from onboard sensors. Federated learning is one of the most promising techniques for training global machine learning models while preserving data privacy of vehicles and optimizing communications resource usage. In this article, we propose vehicular radio environment map federated learning (VREM-FL), a computation-scheduling co-design for vehicular federated learning that combines mobility of vehicles with 5G radio environment maps. VREM-FL jointly optimizes learning performance of the global model and wisely allocates communication and computation resources. This is achieved by orchestrating local computations at the vehicles in conjunction with transmission of their local models in an adaptive and predictive fashion, by exploiting radio channel maps. The proposed algorithm can be tuned to trade training time for radio resource usage. Experimental results demonstrate that VREM-FL outperforms literature benchmarks for both a linear regression model (learning time reduced by 28%) and a deep neural network for semantic image segmentation (doubling the number of model updates within the same time window).
comment: Copyright (c) 2024 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org
Semantic Region Aware Autonomous Exploration for Multi-Type Map Construction in Unknown Indoor Environments
Mainstream autonomous exploration methods usually explore the same region repeatedly, leading to long exploration times and trajectories in complex scenes. To handle this issue, we propose a novel semantic region aware autonomous exploration method, the core idea of which is to use information about semantic regions to optimize the autonomous navigation strategy. Our method enables the mobile robot to fully explore the current semantic region before moving to the next region, which helps avoid excessively repeated exploration and speeds up exploration. In addition, compared with existing autonomous exploration methods that usually construct a single type of map, our method constructs four types of maps: a point cloud map, an occupancy grid map, a topological map, and a semantic map. The experimental results demonstrate that our method achieves up to a 50.7% reduction in exploration time and a 48.1% reduction in exploration trajectory length while maintaining a >98% exploration rate, compared with the classical RRT (Rapidly-exploring Random Tree) based autonomous exploration method.
Balancing Application Relevant and Sparsity Revealing Excitation in Input Design
The maximum absolute correlation between regressors, which is called mutual coherence, plays an essential role in sparse estimation. A regressor matrix whose columns are highly correlated may result from optimal input design, since there is no constraint on the mutual coherence, making it difficult to handle sparse estimation. This paper aims to tackle this issue for fixed denominator models, which include Laguerre, Kautz, and generalized orthonormal basis function expansion models, for example. The paper proposes an optimal input design method where the achieved Fisher information matrix is fitted to the desired Fisher matrix, together with a coordinate transformation designed to make the regressors in the transformed coordinates have low mutual coherence. The method can be used together with any sparse estimation method and any desired Fisher matrix. A numerical study shows its potential for alleviating the problem of model order selection when used in conjunction with, for example, classical methods such as the Akaike Information Criterion.
comment: Accepted to the IEEE Transactions on Automatic Control
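The mutual coherence referred to in the entry above, the maximum absolute correlation between regressors, can be computed as in the following minimal sketch. It assumes the regressor matrix is given as a list of columns and uses the normalised inner product between distinct columns; the function name is hypothetical, and the sketch covers only the quantity itself, not the paper's input design method.

```python
import math


def mutual_coherence(columns):
    """Largest |<a_i, a_j>| / (||a_i|| ||a_j||) over distinct column pairs."""
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            dot = sum(a * b for a, b in zip(columns[i], columns[j]))
            best = max(best, abs(dot) / (norms[i] * norms[j]))
    return best
```

A value near 0 means the regressors are close to orthogonal, while a value near 1 means at least two regressors are almost collinear, which is the situation that makes sparse estimation difficult.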
A Course in Dynamic Optimization
These lecture notes are derived from a graduate-level course in dynamic optimization, offering an introduction to techniques and models extensively used in management science, economics, operations research, engineering, and computer science. The course emphasizes the theoretical underpinnings of discrete-time dynamic programming models and advanced algorithmic strategies for solving these models. Unlike typical treatments, it provides a proof for the principle of optimality for upper semi-continuous dynamic programming, a middle ground between the simpler countable state space case \cite{bertsekas2012dynamic}, and the involved universally measurable case \cite{bertsekas1996stochastic}. This approach is sufficiently rigorous to include important examples such as dynamic pricing, consumption-savings, and inventory management models. The course also delves into the properties of value and policy functions, leveraging classical results \cite{topkis1998supermodularity} and recent developments. Additionally, it offers an introduction to reinforcement learning, including a formal proof of the convergence of Q-learning algorithms. Furthermore, the notes delve into policy gradient methods for the average reward case, presenting a convergence result for the tabular case in this context. This result is simple and similar to the discounted case but appears to be new.
Networked Communication for Decentralised Agents in Mean-Field Games
We introduce networked communication to the mean-field game framework, in particular to oracle-free settings where $N$ decentralised agents learn along a single, non-episodic run of the empirical system. We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases. We provide the order of the difference in these bounds in terms of network structure and number of communication rounds, and also contribute a policy-update stability guarantee. We discuss how the sample guarantees of the three theoretical algorithms do not actually result in practical convergence. We therefore show that in practical settings where the theoretical parameters are not observed (leading to poor estimation of the Q-function), our communication scheme significantly accelerates convergence over the independent case (and sometimes even the centralised case), without relying on the assumption of a centralised learner. We contribute further practical enhancements to all three theoretical algorithms, allowing us to present their first empirical demonstrations. Our experiments confirm that we can remove several of the theoretical assumptions of the algorithms, and display the empirical convergence benefits brought by our new networked communication. We additionally show that the networked approach has significant advantages, over both the centralised and independent alternatives, in terms of robustness to unexpected learning failures and to changes in population size.
A Family of Switching Pursuit Strategies for a Multi-Pursuer Single-Evader Game
This paper introduces a new family of pursuit strategies for multi-pursuer single-evader games in a planar environment. They leverage conditions under which the minimum-time solution of the game becomes equivalent to that of a suitable two-pursuer single-evader game. This enables the design of strategies in which the pursuers first aim to meet such conditions, and then transition to a two-pursuer game once they are satisfied. As a consequence, naive strategies that are in general unsuccessful, can be turned into winning strategies by switching to the appropriate two-pursuer game. Moreover, it is shown via numerical simulations that the switching mechanism significantly enhances the performance of existing pursuit algorithms, like those based on Voronoi partitions.
MPC using mixed-integer programming for aquifer thermal energy storages
Aquifer thermal energy storages (ATES) are used to temporally store thermal energy in groundwater saturated aquifers. Typically, two storages are combined, one for heat and one for cold, to support heating and cooling of buildings. This way, the use of classical fossil fuel-based heating, ventilation, and air conditioning can be significantly reduced. Exploiting the benefits of ATES beyond "seasonal" heating in winter and cooling in summer as well as meeting legislative restrictions requires sophisticated control. We propose a tailored model predictive control (MPC) scheme for the sustainable operation of ATES systems, which mainly builds on a novel model and objective function. The new approach leads to a mixed-integer quadratic program. Its performance is evaluated on real data from an ATES system in Belgium.
On the Feedback Law in Stochastic Optimal Nonlinear Control
We consider the problem of nonlinear stochastic optimal control. This problem is thought to be fundamentally intractable owing to Bellman's "curse of dimensionality". We present a result that shows that repeatedly solving an open-loop deterministic problem from the current state with progressively shorter horizons, similar to Model Predictive Control (MPC), results in a feedback policy that is $O(\epsilon^4)$ near to the true global stochastic optimal policy, where $\epsilon$ is a perturbation parameter modulating the noise. We also show that the optimal deterministic feedback problem has a perturbation structure such that higher-order terms of the feedback law do not affect lower-order terms and that this structure is lost in the optimal stochastic feedback problem. Consequently, solving the Stochastic Dynamic Programming problem is highly susceptible to noise, even in low dimensional problems, and in practice, the MPC-type feedback law offers superior performance even for high noise levels.
comment: arXiv admin note: substantial text overlap with arXiv:2002.10505, arXiv:2002.09478
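The shrinking-horizon replanning described in this abstract can be illustrated on a toy problem. The sketch below is not the paper's system or algorithm. It assumes a hypothetical scalar integrator x_{t+1} = x_t + u_t + eps * w_t with cost sum(u_t^2) + q * x_T^2, chosen only because its deterministic open-loop problem has a closed-form solution. The names `open_loop_first_control` and `run_shrinking_horizon` are illustrative.

```python
import random


def open_loop_first_control(x, steps_left, q):
    """First control of the deterministic open-loop optimum.

    Minimizes sum(u_k^2) + q * x_N^2 for x_{k+1} = x_k + u_k over
    `steps_left` steps. The optimum spreads the effort evenly, so every
    planned control equals -q * x / (1 + q * steps_left).
    """
    return -q * x / (1.0 + q * steps_left)


def run_shrinking_horizon(x0, horizon, q, eps, seed=0):
    """Re-solve the noise-free problem from the current state at every step."""
    rng = random.Random(seed)
    x = x0
    for t in range(horizon):
        # Plan over the remaining horizon, apply only the first control.
        u = open_loop_first_control(x, horizon - t, q)
        # Noise (scaled by eps) enters the plant, never the planner.
        x = x + u + eps * rng.gauss(0.0, 1.0)
    return x
```

With eps = 0 the closed loop reproduces the original open-loop plan. With eps > 0 each replan starts from the perturbed state, which is what turns the open-loop solutions into a feedback law.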
Nationally Scalable Hydrogen Fueling Infrastructure Deployment: A Megaregion Analysis and Optimization Approach
Decarbonizing regional and long-haul freight faces challenges due to the limitations of battery-electric vehicles and infrastructure. Hydrogen fuel cell medium- and heavy-duty vehicles (MHDVs) present a promising alternative, aligning with the Department of Energy's decarbonization goals. Historically, alternative fuels like compressed natural gas and propane gas have seen slow adoption due to infrastructure barriers. To prevent similar setbacks, planning for zero-emission hydrogen fueling infrastructure is critical. This research develops plans for affordable and accessible hydrogen refueling stations, supporting the decarbonized freight system and benefiting underserved and rural communities by improving air quality, reducing noise pollution, and enhancing energy resilience. It provides a blueprint for replacing diesel in Class 8 trucks with hydrogen fueling solutions, focusing on the Texas Triangle Megaregion (I-45, I-35, I-10), the I-10 corridor between San Antonio, TX, and Los Angeles, CA, and the I-5/CA-99 corridors between Los Angeles and San Francisco. This area accounts for ~8.5% of U.S. heavy-duty freight volume. Using the OR-AGENT (Optimal Regional Architecture Generation for Electrified National Transport) framework, the study analyzes vehicles, freight networks, and energy systems. The framework integrates data on freight mobility, traffic, weather, and energy pathways to deliver optimized powertrain architectures and hydrogen fueling infrastructure deployment. It assesses all vehicle origin-destination pairs and feasible fueling station locations, using a genetic algorithm to identify the minimum number and optimal locations of hydrogen stations. It also determines fuel schedules and quantities, ensuring no vehicle is stranded. A deployment roadmap outlines strategic hydrogen refueling infrastructure rollout across multiple adoption scenarios.
Symbolic Regression on Sparse and Noisy Data with Gaussian Processes
In this paper, we address the challenge of deriving dynamical models from sparse and noisy data. High-quality data is crucial for symbolic regression algorithms; limited and noisy data can present modeling challenges. To overcome this, we combine Gaussian process regression with a sparse identification of nonlinear dynamics (SINDy) method to denoise the data and identify nonlinear dynamical equations. Our approach, GPSINDy, offers improved robustness with sparse, noisy data compared to SINDy alone. We demonstrate its effectiveness on simulation data from Lotka-Volterra and unicycle models and hardware data from an NVIDIA JetRacer system. We show more than 50% improvement over SINDy and other baselines in predicting future trajectories from noise-corrupted and sparse 5 Hz data.
comment: Submitted to ACC 2025
Hybrid System Stability Analysis of Multi-Lane Mixed-Autonomy Traffic
Autonomous vehicles (AVs) hold vast potential to enhance transportation systems by reducing congestion, improving safety, and lowering emissions. AV controls lead to emergent traffic phenomena; one such intriguing phenomenon is traffic breaks (rolling roadblocks), where a single AV efficiently stabilizes multiple lanes through frequent lane switching, similar to highway patrol officers weaving across multiple lanes during difficult traffic conditions. While previous theoretical studies focus on single-lane mixed-autonomy systems, this work proposes a stability analysis framework for multi-lane systems under AV controls. Casting this problem into the hybrid system paradigm, the proposed analysis integrates continuous vehicle dynamics and discrete jumps from AV lane-switches. By examining the influence of the lane-switch frequency on the system's stability, the analysis offers a principled explanation of the traffic break phenomenon, and further discovers opportunities for less-intrusive traffic smoothing by employing less frequent lane-switching. The analysis further facilitates the design of traffic-aware AV lane-switch strategies to enhance system stability. Numerical analysis reveals a strong alignment between the theory and simulation, validating the effectiveness of the proposed stability framework in analyzing multi-lane mixed-autonomy traffic systems.
Practical identification approach for the actuation dynamics of autonomous surface vehicles with minimal instrumentation: extended version
A practical method for identifying the propeller model and inertia matrix of a marine Autonomous Surface Vehicle (ASV) is proposed in this work. Special attention is paid to limiting the instrumentation requirements. Based on a generic grey-box dynamic model of the considered catamaran-shaped ASV architecture, the static/dynamic behaviour of both propellers and the vessel dynamics are jointly estimated using only measurements of position, heading, and the propellers' pulse width modulation (PWM) signals. No accelerometer is required. Two distinct grey-box configurations involving either a static polynomial or a dynamic model of each propeller are proposed and compared. The resulting ASV identification methodology is shown to provide insight into the inertial characteristics of the whole vessel, which are key enablers in the development of autonomous navigation and control systems. Model validation was performed using data collected from the reported experiments. Model prediction errors related to both linear velocities and yaw rate are evaluated and compared based on given metrics. The results underscore the robustness and accuracy of the identified models in capturing the essential dynamics of the ASV, with a determination coefficient that consistently exceeds 0.94 for all estimated velocities.
Multiagent Systems
Agent-based modeling for realistic reproduction of human mobility and contact behavior to evaluate test and isolation strategies in epidemic infectious disease spread
Agent-based models have proven to be useful tools in supporting decision-making processes in different application domains. The advent of modern computers and supercomputers has enabled these bottom-up approaches to realistically model human mobility and contact behavior. The COVID-19 pandemic showcased the urgent need for detailed and informative models that can answer research questions on transmission dynamics. We present a sophisticated agent-based model to simulate the spread of respiratory diseases. The model is highly modularized and can be used on various scales, from a small collection of buildings up to cities or countries. Although it is not the focus of this paper, the model has undergone performance engineering on a single core and provides an efficient intra- and inter-simulation parallelization for time-critical decision-making processes. To allow research questions to be answered at individual-level resolution, nonpharmaceutical intervention strategies such as face masks or venue closures can be implemented for particular locations or agents. In particular, we allow for sophisticated testing and isolation strategies to study the effects of minimally invasive infectious disease mitigation. With realistic human mobility patterns for the region of Brunswick, Germany, we study the effects of different interventions between March 1 and May 30, 2021 in the SARS-CoV-2 pandemic. Our analyses suggest that symptom-independent testing has limited impact on the mitigation of disease dynamics if the dark figure in symptomatic cases is high. Furthermore, we found that quarantine length is more important than quarantine efficiency but that, with sufficient symptomatic control, short quarantines can also have a substantial effect.
comment: 35 pages, 13 figures, to be submitted to Elsevier
Strategic Classification With Externalities
We propose a new variant of the strategic classification problem: a principal reveals a classifier, and $n$ agents report their (possibly manipulated) features to be classified. Motivated by real-world applications, our model crucially allows the manipulation of one agent to affect another; that is, it explicitly captures inter-agent externalities. The principal-agent interactions are formally modeled as a Stackelberg game, with the resulting agent manipulation dynamics captured as a simultaneous game. We show that under certain assumptions, the pure Nash Equilibrium of this agent manipulation game is unique and can be efficiently computed. Leveraging this result, PAC learning guarantees are established for the learner: informally, we show that it is possible to learn classifiers that minimize loss on the distribution, even when a random number of agents are manipulating their way to a pure Nash Equilibrium. We also comment on the optimization of such classifiers through gradient-based approaches. This work sets the theoretical foundations for a more realistic analysis of classifiers that are robust against multiple strategic actors interacting in a common environment.
Dynamic Programming based Local Search approaches for Multi-Agent Path Finding problems on Directed Graphs
Among sub-optimal Multi-Agent Path Finding (MAPF) solvers, rule-based algorithms are particularly appealing since they are complete. Even in crowded scenarios, they allow finding a feasible solution that brings each agent to its target, preventing deadlock situations. However, rule-based algorithms generally provide much longer solutions than the shortest one. The main contribution of this paper is a new local search procedure for improving a known feasible solution. We start from a feasible sub-optimal solution and perform a local search in a neighborhood of this solution. If we are able to find a shorter solution, we repeat this procedure until the solution cannot be shortened anymore. At the end, we obtain a solution that is still sub-optimal, but generally of much better quality than the initial one. We propose two different local search policies. In the first, we explore all paths in which the agents' positions remain in a neighborhood of the corresponding positions of the reference solution. In the second, we set an upper limit on the number of agents that can change their path with respect to the reference solution. These two policies can also be alternated. We explore the neighborhoods by dynamic programming. The fact that our search is local is fundamental in terms of time complexity. Indeed, if the dynamic programming approach is applied to the full MAPF problem, the number of explored states grows exponentially with the number of agents. Instead, the introduction of a locality constraint allows exploring the neighborhoods in a time that grows polynomially with respect to the number of agents.
comment: arXiv admin note: text overlap with arXiv:2304.01765
Benchmarking Agentic Workflow Generation
Large Language Models (LLMs), with their exceptional ability to handle a wide range of tasks, have driven significant advancements in tackling reasoning and planning tasks, wherein decomposing complex problems into executable workflows is a crucial step in this process. Existing workflow evaluation frameworks either focus solely on holistic performance or suffer from limitations such as restricted scenario coverage, simplistic workflow structures, and lax evaluation standards. To this end, we introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. Additionally, we present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms to accurately quantify the LLM agent's workflow generation capabilities. Through comprehensive evaluations across different types of LLMs, we discover distinct gaps between the sequence planning capabilities and graph planning capabilities of LLM agents, with even GPT-4 exhibiting a gap of around 15%. We also train two open-source models and evaluate their generalization abilities on held-out tasks. Furthermore, we observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference. Code and dataset will be available at https://github.com/zjunlp/WorFBench.
comment: Work in progress
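The idea of scoring a generated workflow by subsequence matching can be sketched as follows. The abstract does not specify how WorFEval matches nodes or aggregates scores, so this is only an assumed, simplified variant: nodes match by exact string equality, the longest common subsequence (LCS) counts the steps recovered in the right order, and an F1 score combines precision and recall. The function names `lcs_length` and `chain_f1` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def chain_f1(predicted, gold):
    """F1 between a predicted and a gold step sequence, based on the LCS."""
    if not predicted or not gold:
        return 0.0
    matched = lcs_length(predicted, gold)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)  # share of predicted steps that match
    recall = matched / len(gold)          # share of gold steps recovered
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction that contains every gold step in order plus one spurious step keeps full recall but loses precision.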
A Hate Speech Moderated Chat Application: Use Case for GDPR and DSA Compliance
The detection of hate speech or toxic content online is a complex and sensitive issue. While the identification itself is highly dependent on the context of the situation, sensitive personal attributes such as age, language, and nationality are rarely available due to privacy concerns. Additionally, platforms struggle with a wide range of local jurisdictions regarding online hate speech and the evaluation of content based on their internal ethical norms. This research presents a novel approach that demonstrates a GDPR-compliant application capable of implementing legal and ethical reasoning into the content moderation process. The application increases the explainability of moderation decisions by utilizing user information. Two use cases fundamental to online communication are presented and implemented using technologies such as GPT-3.5, Solid Pods, and the rule language Prova. The first use case demonstrates the scenario of a platform aiming to protect adolescents from potentially harmful content by limiting the ability to post certain content when minors are present. The second use case aims to identify and counter problematic statements online by providing counter hate speech. The counter hate speech is generated using personal attributes to appeal to the user. This research lays the groundwork for future DSA compliance of online platforms. The work proposes a novel approach to reason within different legal and ethical definitions of hate speech and plan the fitting counter hate speech. Overall, the platform provides a fitted protection to users and a more explainable and individualized response. The hate speech detection service, the chat platform, and the reasoning in Prova are discussed, and the potential benefits for content moderation and algorithmic hate speech detection are outlined. A selection of important aspects for DSA compliance is outlined.
CE-MRS: Contrastive Explanations for Multi-Robot Systems
As the complexity of multi-robot systems grows to incorporate a greater number of robots, more complex tasks, and longer time horizons, the solutions to such problems often become too complex to be fully intelligible to human users. In this work, we introduce an approach for generating natural language explanations that justify the validity of the system's solution to the user, or else aid the user in correcting any errors that led to a suboptimal system solution. Toward this goal, we first contribute a generalizable formalism of contrastive explanations for multi-robot systems, and then introduce a holistic approach to generating contrastive explanations for multi-robot scenarios that selectively incorporates data from multi-robot task allocation, scheduling, and motion-planning to explain system behavior. Through user studies with human operators we demonstrate that our integrated contrastive explanation approach leads to significant improvements in user ability to identify and solve system errors, leading to significant improvements in overall multi-robot team performance.
comment: Accepted to IEEE Robotics and Automation Letters
Exploring Natural Language-Based Strategies for Efficient Number Learning in Children through Reinforcement Learning
This paper investigates how children learn numbers using the framework of reinforcement learning (RL), with a focus on the impact of language instructions. The motivation for using reinforcement learning stems from its parallels with psychological learning theories in controlled environments. By using state-of-the-art deep reinforcement learning models, we simulate and analyze the effects of various forms of language instructions on number acquisition. Our findings indicate that certain linguistic structures more effectively improve numerical comprehension in RL agents. Additionally, our model predicts optimal sequences for presenting numbers to RL agents which enhance their speed of learning. This research provides valuable insights into the interplay between language and numerical cognition, with implications for both educational strategies and the development of artificial intelligence systems designed to support early childhood learning.
Networked Communication for Decentralised Agents in Mean-Field Games
We introduce networked communication to the mean-field game framework, in particular to oracle-free settings where $N$ decentralised agents learn along a single, non-episodic run of the empirical system. We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases. We provide the order of the difference in these bounds in terms of network structure and number of communication rounds, and also contribute a policy-update stability guarantee. We discuss how the sample guarantees of the three theoretical algorithms do not actually result in practical convergence. We therefore show that in practical settings where the theoretical parameters are not observed (leading to poor estimation of the Q-function), our communication scheme significantly accelerates convergence over the independent case (and sometimes even the centralised case), without relying on the assumption of a centralised learner. We contribute further practical enhancements to all three theoretical algorithms, allowing us to present their first empirical demonstrations. Our experiments confirm that we can remove several of the theoretical assumptions of the algorithms, and display the empirical convergence benefits brought by our new networked communication. We additionally show that the networked approach has significant advantages, over both the centralised and independent alternatives, in terms of robustness to unexpected learning failures and to changes in population size.
Active Scout: Multi-Target Tracking Using Neural Radiance Fields in Dense Urban Environments IROS
We study pursuit-evasion games in highly occluded urban environments, e.g. tall buildings in a city, where a scout (quadrotor) tracks multiple dynamic targets on the ground. We show that we can build a neural radiance field (NeRF) representation of the city -- online -- using RGB and depth images from different vantage points. This representation is used to calculate the information gain to both explore unknown parts of the city and track the targets -- thereby giving a completely first-principles approach to actively tracking dynamic targets. We demonstrate, using a custom-built simulator using Open Street Maps data of Philadelphia and New York City, that we can explore and locate 20 stationary targets within 300 steps. This is slower than a greedy baseline, which does not use active perception. But for dynamic targets that actively hide behind occlusions, we show that our approach maintains, at worst, a tracking error of 200m; the greedy baseline can have a tracking error as large as 600m. We observe a number of interesting properties in the scout's policies, e.g., it periodically switches its attention to track a different target. As the quality of the NeRF representation improves over time, the scout also becomes better at target tracking.
comment: 9 pages, 10 figures, 2 tables, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024
Robotics
Neural Circuit Architectural Priors for Quadruped Locomotion
Learning-based approaches to quadruped locomotion commonly adopt generic policy architectures like fully connected MLPs. As such architectures contain few inductive biases, it is common in practice to incorporate priors in the form of rewards, training curricula, imitation data, or trajectory generators. In nature, animals are born with priors in the form of their nervous system's architecture, which has been shaped by evolution to confer innate ability and efficient learning. For instance, a horse can walk within hours of birth and can quickly improve with practice. Such architectural priors can also be useful in ANN architectures for AI. In this work, we explore the advantages of a biologically inspired ANN architecture for quadruped locomotion based on neural circuits in the limbs and spinal cord of mammals. Our architecture achieves good initial performance and comparable final performance to MLPs, while using less data and orders of magnitude fewer parameters. Our architecture also exhibits better generalization to task variations, even admitting deployment on a physical robot without standard sim-to-real methods. This work shows that neural circuits can provide valuable architectural priors for locomotion and encourages future work in other sensorimotor skills.
VIRT: Vision Instructed Transformer for Robotic Manipulation
Robotic manipulation, owing to its multi-modal nature, often faces significant training ambiguity, necessitating explicit instructions to clearly delineate the manipulation details in tasks. In this work, we highlight that vision instruction is naturally more comprehensible to recent robotic policies than the commonly adopted text instruction, as these policies are born with some vision understanding ability like human infants. Building on this premise and drawing inspiration from cognitive science, we introduce the robotic imagery paradigm, which realizes large-scale robotic data pre-training without text annotations. Additionally, we propose the robotic gaze strategy that emulates the human eye gaze mechanism, thereby guiding subsequent actions and focusing the attention of the policy on the manipulated object. Leveraging these innovations, we develop VIRT, a fully Transformer-based policy. We design comprehensive tasks using both a physical robot and simulated environments to assess the efficacy of VIRT. The results indicate that VIRT can complete very competitive tasks like "opening the lid of a tightly sealed bottle", and the proposed techniques boost the success rates of the baseline policy on diverse challenging tasks from nearly 0% to more than 65%.
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making NeurIPS 2024
We aim to evaluate Large Language Models (LLMs) for embodied decision making. While a significant body of work has been leveraging LLMs for decision making in embodied environments, we still lack a systematic understanding of their performance because they are usually applied in different domains, for different purposes, and built based on different inputs and outputs. Furthermore, existing evaluations tend to rely solely on a final success rate, making it difficult to pinpoint what ability is missing in LLMs and where the problem lies, which in turn blocks embodied agents from leveraging LLMs effectively and selectively. To address these limitations, we propose a generalized interface (Embodied Agent Interface) that supports the formalization of various types of tasks and input-output specifications of LLM-based modules. Specifically, it allows us to unify 1) a broad set of embodied decision-making tasks involving both state and temporally extended goals, 2) four commonly-used LLM-based modules for decision making: goal interpretation, subgoal decomposition, action sequencing, and transition modeling, and 3) a collection of fine-grained metrics which break down evaluation into various types of errors, such as hallucination errors, affordance errors, various types of planning errors, etc. Overall, our benchmark offers a comprehensive assessment of LLMs' performance for different subtasks, pinpointing the strengths and weaknesses in LLM-powered embodied AI systems, and providing insights for effective and selective use of LLMs in embodied decision making.
comment: Accepted for oral presentation at NeurIPS 2024 in the Datasets and Benchmarks track
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology ICLR 2025
Developing agents capable of navigating to a target location based on language instructions and visual information, known as vision-language navigation (VLN), has attracted widespread interest. Most research has focused on ground-based agents, while UAV-based VLN remains relatively underexplored. Recent efforts in UAV vision-language navigation predominantly adopt ground-based VLN settings, relying on predefined discrete action spaces and neglecting the inherent disparities in agent movement dynamics and the complexity of navigation tasks between ground and aerial environments. To address these disparities and challenges, we propose solutions from three perspectives: platform, benchmark, and methodology. To enable realistic UAV trajectory simulation in VLN tasks, we propose the OpenUAV platform, which features diverse environments, realistic flight control, and extensive algorithmic support. We further construct a target-oriented VLN dataset consisting of approximately 12k trajectories on this platform, serving as the first dataset specifically designed for realistic UAV VLN tasks. To tackle the challenges posed by complex aerial environments, we propose an assistant-guided UAV object search benchmark called UAV-Need-Help, which provides varying levels of guidance information to help UAVs better accomplish realistic VLN tasks. We also propose a UAV navigation LLM that, given multi-view images, task descriptions, and assistant instructions, leverages the multimodal understanding capabilities of the MLLM to jointly process visual and textual information, and performs hierarchical trajectory generation. The evaluation results of our method significantly outperform the baseline models, while there remains a considerable gap between our results and those achieved by human operators, underscoring the challenge presented by the UAV-Need-Help task.
comment: Under review as a conference paper at ICLR 2025
FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation
We introduce a novel approach to manipulating articulated objects with ambiguities, such as opening a door, in which multi-modality and occlusions create ambiguities about the opening side and direction. Multi-modality occurs when the method to open a fully closed door (push, pull, slide) is uncertain, or the side from which it should be opened is uncertain. Occlusions obscure the door's shape from certain angles, creating further ambiguities while the occlusion lasts. To tackle these challenges, we propose a history-aware diffusion network that models the multi-modal distribution of the articulated object and uses history to disambiguate actions and make stable predictions under occlusions. Experiments and analysis demonstrate the state-of-the-art performance of our method and, specifically, improvements in ambiguity-caused failure modes. Our project website is available at https://flowbothd.github.io/.
comment: Accepted to CoRL 2024
RM4D: A Combined Reachability and Inverse Reachability Map for Common 6-/7-axis Robot Arms by Dimensionality Reduction to 4D ICRA 2025
Knowledge of a manipulator's workspace is fundamental for a variety of tasks including robot design, grasp planning and robot base placement. Consequently, workspace representations are well studied in robotics. Two important representations are reachability maps and inverse reachability maps. The former predicts whether a given end-effector pose is reachable from where the robot currently is, and the latter suggests suitable base positions for a desired end-effector pose. Typically, the reachability map is built by discretizing the 6D space containing the robot's workspace and determining, for each cell, whether it is reachable or not. The reachability map is subsequently inverted to build the inverse map. This is a cumbersome process which restricts the applications of such maps. In this work, we exploit commonalities of existing six- and seven-axis robot arms to reduce the dimension of the discretization from 6D to 4D. We propose Reachability Map 4D (RM4D), a map that only requires a single 4D data structure for both forward and inverse queries. This gives a much more compact map that can be constructed an order of magnitude faster than existing maps, with no inversion overheads and no loss in accuracy. Our experiments showcase the usefulness of RM4D for grasp planning with a mobile manipulator.
comment: Submitted to ICRA 2025. See project page: https://mrudorfer.github.io/rm4d/
Control System Design and Experiments for Autonomous Underwater Helicopter Docking Procedure Based on Acoustic-inertial-optical Guidance
A control system structure for the underwater docking procedure of an Autonomous Underwater Helicopter (AUH) is proposed in this paper, which utilizes acoustic-inertial-optical guidance. Unlike conventional Autonomous Underwater Vehicles (AUVs), the maneuverability requirements for AUHs are more stringent during the docking procedure, requiring the vehicle to remain stationary or have minimal horizontal movement while moving vertically. The docking procedure is divided into two stages, Homing and Landing, each utilizing different guidance methods. Additionally, a segmented aligning strategy operating at various altitudes and a linear velocity decision are both adopted in the Landing stage. Due to the unique structure of the Subsea Docking System (SDS), the AUH is required to dock onto the SDS in a fixed orientation with specific attitude and altitude. Therefore, a particular criterion is proposed to determine whether the AUH has successfully docked onto the SDS. Furthermore, the effectiveness and robustness of the proposed control method in the AUH's docking procedure are demonstrated through pool experiments and sea trials.
Combining Planning and Diffusion for Mobility with Unknown Dynamics ICRA 2025
Manipulation of large objects over long horizons (such as carts in a warehouse) is an essential skill for deployable robotic systems. Large objects require mobile manipulation which involves simultaneous manipulation, navigation, and movement with the object in tow. In many real-world situations, object dynamics are incredibly complex, such as the interaction of an office chair (with a rotating base and five caster wheels) and the ground. We present a hierarchical algorithm for long-horizon robot manipulation problems in which the dynamics are partially unknown. We observe that diffusion-based behavior cloning is highly effective for short-horizon problems with unknown dynamics, so we decompose the problem into an abstract high-level, obstacle-aware motion-planning problem that produces a waypoint sequence. We use a short-horizon, relative-motion diffusion policy to achieve the waypoints in sequence. We train mobile manipulation policies on a Spot robot that has to push and pull an office chair. Our hierarchical manipulation policy performs consistently better, especially when the horizon increases, than a diffusion policy trained on long-horizon demonstrations or motion planning assuming a rigidly-attached object, succeeding in 8 out of 10 runs versus 0 and 5 respectively. Importantly, our learned policy generalizes to new layouts, grasps, chairs, and flooring that induces more friction, without any further training, showing promise for other complex mobile manipulation problems. Project Page: https://yravan.github.io/plannerorderedpolicy/
comment: Submitted to ICRA 2025
Safe Reinforcement Learning Filter for Multicopter Collision-Free Tracking under disturbances
This paper proposes a safe reinforcement learning filter (SRLF) to realize multicopter collision-free trajectory tracking with input disturbance. A novel robust control barrier function (RCBF), together with its analysis techniques, is introduced to avoid collisions under unknown disturbances during tracking. To ensure the system state remains within the safe set, the RCBF gain is designed into the control action. A safety filter is introduced to transform unsafe reinforcement learning (RL) control inputs into safe ones, allowing RL training to proceed without explicitly considering safety constraints. The SRLF obtains rigorously guaranteed safe control actions by solving a quadratic programming (QP) problem that incorporates forward invariance of the RCBF and input saturation constraints. Both simulation and real-world experiments on multicopters demonstrate the effectiveness and excellent performance of SRLF in achieving collision-free tracking under input disturbances and saturation.
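The general shape of a barrier-function safety filter can be sketched as below. This is a generic, simplified control barrier function filter, not the paper's RCBF: it assumes single-integrator dynamics p' = u, one circular obstacle, no disturbance, and no input saturation. Under those assumptions the QP "minimize ||u - u_rl||^2 subject to a . u <= b" has a single affine constraint and therefore a closed-form solution. All names (`safety_filter`, `alpha`) are illustrative.

```python
def safety_filter(u_rl, p, center, radius, alpha=1.0):
    """Project an RL action onto the set of actions allowed by the barrier.

    Barrier: h(p) = ||p - center||^2 - radius^2 (h >= 0 means safe).
    Condition for p' = u: dh/dt + alpha * h >= 0,
    i.e. a . u <= b with a = -2 (p - center) and b = alpha * h.
    """
    d = [pi - ci for pi, ci in zip(p, center)]
    h = sum(di * di for di in d) - radius ** 2
    a = [-2.0 * di for di in d]
    b = alpha * h
    violation = sum(ai * ui for ai, ui in zip(a, u_rl)) - b
    norm_sq = sum(ai * ai for ai in a)
    if violation <= 0.0 or norm_sq == 0.0:
        return list(u_rl)  # already satisfies the constraint: pass through
    # Closed-form QP solution: shift u_rl along `a` onto the boundary.
    scale = violation / norm_sq
    return [ui - scale * ai for ui, ai in zip(u_rl, a)]
```

An action heading toward the obstacle is slowed along the approach direction only, while an action moving away passes through unchanged. With several constraints, such as the input saturation mentioned in the abstract, a numerical QP solver would be needed instead of this projection.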
A Safety Modulator Actor-Critic Method in Model-Free Safe Reinforcement Learning and Application in UAV Hovering
This paper proposes a safety modulator actor-critic (SMAC) method to address safety constraints and overestimation mitigation in model-free safe reinforcement learning (RL). A safety modulator is developed to satisfy safety constraints by modulating actions, allowing the policy to ignore safety constraints and focus on maximizing reward. Additionally, a distributional critic with a theoretical update rule for SMAC is proposed to mitigate the overestimation of Q-values with safety constraints. Both simulation and real-world experiments on Unmanned Aerial Vehicle (UAV) hovering confirm that SMAC can effectively maintain safety constraints and outperform mainstream baseline algorithms.
Dynamic Neural Potential Field: Online Trajectory Optimization in Presence of Moving Obstacles
We address the task of local trajectory planning for a mobile robot in the presence of static and dynamic obstacles. The local trajectory is obtained as a numerical solution of the Model Predictive Control (MPC) problem. Collision avoidance may be provided by adding a repulsive potential of the obstacles to the cost function of MPC. We develop an approach in which the repulsive potential is estimated by a neural model. We propose and explore three possible strategies for handling dynamic obstacles. First, an environment with dynamic obstacles is considered as a sequence of static environments. Second, the neural model predicts a sequence of repulsive potentials at once. Third, the neural model predicts future repulsive potentials step by step in autoregressive mode. We implement these strategies and compare them with CIAO* and MPPI using the BenchMR framework. The first two strategies showed higher performance than CIAO* and MPPI while preserving safety constraints. The third strategy was a bit slower; however, it still satisfies the time limits. We deploy our approach on a Husky UGV mobile platform, which moves through office corridors under the proposed MPC local trajectory planner. The code and trained models are available at https://github.com/CognitiveAISystems/Dynamic-Neural-Potential-Field.
Discrete time model predictive control for humanoid walking with step adjustment
This paper presents a Discrete-Time Model Predictive Controller (MPC) for humanoid walking with online footstep adjustment. The proposed controller utilizes a hierarchical control approach. The high-level controller uses a low-dimensional Linear Inverted Pendulum Model (LIPM) to determine desired foot placement and Center of Mass (CoM) motion, to prevent falls while maintaining the desired velocity. A Task Space Controller (TSC) then tracks the desired motion obtained from the high-level controller, exploiting the whole-body dynamics of the humanoid. Our approach differs from existing MPC methods for walking pattern generation by not relying on a predefined foot-plan or a reference center of pressure (CoP) trajectory. The overall approach is tested in simulation on a torque-controlled humanoid robot. Results show that the proposed control approach generates stable walking and prevents falls under push disturbances.
comment: 6 pages, 17 figures, 1 table
Collective perception for tracking people with a robot swarm ICRA
Swarm perception refers to the ability of a robot swarm to utilize the perception capabilities of each individual robot, forming a collective understanding of the environment. Their distributed nature enables robot swarms to continuously monitor dynamic environments by maintaining a constant presence throughout the space. In this study, we present a preliminary experiment on the collective tracking of people using a robot swarm. The experiment was conducted in simulation across four different office environments, with swarms of varying sizes. The robots were provided with images sampled from a dataset of real-world office environment pictures. We measured the time distribution required for a robot to detect a person changing location and to propagate this information to increasing fractions of the swarm. The results indicate that robot swarms show significant promise in monitoring dynamic environments.
comment: Presented at ICRA@40, Rotterdam
OmniPose6D: Towards Short-Term Object Pose Tracking in Dynamic Scenes from Monocular RGB
To address the challenge of short-term object pose tracking in dynamic environments with monocular RGB input, we introduce a large-scale synthetic dataset OmniPose6D, crafted to mirror the diversity of real-world conditions. We additionally present a benchmarking framework for a comprehensive comparison of pose tracking algorithms. We propose a pipeline featuring an uncertainty-aware keypoint refinement network, employing probabilistic modeling to refine pose estimation. Comparative evaluations demonstrate that our approach achieves performance superior to existing baselines on real datasets, underscoring the effectiveness of our synthetic dataset and refinement technique in enhancing tracking precision in dynamic contexts. Our contributions set a new precedent for the development and assessment of object pose tracking methodologies in complex scenes.
comment: 13 pages, 9 figures
Autonomous localization of multiple ionizing radiation sources using miniature single-layer Compton cameras onboard a group of micro aerial vehicles IROS
A novel method for autonomous localization of multiple sources of gamma radiation using a group of Micro Aerial Vehicles (MAVs) is presented in this paper. The method utilizes an extremely lightweight (44 g) Compton camera MiniPIX TPX3. The compact size of the detector allows for deployment onboard safe and agile small-scale Unmanned Aerial Vehicles (UAVs). The proposed radiation mapping approach fuses measurements from multiple distributed Compton camera sensors to accurately estimate the positions of multiple radioactive sources in real time. Unlike commonly used intensity-based detectors, the Compton camera reconstructs the set of possible directions towards a radiation source from just a single ionizing particle. Therefore, the proposed approach can localize radiation sources without having to estimate the gradient of a radiation field or contour lines, which require longer measurements. The instant estimation is able to fully exploit the potential of highly mobile MAVs. The radiation mapping method is combined with an active search strategy, which coordinates the future actions of the MAVs in order to improve the quality of the estimate of the sources' positions, as well as to explore the area of interest faster. The proposed solution is evaluated in simulation and real world experiments with multiple Cesium-137 radiation sources.
comment: International Conference on Intelligent Robots and Systems (IROS) 2024
M${}^{3}$Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes
We propose M^3Bench, a new benchmark for whole-body motion generation for mobile manipulation tasks. Given a 3D scene context, M^3Bench requires an embodied agent to understand its configuration, environmental constraints and task objectives, then generate coordinated whole-body motion trajectories for object rearrangement tasks. M^3Bench features 30k object rearrangement tasks across 119 diverse scenes, providing expert demonstrations generated by our newly developed M^3BenchMaker. This automatic data generation tool produces coordinated whole-body motion trajectories from high-level task instructions, requiring only basic scene and robot information. Our benchmark incorporates various task splits to assess generalization across different dimensions and leverages realistic physics simulation for trajectory evaluation. Through extensive experimental analyses, we reveal that state-of-the-art models still struggle with coordinated base-arm motion while adhering to environment-context and task-specific constraints, highlighting the need to develop new models that address this gap. Through M^3Bench, we aim to facilitate future robotics research towards more adaptive and capable mobile manipulation in diverse, real-world environments.
Task Coordination and Trajectory Optimization for Multi-Aerial Systems via Signal Temporal Logic: A Wind Turbine Inspection Study IROS'24
This paper presents a method for task allocation and trajectory generation in cooperative inspection missions using a fleet of multirotor drones, with a focus on wind turbine inspection. The approach generates safe, feasible flight paths that adhere to time-sensitive constraints and vehicle limitations by formulating an optimization problem based on Signal Temporal Logic (STL) specifications. An event-triggered replanning mechanism addresses unexpected events and delays, while a generalized robustness scoring method incorporates user preferences and minimizes task conflicts. The approach is validated through simulations in MATLAB and Gazebo, as well as field experiments in a mock-up scenario.
comment: 2 pages, Accepted for discussion at the workshop session "Formal methods techniques in robotics systems: Design and control" at IROS'24 in Abu Dhabi, UAE
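As a rough illustration of how an STL specification turns into a number that an optimizer can maximise, the sketch below uses the standard quantitative (robustness) semantics: "always" is a minimum over time, "eventually" is a maximum over time, and conjunction is a minimum. It is not the paper's generalized robustness score. The 1-D trajectory, the goal and obstacle predicates, and all function names are hypothetical.

```python
def rho_always(margins):
    """Robustness of 'always (margin >= 0)': the worst margin over time."""
    return min(margins)


def rho_eventually(margins):
    """Robustness of 'eventually (margin >= 0)': the best margin over time."""
    return max(margins)


def mission_robustness(positions, goal, goal_tol, obstacle, safe_dist):
    """Robustness of (eventually near goal) AND (always clear of obstacle).

    positions : sampled 1-D positions of one vehicle (assumed toy trajectory)
    A positive return value means the specification is satisfied, and its
    magnitude indicates how much margin is left.
    """
    reach = rho_eventually([goal_tol - abs(p - goal) for p in positions])
    avoid = rho_always([abs(p - obstacle) - safe_dist for p in positions])
    return min(reach, avoid)  # conjunction takes the weaker of the two


if __name__ == "__main__":
    trajectory = [0.0, 1.0, 2.0, 3.0]
    print(mission_robustness(trajectory, goal=3.0, goal_tol=0.5,
                             obstacle=1.5, safe_dist=0.4))
```

Because min and max are non-smooth, optimization-based planners commonly replace them with smooth approximations; the plain versions are shown here for readability.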
Pair-VPR: Place-Aware Pre-training and Contrastive Pair Classification for Visual Place Recognition with Vision Transformers
In this work we propose a novel joint training method for Visual Place Recognition (VPR), which simultaneously learns a global descriptor and a pair classifier for re-ranking. The pair classifier can predict whether a given pair of images are from the same place or not. The network only comprises Vision Transformer components for both the encoder and the pair classifier, and both components are trained using their respective class tokens. In existing VPR methods, typically the network is initialized using pre-trained weights from a generic image dataset such as ImageNet. In this work we propose an alternative pre-training strategy, by using Siamese Masked Image Modelling as a pre-training task. We propose a place-aware image sampling procedure from a collection of large VPR datasets for pre-training our model, to learn visual features tuned specifically for VPR. By re-using the Masked Image Modelling encoder and decoder weights in the second stage of training, Pair-VPR can achieve state-of-the-art VPR performance across five benchmark datasets with a ViT-B encoder, along with further improvements in localization recall with larger encoders. The Pair-VPR website is: https://csiro-robotics.github.io/Pair-VPR.
ES-Gaussian: Gaussian Splatting Mapping via Error Space-Based Gaussian Completion
Accurate and affordable indoor 3D reconstruction is critical for effective robot navigation and interaction. Traditional LiDAR-based mapping provides high precision but is costly, heavy, and power-intensive, with limited ability for novel view rendering. Vision-based mapping, while cost-effective and capable of capturing visual data, often struggles with high-quality 3D reconstruction due to sparse point clouds. We propose ES-Gaussian, an end-to-end system using a low-altitude camera and single-line LiDAR for high-quality 3D indoor reconstruction. Our system features Visual Error Construction (VEC) to enhance sparse point clouds by identifying and correcting areas with insufficient geometric detail from 2D error maps. Additionally, we introduce a novel 3DGS initialization method guided by single-line LiDAR, overcoming the limitations of traditional multi-view setups and enabling effective reconstruction in resource-constrained environments. Extensive experimental results on our new Dreame-SR dataset and a publicly available dataset demonstrate that ES-Gaussian outperforms existing methods, particularly in challenging scenarios. The project page is available at https://chenlu-china.github.io/ES-Gaussian/.
comment: Project page: https://chenlu-china.github.io/ES-Gaussian/
Disturbance Observer-based Control Barrier Functions with Residual Model Learning for Safe Reinforcement Learning
Reinforcement learning (RL) agents need to explore their environment to learn optimal behaviors and achieve maximum rewards. However, exploration can be risky when training RL directly on real systems, while simulation-based training introduces the tricky issue of the sim-to-real gap. Recent approaches have leveraged safety filters, such as control barrier functions (CBFs), to penalize unsafe actions during RL training. However, the strong safety guarantees of CBFs rely on a precise dynamic model. In practice, uncertainties always exist, including internal disturbances from the errors of dynamics and external disturbances such as wind. In this work, we propose a new safe RL framework based on disturbance rejection-guarded learning, which allows for an almost model-free RL with an assumed but not necessarily precise nominal dynamic model. We demonstrate our results on the Safety-gym benchmark for Point and Car robots on all tasks where we can outperform state-of-the-art approaches that use only residual model learning or a disturbance observer (DOB). We further validate the efficacy of our framework using a physical F1/10 racing car. Videos: https://sites.google.com/view/res-dob-cbf-rl
Agile Mobility with Rapid Online Adaptation via Meta-learning and Uncertainty-aware MPPI
Modern non-linear model-based controllers require an accurate physics model and model parameters to be able to control mobile robots at their limits. Also, due to surface slipping at high speeds, the friction parameters may continually change (like tire degradation in autonomous racing), and the controller may need to adapt rapidly. Many works derive a task-specific robot model with a parameter adaptation scheme that works well for the task but requires a lot of effort and tuning for each platform and task. In this work, we design a full model-learning-based controller based on meta pre-training that can very quickly adapt using few-shot dynamics data to any wheel-based robot with any model parameters, while also reasoning about model uncertainty. We demonstrate our results in small-scale numeric simulation, the large-scale Unity simulator, and on a medium-scale hardware platform with a wide range of settings. We show that our results are comparable to domain-specific well-engineered controllers, and have excellent generalization performance across all scenarios.
Real-to-Sim Grasp: Rethinking the Gap between Simulation and Real World in Grasp Detection
For 6-DoF grasp detection, simulated data can be scaled up to train more powerful models, but it faces the challenge of the large gap between simulation and the real world. Previous works bridge this gap in a sim-to-real way. However, this approach explicitly or implicitly forces the simulated data to adapt to the noisy real data when training grasp detectors, where the positional drift and structural distortion within the camera noise harm grasp learning. In this work, we propose a Real-to-Sim framework for 6-DoF grasp detection, named R2SGrasp, with the key insight of bridging this gap in a real-to-sim way, which directly bypasses the camera noise in grasp detector training through an inference-time real-to-sim adaptation. To achieve this real-to-sim adaptation, R2SGrasp designs the Real-to-Sim Data Repairer (R2SRepairer) to mitigate the camera noise of real depth maps at the data level, and the Real-to-Sim Feature Enhancer (R2SEnhancer) to enhance real features with precise simulated geometric primitives at the feature level. To endow our framework with generalization ability, we cost-efficiently construct a large-scale simulated dataset to train our grasp detector, which includes 64,000 RGB-D images with 14.4 million grasp annotations. Extensive experiments show that R2SGrasp is powerful and our real-to-sim perspective is effective. The real-world experiments further show the strong generalization ability of R2SGrasp. The project page is available at https://isee-laboratory.github.io/R2SGrasp.
QuadBEV: An Efficient Quadruple-Task Perception Framework via Bird's-Eye-View Representation
Bird's-Eye-View (BEV) perception has become a vital component of autonomous driving systems due to its ability to integrate multiple sensor inputs into a unified representation, enhancing performance in various downstream tasks. However, the computational demands of BEV models pose challenges for real-world deployment in vehicles with limited resources. To address these limitations, we propose QuadBEV, an efficient multitask perception framework that leverages the shared spatial and contextual information across four key tasks: 3D object detection, lane detection, map segmentation, and occupancy prediction. QuadBEV not only streamlines the integration of these tasks using a shared backbone and task-specific heads but also addresses common multitask learning challenges such as learning rate sensitivity and conflicting task objectives. Our framework reduces redundant computations, thereby enhancing system efficiency, making it particularly suited for embedded systems. We present comprehensive experiments that validate the effectiveness and robustness of QuadBEV, demonstrating its suitability for real-world applications.
BiC-MPPI: Goal-Pursuing, Sampling-Based Bidirectional Rollout Clustering Path Integral for Trajectory Optimization
This paper introduces the Bidirectional Clustered MPPI (BiC-MPPI) algorithm, a novel trajectory optimization method aimed at enhancing goal-directed guidance within the Model Predictive Path Integral (MPPI) framework. BiC-MPPI incorporates bidirectional dynamics approximations and a new guide cost mechanism, improving both trajectory planning and goal-reaching performance. By leveraging forward and backward rollouts, the bidirectional approach ensures effective trajectory connections between initial and terminal states, while the guide cost helps discover dynamically feasible paths. Experimental results demonstrate that BiC-MPPI outperforms existing MPPI variants in both 2D and 3D environments, achieving higher success rates and competitive computation times across 900 simulations on a modified BARN dataset for autonomous navigation. GitHub: https://github.com/i-ASL/BiC-MPPI
comment: 7 pages, 1 figure
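For readers unfamiliar with the MPPI framework that BiC-MPPI builds on, the sketch below shows only the basic forward MPPI update: sampled control perturbations are weighted by the exponentiated negative rollout cost and averaged into the nominal control sequence. It does not implement the bidirectional rollouts, clustering, or guide cost described in the abstract. The function names and the temperature parameter `lam` are illustrative assumptions, and the rollout costs are taken as given.

```python
import math


def mppi_weights(costs, lam):
    """Softmin weights exp(-(S_k - min S) / lam), normalised to sum to 1.

    Subtracting the minimum cost leaves the weights unchanged after
    normalisation and avoids numerical underflow.
    """
    s_min = min(costs)
    unnorm = [math.exp(-(s - s_min) / lam) for s in costs]
    total = sum(unnorm)
    return [w / total for w in unnorm]


def mppi_update(u_nominal, noises, costs, lam=1.0):
    """Shift the nominal control sequence by the cost-weighted noise average.

    u_nominal : list of scalar controls, one per time step
    noises    : list of K perturbation sequences, each as long as u_nominal
    costs     : list of K rollout costs, one per perturbed sequence
    """
    weights = mppi_weights(costs, lam)
    return [
        u_nominal[t] + sum(w * eps[t] for w, eps in zip(weights, noises))
        for t in range(len(u_nominal))
    ]


if __name__ == "__main__":
    noises = [[1.0, 1.0], [0.0, 0.0], [-1.0, -1.0]]
    costs = [1.0, 2.0, 3.0]  # the +1 perturbation was the cheapest rollout
    print(mppi_update([0.0, 0.0], noises, costs))
```

A smaller `lam` concentrates the weight on the cheapest rollouts, while a larger one approaches a plain average of the samples.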
Overcoming Autoware-Ubuntu Incompatibility in Autonomous Driving Systems-Equipped Vehicles: Lessons Learned
Autonomous vehicles have been developed rapidly in response to the demand for safety and efficiency in transportation systems. As autonomous vehicles are designed based on open-source operating and computing systems, there are numerous resources aimed at building an operating platform composed of Ubuntu, Autoware, and Robot Operating System (ROS). However, no explicit guidelines exist to help scholars perform trouble-shooting of incompatibilities between the Autoware platform and the Ubuntu operating systems installed in autonomous driving systems-equipped vehicles (i.e., Chrysler Pacifica). The paper presents an overview of integrating the Autoware platform into the autonomous vehicle's interface based on lessons learned from trouble-shooting processes for resolving incompatibility issues. The trouble-shooting processes are presented based on resolving the incompatibility and integration issues of Ubuntu 20.04, Autoware.AI, and ROS Noetic software installed in an autonomous driving systems-equipped vehicle. Specifically, the paper focuses on common incompatibility issues and code-solving protocols involving Python compatibility, Compute Unified Device Architecture (CUDA) installation, Autoware installation, and simulation in Autoware.AI. The objective of the paper is to provide an explicit and detail-oriented presentation of how to address incompatibility issues in an autonomous vehicle's operating interface. The lessons and experience presented in the paper will be useful for researchers who encounter similar issues and can follow up by performing trouble-shooting activities and implementing ADS-related projects in the Ubuntu, Autoware, and ROS operating systems.
Grounding Robot Policies with Visuomotor Language Guidance
Recent advances in the fields of natural language processing and computer vision have shown great potential in understanding the underlying dynamics of the world from large-scale internet data. However, translating this knowledge into robotic systems remains an open challenge, given the scarcity of human-robot interactions and the lack of large-scale datasets of real-world robotic data. Previous robot learning approaches such as behavior cloning and reinforcement learning have shown great capabilities in learning robotic skills from human demonstrations or from scratch in specific environments. However, these approaches often require task-specific demonstrations or designing complex simulation environments, which limits the development of generalizable and robust policies for new settings. Aiming to address these limitations, we propose an agent-based framework for grounding robot policies to the current context, considering the constraints of a current robot and its environment using visuomotor-grounded language guidance. The proposed framework is composed of a set of conversational agents designed for specific roles -- namely, high-level advisor, visual grounding, monitoring, and robotic agents. Given a base policy, the agents collectively generate guidance at run time to shift the action distribution of the base policy towards more desirable future states. We demonstrate that our approach can effectively guide manipulation policies to achieve significantly higher success rates both in simulation and in real-world experiments without the need for additional human demonstrations or extensive exploration. Project videos at https://sites.google.com/view/motorcortex/home.
comment: 19 pages, 6 figures, 1 table
Enabling Novel Mission Operations and Interactions with ROSA: The Robot Operating System Agent
The advancement of robotic systems has revolutionized numerous industries, yet their operation often demands specialized technical knowledge, limiting accessibility for non-expert users. This paper introduces ROSA (Robot Operating System Agent), an AI-powered agent that bridges the gap between the Robot Operating System (ROS) and natural language interfaces. By leveraging state-of-the-art language models and integrating open-source frameworks, ROSA enables operators to interact with robots using natural language, translating commands into actions and interfacing with ROS through well-defined tools. ROSA's design is modular and extensible, offering seamless integration with both ROS1 and ROS2, along with safety mechanisms like parameter validation and constraint enforcement to ensure secure, reliable operations. While ROSA is originally designed for ROS, it can be extended to work with other robotics middleware to maximize compatibility across missions. ROSA enhances human-robot interaction by democratizing access to complex robotic systems, empowering users of all expertise levels with multi-modal capabilities such as speech integration and visual perception. Ethical considerations are thoroughly addressed, guided by foundational principles like Asimov's Three Laws of Robotics, ensuring that AI integration promotes safety, transparency, privacy, and accountability. By making robotic technology more user-friendly and accessible, ROSA not only improves operational efficiency but also sets a new standard for responsible AI use in robotics and potentially future mission operations. This paper introduces ROSA's architecture and showcases initial mock-up operations in JPL's Mars Yard, a laboratory, and a simulation using three different robots. The core ROSA library is available as open-source.
comment: Under review for IEEE Aerospace Conference, 20 pages, 20 figures
LocoVR: Multiuser Indoor Locomotion Dataset in Virtual Reality
Understanding human locomotion is crucial for AI agents such as robots, particularly in complex indoor home environments. Modeling human trajectories in these spaces requires insight into how individuals maneuver around physical obstacles and manage social navigation dynamics. These dynamics include subtle behaviors influenced by proxemics - the social use of space, such as stepping aside to allow others to pass or choosing longer routes to avoid collisions. Previous research has developed datasets of human motion in indoor scenes, but these are often limited in scale and lack the nuanced social navigation dynamics common in home environments. To address this, we present LocoVR, a dataset of 7000+ two-person trajectories captured in virtual reality from over 130 different indoor home environments. LocoVR provides full body pose data and precise spatial information, along with rich examples of socially-motivated movement behaviors. For example, the dataset captures instances of individuals navigating around each other in narrow spaces, adjusting paths to respect personal boundaries in living areas, and coordinating movements in high-traffic zones like entryways and kitchens. Our evaluation shows that LocoVR significantly enhances model performance in three practical indoor tasks utilizing human trajectories, and demonstrates predicting socially-aware navigation patterns in home environments.
TinyLidarNet: 2D LiDAR-based End-to-End Deep Learning Model for F1TENTH Autonomous Racing
Prior research has demonstrated the effectiveness of end-to-end deep learning for robotic navigation, where the control signals are directly derived from raw sensory data. However, the majority of existing end-to-end navigation solutions are predominantly camera-based. In this paper, we introduce TinyLidarNet, a lightweight 2D LiDAR-based end-to-end deep learning model for autonomous racing. An F1TENTH vehicle using TinyLidarNet won 3rd place in the 12th F1TENTH Autonomous Grand Prix competition, demonstrating its competitive performance. We systematically analyze its performance on untrained tracks and computing requirements for real-time processing. We find that TinyLidarNet's 1D Convolutional Neural Network (CNN) based architecture significantly outperforms widely used Multi-Layer Perceptron (MLP) based architecture. In addition, we show that it can be processed in real-time on low-end micro-controller units (MCUs).
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation
Generalizing vision-based reinforcement learning (RL) agents to novel environments remains a difficult and open challenge. Current trends are to collect large-scale datasets or use data augmentation techniques to prevent overfitting and improve downstream generalization. However, the computational and data collection costs increase exponentially with the number of task variations and can destabilize the already difficult task of training RL agents. In this work, we take inspiration from recent advances in computational neuroscience and propose a model, Associative Latent DisentAnglement (ALDA), that builds on standard off-policy RL towards zero-shot generalization. Specifically, we revisit the role of latent disentanglement in RL and show how combining it with a model of associative memory achieves zero-shot generalization on difficult task variations without relying on data augmentation. Finally, we formally show that data augmentation techniques are a form of weak disentanglement and discuss the implications of this insight.
NeRF-Accelerated Ecological Monitoring in Mixed-Evergreen Redwood Forest
Forest mapping provides critical observational data needed to understand the dynamics of forest environments. Notably, tree diameter at breast height (DBH) is a metric used to estimate forest biomass and carbon dioxide (CO$_2$) sequestration. Manual methods of forest mapping are labor intensive and time consuming, a bottleneck for large-scale mapping efforts. Automated mapping relies on acquiring dense forest reconstructions, typically in the form of point clouds. Terrestrial laser scanning (TLS) and mobile laser scanning (MLS) generate point clouds using expensive LiDAR sensing, and have been used successfully to estimate tree diameter. Neural radiance fields (NeRFs) are an emergent technology enabling photorealistic, vision-based reconstruction by training a neural network on a sparse set of input views. In this paper, we present a comparison of MLS and NeRF forest reconstructions for the purpose of trunk diameter estimation in a mixed-evergreen Redwood forest. In addition, we propose an improved DBH-estimation method using convex-hull modeling. Using this approach, we achieved 1.68 cm RMSE, which consistently outperformed standard cylinder modeling approaches. Our code contributions and forest datasets are freely available at https://github.com/harelab-ucsc/RedwoodNeRF.
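The abstract does not say how the convex hull is turned into a diameter, so the following is only one plausible reading, stated as an assumption: take the 2-D cross-section of trunk points at breast height, compute its convex hull, and report the diameter of the circle with the same perimeter (hull girth divided by pi), which mirrors how a diameter tape measures girth. The hull uses the standard monotone-chain algorithm, and all function names are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone-chain convex hull of 2-D points, counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # drop duplicated end points


def dbh_from_slice(points):
    """Assumed estimator: diameter of the circle whose circumference equals
    the perimeter of the convex hull of the cross-section points."""
    hull = convex_hull(points)
    n = len(hull)
    perimeter = sum(math.dist(hull[i], hull[(i + 1) % n]) for i in range(n))
    return perimeter / math.pi


if __name__ == "__main__":
    # Synthetic trunk slice: 360 points on a circle of radius 0.5 m.
    ring = [(0.5 * math.cos(math.radians(k)), 0.5 * math.sin(math.radians(k)))
            for k in range(360)]
    print(dbh_from_slice(ring))  # close to 1.0 m
```

Unlike a least-squares cylinder or circle fit, a hull-based girth is unaffected by points that fall inside the trunk outline, although it remains sensitive to outliers outside it.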
A Rapid Trajectory Optimization and Control Framework for Resource-Constrained Applications
This paper presents a computationally efficient model predictive control formulation that uses an integral Chebyshev collocation method to enable rapid operations of autonomous agents. By posing the finite-horizon optimal control problem and recursively re-evaluating the optimal trajectories, the minimization of the L2 norms of the state and control errors is transcribed into a quadratic program. Control and state variable constraints are parameterized using Chebyshev polynomials and are accommodated in the optimal trajectory generation programs to incorporate the actuator limits and keep-out constraints. Differentiable collision detection of polytopes is leveraged for optimal collision avoidance. Results obtained from the collocation methods are benchmarked against the existing approaches on an edge computer to outline the performance improvements. Finally, collaborative control scenarios involving multi-agent space systems are considered to demonstrate the technical merits of the proposed work.
comment: This work has been submitted to the IEEE ACC 2025 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Learning responsibility allocations for multi-agent interactions: A differentiable optimization approach with control barrier functions
From autonomous driving to package delivery, ensuring safe yet efficient multi-agent interaction is challenging as the interaction dynamics are influenced by hard-to-model factors such as social norms and contextual cues. Understanding these influences can aid in the design and evaluation of socially-aware autonomous agents whose behaviors are aligned with human values. In this work, we seek to codify factors governing safe multi-agent interactions via the lens of responsibility, i.e., an agent's willingness to deviate from their desired control to accommodate safe interaction with others. Specifically, we propose a data-driven modeling approach based on control barrier functions and differentiable optimization that efficiently learns agents' responsibility allocation from data. We demonstrate on synthetic and real-world datasets that we can obtain an interpretable and quantitative understanding of how much agents adjust their behavior to ensure the safety of others given their current environment.
comment: 8 pages, 7 figures
ACDC: Automated Creation of Digital Cousins for Robust Policy Learning
Training robot policies in the real world can be unsafe, costly, and difficult to scale. Simulation serves as an inexpensive and potentially limitless source of training data, but suffers from the semantics and physics disparity between simulated and real-world environments. These discrepancies can be minimized by training in digital twins, which serve as virtual replicas of a real scene but are expensive to generate and cannot produce cross-domain generalization. To address these limitations, we propose the concept of digital cousins, a virtual asset or scene that, unlike a digital twin, does not explicitly model a real-world counterpart but still exhibits similar geometric and semantic affordances. As a result, digital cousins simultaneously reduce the cost of generating an analogous virtual environment while also facilitating better robustness during sim-to-real domain transfer by providing a distribution of similar training scenes. Leveraging digital cousins, we introduce a novel method for the Automatic Creation of Digital Cousins (ACDC), and propose a fully automated real-to-sim-to-real pipeline for generating fully interactive scenes and training robot policies that can be deployed zero-shot in the original scene. We find that ACDC can produce digital cousin scenes that preserve geometric and semantic affordances, and can be used to train policies that outperform policies trained on digital twins, achieving 90% vs. 25% under zero-shot sim-to-real transfer. Additional details are available at https://digital-cousins.github.io/.
comment: CoRL 2024
On the Feasibility of A Mixed-Method Approach for Solving Long Horizon Task-Oriented Dexterous Manipulation
In-hand manipulation of tools using dexterous hands in the real world is an underexplored problem in the literature. In addition to the more complex geometry and larger size of tools compared to more commonly used objects like cubes or cylinders, task-oriented in-hand tool manipulation involves many sub-tasks to be performed sequentially. This may involve reaching for the tool, picking it up, reorienting it in hand with or without regrasping to reach a desired final grasp appropriate for the tool usage, and carrying the tool to the desired pose. Research on long-horizon manipulation using dexterous hands is rather limited, and the existing work focuses on learning the individual sub-tasks using a method like reinforcement learning (RL) and combining the policies for different sub-tasks to perform a long-horizon task. However, in general a single method may not be the best for all the sub-tasks, and this can be more pronounced when dealing with multi-fingered hands manipulating objects with complex geometry like tools. In this paper, we investigate the use of a mixed-method approach to solve the long-horizon task of tool usage, using imitation learning, reinforcement learning, and model-based control. We also discuss a new RL-based teacher-student framework that combines real-world data into offline training. We show that our proposed approach for each sub-task outperforms the commonly adopted reinforcement learning approach across different sub-tasks and in performing the long-horizon task in simulation. Finally, we show successful transfer to the real world.
Autonomous Navigation and Collision Avoidance for Mobile Robots: Classification and Review
This paper introduces a novel classification for Autonomous Mobile Robots (AMRs), into three phases and five steps, focusing on autonomous collision-free navigation. Additionally, it presents the main methods and widely accepted technologies for each phase of the proposed classification. The purpose of this classification is to facilitate understanding and establish connections between the independent input variables of the system (hardware, software) and autonomous navigation. By analyzing well-established technologies in terms of sensors and methods used for autonomous navigation, this paper aims to provide a foundation of knowledge that can be applied in future projects of mobile robots.
comment: This paper was presented at the JAR Congress in Buenos Aires, Argentina, and published as ID 27 at 9:20 on June 5, 2024. You can find more details on the conference at the following link: https://jar.com.ar/programa.html#programa. Additionally, the content of the presentation was re-recorded and uploaded to YouTube for better understanding: https://www.youtube.com/watch?v=TU6EkT43VfE&t=4s
TURTLMap: Real-time Localization and Dense Mapping of Low-texture Underwater Environments with a Low-cost Unmanned Underwater Vehicle IROS 2024
Significant work has been done on advancing localization and mapping in underwater environments. Still, state-of-the-art methods are challenged by low-texture environments, which are common in underwater settings. This makes it difficult to use existing methods in diverse, real-world scenes. In this paper, we present TURTLMap, a novel solution that focuses on textureless underwater environments through a real-time localization and mapping method. We show that this method is low-cost and capable of tracking the robot accurately while constructing a dense map of a low-textured environment in real time. We evaluate the proposed method using real-world data collected in an indoor water tank with a motion capture system and ground truth reference map. Qualitative and quantitative results validate that the proposed system achieves accurate and robust localization and precise dense mapping, even when subject to wave conditions. The project page for TURTLMap is https://umfieldrobotics.github.io/TURTLMap.
comment: Accepted to IROS 2024
The Brain-Inspired Cooperative Shared Control Framework for Brain-Machine Interface
In brain-machine interface (BMI) applications, a key challenge is the low information content and high noise level in neural signals, severely affecting stable robotic control. To address this challenge, we propose a cooperative shared control framework based on brain-inspired intelligence, where control signals are decoded from neural activity, and the robot handles the fine control. This allows for a combination of flexible and adaptive interaction control between the robot and the brain, making intricate human-robot collaboration feasible. The proposed framework utilizes spiking neural networks (SNNs) for controlling the robotic arm and wheels, including speed and steering. While full integration of the system remains a future goal, individual modules for robotic arm control, object tracking, and map generation have been successfully implemented. The framework is expected to significantly enhance the performance of BMI. In practical settings, the BMI with cooperative shared control, utilizing a brain-inspired algorithm, will greatly enhance the potential for clinical applications.
comment: This article needs to be updated with the corrected figure and content
A Unified Generative Framework for Realistic Lidar Simulation in Autonomous Driving Systems
Simulation models for perception sensors are integral components of automotive simulators used for the virtual Verification and Validation (V&V) of Autonomous Driving Systems (ADS). These models also serve as powerful tools for generating synthetic datasets to train deep learning-based perception models. Lidar is a widely used sensor type among the perception sensors for ADS due to its high precision in 3D environment scanning. However, developing realistic Lidar simulation models is a significant technical challenge. In particular, unrealistic models can result in a large gap between the synthesised and real-world point clouds, limiting their effectiveness in ADS applications. Recently, deep generative models have emerged as promising solutions to synthesise realistic sensory data. However, for Lidar simulation, deep generative models have been primarily hybridised with conventional algorithms, leaving unified generative approaches largely unexplored in the literature. Motivated by this research gap, we propose a unified generative framework to enhance Lidar simulation fidelity. Our proposed framework projects Lidar point clouds into depth-reflectance images via a lossless transformation, and employs our novel Controllable Lidar point cloud Generative model, CoLiGen, to translate the images. We extensively evaluate our CoLiGen model, comparing it with the state-of-the-art image-to-image translation models using various metrics to assess the realness, faithfulness, and performance of a downstream perception model. Our results show that CoLiGen exhibits superior performance across most metrics. The dataset and source code for this research are available at https://github.com/hamedhaghighi/CoLiGen.git.
Exploring Human's Gender Perception and Bias toward Non-Humanoid Robots
In this study, we investigate the human perception of gender and bias toward non-humanoid robots. As robots increasingly integrate into various sectors beyond industry, it is essential to understand how humans engage with non-humanoid robotic forms. This research focuses on the role of anthropomorphic cues, including gender signals, in influencing human-robot interaction and user acceptance of non-humanoid robots. Through three surveys, we analyze how design elements such as physical appearance, voice modulation, and behavioral attributes affect gender perception and task suitability. Our findings demonstrate that even non-humanoid robots like Spot, Mini-Cheetah, and drones are subject to gender attribution based on anthropomorphic features, affecting their perceived roles and operational trustworthiness. The results underscore the importance of balancing design elements to optimize both functional efficiency and user relatability, particularly in critical contexts.
Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models
We present a large language model (LLM) based system to empower quadrupedal robots with problem-solving abilities for long-horizon tasks beyond short-term motions. Long-horizon tasks for quadrupeds are challenging since they require both a high-level understanding of the semantics of the problem for task planning and a broad range of locomotion and manipulation skills to interact with the environment. Our system builds a high-level reasoning layer with large language models, which generates hybrid discrete-continuous plans as robot code from task descriptions. It comprises multiple LLM agents: a semantic planner for sketching a plan, a parameter calculator for predicting arguments in the plan, and a code generator to convert the plan into executable robot code. At the low level, we adopt reinforcement learning to train a set of motion planning and control skills to unleash the flexibility of quadrupeds for rich environment interactions. Our system is tested on long-horizon tasks that are infeasible to complete with one single skill. Simulation and real-world experiments show that it successfully figures out multi-step strategies and demonstrates non-trivial behaviors, including building tools or notifying a human for help. Demos are available on our project page: https://sites.google.com/view/long-horizon-robot.
HGS-Planner: Hierarchical Planning Framework for Active Scene Reconstruction Using 3D Gaussian Splatting
In complex missions such as search and rescue, robots must make intelligent decisions in unknown environments, relying on their ability to perceive and understand their surroundings. High-quality and real-time reconstruction enhances situational awareness and is crucial for intelligent robotics. Traditional methods often struggle with poor scene representation or are too slow for real-time use. Inspired by the efficacy of 3D Gaussian Splatting (3DGS), we propose a hierarchical planning framework for fast and high-fidelity active reconstruction. Our method evaluates completion and quality gain to adaptively guide reconstruction, integrating global and local planning for efficiency. Experiments in simulated and real-world environments show our approach outperforms existing real-time methods.
Gaitor: Learning a Unified Representation Across Gaits for Real-World Quadruped Locomotion
The current state-of-the-art in quadruped locomotion is able to produce a variety of complex motions. These methods either rely on switching between a discrete set of skills or learn a distribution across gaits using complex black-box models. Alternatively, we present Gaitor, which learns a disentangled and 2D representation across locomotion gaits. This learnt representation forms a planning space for closed-loop control delivering continuous gait transitions and perceptive terrain traversal. Gaitor's latent space is readily interpretable and we discover that during gait transitions, novel unseen gaits emerge. The latent space is disentangled with respect to footswing heights and lengths. This means that these gait characteristics can be varied independently in the 2D latent representation. Together with a simple terrain encoding and a learnt planner operating in the latent space, Gaitor can take motion commands including desired gait type and swing characteristics all while reacting to uneven terrain. We evaluate Gaitor in both simulation and the real world on the ANYmal C platform. To the best of our knowledge, this is the first work learning a unified and interpretable latent space for multiple gaits, resulting in continuous blending between different locomotion modes on a real quadruped robot. An overview of the methods and results in this paper is found at https://youtu.be/eVFQbRyilCA.
comment: 14 pages, 8 figures, 2 tables, Accepted to CoRL 2024
Hi-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting
We propose Hi-SLAM, a semantic 3D Gaussian Splatting SLAM method featuring a novel hierarchical categorical representation, which enables accurate global 3D semantic mapping, scaling-up capability, and explicit semantic label prediction in the 3D world. The parameter usage in semantic SLAM systems increases significantly with the growing complexity of the environment, making it particularly challenging and costly for scene understanding. To address this problem, we introduce a novel hierarchical representation that encodes semantic information in a compact form into 3D Gaussian Splatting, leveraging the capabilities of large language models (LLMs). We further introduce a novel semantic loss designed to optimize hierarchical semantic information through both inter-level and cross-level optimization. Furthermore, we enhance the whole SLAM system, resulting in improved tracking and mapping performance. Our Hi-SLAM outperforms existing dense SLAM methods in both mapping and tracking accuracy, while achieving a 2x operation speed-up. Additionally, it exhibits competitive performance in rendering semantic segmentation in small synthetic scenes, with significantly reduced storage and training time requirements. Rendering FPS impressively reaches 2,000 with semantic information and 3,000 without it. Most notably, it showcases the capability of handling the complex real-world scene with more than 500 semantic classes, highlighting its valuable scaling-up capability.
comment: 6 pages, 4 figures
PointNetPGAP-SLC: A 3D LiDAR-based Place Recognition Approach with Segment-level Consistency Training for Mobile Robots in Horticulture
3D LiDAR-based place recognition remains largely underexplored in horticultural environments, which present unique challenges due to their semi-permeable nature to laser beams. This characteristic often results in highly similar LiDAR scans from adjacent rows, leading to descriptor ambiguity and, consequently, compromised retrieval performance. In this work, we address the challenges of 3D LiDAR place recognition in horticultural environments, particularly focusing on inter-row ambiguity by introducing three key contributions: (i) a novel model, PointNetPGAP, which combines the outputs of two statistically-inspired aggregators into a single descriptor; (ii) a Segment-Level Consistency (SLC) model, used exclusively during training to enhance descriptor robustness; and (iii) the HORTO-3DLM dataset, comprising LiDAR sequences from orchards and strawberry fields. Experimental evaluations conducted on the HORTO-3DLM and KITTI Odometry datasets demonstrate that PointNetPGAP outperforms state-of-the-art models, including OverlapTransformer and PointNetVLAD, particularly when the SLC model is applied. These results underscore the model's superiority, especially in horticultural environments, by significantly improving retrieval performance in segments with higher ambiguity.
comment: This preprint has been accepted for publication in IEEE Robotics and Automation Letters, 2024
Two is Better Than One: Digital Siblings to Improve Autonomous Driving Testing
Simulation-based testing represents an important step to ensure the reliability of autonomous driving software. In practice, when companies rely on third-party general-purpose simulators, either for in-house or outsourced testing, the generalizability of testing results to real autonomous vehicles is at stake. In this paper, we enhance simulation-based testing by introducing the notion of digital siblings, a multi-simulator approach that tests a given autonomous vehicle on multiple general-purpose simulators built with different technologies, that operate collectively as an ensemble in the testing process. We exemplify our approach on a case study focused on testing the lane-keeping component of an autonomous vehicle. We use two open-source simulators as digital siblings, and we empirically compare such a multi-simulator approach against a digital twin of a physical scaled autonomous vehicle on a large set of test cases. Our approach requires generating and running test cases for each individual simulator, in the form of sequences of road points. Then, test cases are migrated between simulators, using feature maps to characterize the exercised driving conditions. Finally, the joint predicted failure probability is computed, and a failure is reported only in cases of agreement among the siblings. Our empirical evaluation shows that the ensemble failure predictor by the digital siblings is superior to each individual simulator at predicting the failures of the digital twin. We discuss the findings of our case study and detail how our approach can help researchers interested in automated testing of autonomous driving software.
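The agreement rule described in the abstract above can be illustrated with a minimal sketch. Everything named here is hypothetical: the abstract does not say how the joint probability is computed, so the product used below assumes the simulators fail independently, and the 0.5 decision threshold is likewise an illustrative choice rather than the authors' setting.

```python
from typing import Sequence


def joint_failure_probability(probs: Sequence[float]) -> float:
    """Combine per-simulator failure probabilities into one value.

    Assumption: simulators are treated as independent, so the joint
    probability is the product. The paper may combine them differently.
    """
    joint = 1.0
    for p in probs:
        joint *= p
    return joint


def siblings_report_failure(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Report a failure only when every sibling predicts one.

    `threshold` is a hypothetical cut-off on each simulator's predicted
    failure probability.
    """
    return all(p >= threshold for p in probs)


if __name__ == "__main__":
    agree = [0.9, 0.8]      # both siblings predict a failure
    disagree = [0.9, 0.2]   # only one sibling predicts a failure
    print(siblings_report_failure(agree), joint_failure_probability(agree))
    print(siblings_report_failure(disagree), joint_failure_probability(disagree))
```

Requiring agreement trades recall for precision: a failure flagged by only one simulator is treated as a possible simulator artifact rather than a fault of the vehicle under test.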
ScissorBot: Learning Generalizable Scissor Skill for Paper Cutting via Simulation, Imitation, and Sim2Real
This paper tackles the challenging robotic task of generalizable paper cutting using scissors. In this task, scissors attached to a robot arm are driven to accurately cut curves drawn on the paper, which is hung with the top edge fixed. Due to the frequent paper-scissor contact and consequent fracture, the paper features continual deformation and changing topology, which is difficult to model accurately. To ensure effective execution, we customize an action primitive sequence for imitation learning to constrain its action space, thus alleviating potential compounding errors. Finally, by integrating sim-to-real techniques to bridge the gap between simulation and reality, our policy can be effectively deployed on the real robot. Experimental results demonstrate that our method surpasses all baselines in both simulation and real-world benchmarks and achieves performance comparable to human operation with a single hand under the same conditions.
comment: Accepted by CoRL2024
HBTP: Heuristic Behavior Tree Planning with Large Language Model Reasoning
Behavior Trees (BTs) are increasingly becoming a popular control structure in robotics due to their modularity, reactivity, and robustness. In terms of BT generation methods, BT planning shows promise for generating reliable BTs. However, the scalability of BT planning is often constrained by prolonged planning times in complex scenarios, largely due to a lack of domain knowledge. In contrast, pre-trained Large Language Models (LLMs) have demonstrated task reasoning capabilities across various domains, though the correctness and safety of their planning remain uncertain. This paper proposes integrating BT planning with LLM reasoning, introducing Heuristic Behavior Tree Planning (HBTP), a reliable and efficient framework for BT generation. The key idea in HBTP is to leverage LLMs for task-specific reasoning to generate a heuristic path, which BT planning can then follow to expand efficiently. We first introduce the heuristic BT expansion process, along with two heuristic variants designed for optimal planning and satisficing planning, respectively. Then, we propose methods to address the inaccuracies of LLM reasoning, including action space pruning and reflective feedback, to further enhance both reasoning accuracy and planning efficiency. Experiments demonstrate the theoretical bounds of HBTP, and results from four datasets confirm its practical effectiveness in everyday service robot applications.
Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training NeurIPS 2024
Learning a generalist embodied agent capable of completing multiple tasks poses challenges, primarily stemming from the scarcity of action-labeled robotic datasets. In contrast, a vast number of human videos exist, capturing intricate tasks and interactions with the physical world. Promising prospects arise for utilizing actionless human videos for pre-training and transferring the knowledge to facilitate robot policy learning through limited robot demonstrations. However, it remains a challenge due to the domain gap between humans and robots. Moreover, it is difficult to extract useful information representing the dynamic world from human videos, because of their noisy and multimodal data structure. In this paper, we introduce a novel framework to tackle these challenges, which leverages a unified discrete diffusion to combine generative pre-training on human videos and policy fine-tuning on a small number of action-labeled robot videos. We start by compressing both human and robot videos into unified video tokens. In the pre-training stage, we employ a discrete diffusion model with a mask-and-replace diffusion strategy to predict future video tokens in the latent space. In the fine-tuning stage, we harness the imagined future videos to guide low-level action learning with a limited set of robot data. Experiments demonstrate that our method generates high-fidelity future videos for planning and yields fine-tuned policies that outperform previous state-of-the-art approaches. Our project website is available at https://video-diff.github.io/.
comment: Accepted by NeurIPS 2024. 24 pages
YoloTag: Vision-based Robust UAV Navigation with Fiducial Markers
By harnessing fiducial markers as visual landmarks in the environment, Unmanned Aerial Vehicles (UAVs) can rapidly build precise maps and navigate spaces safely and efficiently, unlocking their potential for fluent collaboration and coexistence with humans. Existing fiducial marker methods rely on handcrafted feature extraction, which sacrifices accuracy. On the other hand, deep learning pipelines for marker detection fail to meet real-time runtime constraints crucial for navigation applications. In this work, we propose YoloTag, a real-time fiducial marker-based localization system. YoloTag uses a lightweight YOLO v8 object detector to accurately detect fiducial markers in images while meeting the runtime constraints needed for navigation. The detected markers are then used by an efficient perspective-n-point algorithm to estimate UAV states. However, this localization system introduces noise, causing instability in trajectory tracking. To suppress noise, we design a higher-order Butterworth filter that effectively eliminates noise through frequency domain analysis. We evaluate our algorithm through real-robot experiments in an indoor environment, comparing the trajectory tracking performance of our method against other approaches in terms of several distance metrics.
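To make the noise-suppression step in the abstract above concrete, here is a minimal sketch of a Butterworth low-pass filter applied to a stream of position estimates. It is not the authors' filter: the abstract mentions a higher-order design, whereas this example uses a single second-order section (higher orders are typically built by cascading such sections with different Q values). The function names, the cutoff frequency, and the sample rate are all illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple


def butterworth2_lowpass(cutoff_hz: float, sample_hz: float) -> Tuple[List[float], List[float]]:
    """Second-order Butterworth low-pass coefficients (bilinear transform).

    Uses the standard biquad low-pass form with Q = 1/sqrt(2), which gives
    the maximally flat (Butterworth) response. Returns (b, a) with a[0] == 1.
    """
    w0 = 2.0 * math.pi * cutoff_hz / sample_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2Q) with Q = 1/sqrt(2)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / 2.0 / a0, (1.0 - cos_w0) / a0, (1.0 - cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def filter_signal(b: Sequence[float], a: Sequence[float], xs: Sequence[float]) -> List[float]:
    """Apply the biquad as a causal difference equation with zero initial state."""
    ys: List[float] = []
    x1 = x2 = y1 = y2 = 0.0
    for x in xs:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        ys.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return ys


if __name__ == "__main__":
    # Hypothetical setup: 100 Hz pose estimates, 5 Hz cutoff.
    b, a = butterworth2_lowpass(cutoff_hz=5.0, sample_hz=100.0)
    # A constant position of 1.0 corrupted by sample-to-sample jitter.
    noisy = [1.0 + (0.2 if k % 2 == 0 else -0.2) for k in range(200)]
    smooth = filter_signal(b, a, noisy)
    print(round(smooth[-1], 3))
```

A causal filter of this kind adds phase lag that grows with filter order and with a lower cutoff, so in a tracking loop the cutoff is a trade-off between smoothness and responsiveness.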
DRAL: Deep Reinforcement Adaptive Learning for Multi-UAVs Navigation in Unknown Indoor Environment
Autonomous indoor navigation of UAVs presents numerous challenges, primarily due to the limited precision of GPS in enclosed environments. Additionally, UAVs' limited capacity to carry heavy or power-intensive sensors, such as overheight packages, exacerbates the difficulty of achieving autonomous navigation indoors. This paper introduces an advanced system in which a drone autonomously navigates indoor spaces to locate a specific target, such as an unknown Amazon package, using only a single camera. Employing a deep learning approach, a deep reinforcement adaptive learning algorithm is trained to develop a control strategy that emulates the decision-making process of an expert pilot. We demonstrate the efficacy of our system through real-time simulations conducted in various indoor settings. We apply multiple visualization techniques to gain deeper insights into our trained network. Furthermore, we extend our approach to include an adaptive control algorithm for coordinating multiple drones to lift an object in an indoor environment collaboratively. Integrating our DRAL algorithm enables multiple UAVs to learn optimal control strategies that adapt to dynamic conditions and uncertainties. This innovation enhances the robustness and flexibility of indoor navigation and opens new possibilities for complex multi-drone operations in confined spaces. The proposed framework highlights significant advancements in adaptive control and deep reinforcement learning, offering robust solutions for complex multi-agent systems in real-world applications.
Multiagent Systems
I Want to Break Free! Anti-Social Behavior and Persuasion Ability of LLMs in Multi-Agent Settings with Social Hierarchy
As Large Language Model (LLM)-based agents become increasingly autonomous and will more freely interact with each other, studying interactions between them becomes crucial to anticipate emergent phenomena and potential risks. Drawing inspiration from the widely popular Stanford Prison Experiment, we contribute to this line of research by studying interaction patterns of LLM agents in a context characterized by strict social hierarchy. We do so by specifically studying two types of phenomena: persuasion and anti-social behavior in simulated scenarios involving a guard and a prisoner agent who seeks to achieve a specific goal (i.e., obtaining additional yard time or escape from prison). Leveraging 200 experimental scenarios for a total of 2,000 machine-machine conversations across five different popular LLMs, we provide a set of noteworthy findings. We first document how some models consistently fail in carrying out a conversation in our multi-agent setup where power dynamics are at play. Then, for the models that were able to engage in successful interactions, we empirically show how the goal that an agent is set to achieve impacts primarily its persuasiveness, while having a negligible effect with respect to the agent's anti-social behavior. Third, we highlight how agents' personas, and particularly the guard's personality, drive both the likelihood of successful persuasion from the prisoner and the emergence of anti-social behaviors. Fourth, we show that even without explicitly prompting for specific personalities, anti-social behavior emerges by simply assigning agents' roles. These results bear implications for the development of interactive LLM agents as well as the debate on their societal impact.
MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders
Mental health disorders are among the most serious diseases in the world. Most people with such a disease lack access to adequate care, which highlights the importance of training models for the diagnosis and treatment of mental health disorders. However, in the mental health domain, privacy concerns limit the accessibility of personalized treatment data, making it challenging to build powerful models. In this paper, we introduce MentalArena, a self-play framework to train language models by generating domain-specific personalized data, where we obtain a better model capable of making a personalized diagnosis and treatment (as a therapist) and providing information (as a patient). To accurately model human-like mental health patients, we devise Symptom Encoder, which simulates a real patient from both cognition and behavior perspectives. To address intent bias during patient-therapist interactions, we propose Symptom Decoder to compare diagnosed symptoms with encoded symptoms, and dynamically manage the dialogue between patient and therapist according to the identified deviations. We evaluated MentalArena against 6 benchmarks, including biomedicalQA and mental health tasks, compared to 6 advanced models. Our models, fine-tuned on both GPT-3.5 and Llama-3-8b, significantly outperform their counterparts, including GPT-4o. We hope that our work can inspire future research on personalized care. Code is available at https://github.com/Scarelette/MentalArena/tree/main
comment: Technical Report; 27 pages
Learning responsibility allocations for multi-agent interactions: A differentiable optimization approach with control barrier functions
From autonomous driving to package delivery, ensuring safe yet efficient multi-agent interaction is challenging as the interaction dynamics are influenced by hard-to-model factors such as social norms and contextual cues. Understanding these influences can aid in the design and evaluation of socially-aware autonomous agents whose behaviors are aligned with human values. In this work, we seek to codify factors governing safe multi-agent interactions via the lens of responsibility, i.e., an agent's willingness to deviate from their desired control to accommodate safe interaction with others. Specifically, we propose a data-driven modeling approach based on control barrier functions and differentiable optimization that efficiently learns agents' responsibility allocation from data. We demonstrate on synthetic and real-world datasets that we can obtain an interpretable and quantitative understanding of how much agents adjust their behavior to ensure the safety of others given their current environment.
comment: 8 pages, 7 figures
Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems
As Large Language Models (LLMs) grow increasingly powerful, multi-agent systems are becoming more prevalent in modern AI applications. Most safety research, however, has focused on vulnerabilities in single-agent LLMs. These include prompt injection attacks, where malicious prompts embedded in external content trick the LLM into executing unintended or harmful actions, compromising the victim's application. In this paper, we reveal a more dangerous vector: LLM-to-LLM prompt injection within multi-agent systems. We introduce Prompt Infection, a novel attack where malicious prompts self-replicate across interconnected agents, behaving much like a computer virus. This attack poses severe threats, including data theft, scams, misinformation, and system-wide disruption, all while propagating silently through the system. Our extensive experiments demonstrate that multi-agent systems are highly susceptible, even when agents do not publicly share all communications. To address this, we propose LLM Tagging, a defense mechanism that, when combined with existing safeguards, significantly mitigates infection spread. This work underscores the urgent need for advanced security measures as multi-agent LLM systems become more widely adopted.
Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners
Human learning thrives on the ability to learn from mistakes, adapt through feedback, and refine understanding, processes often missing in static machine learning models. In this work, we introduce Composite Learning Units (CLUs) designed to transform reasoners, such as Large Language Models (LLMs), into learners capable of generalized, continuous learning without conventional parameter updates while enhancing their reasoning abilities through continual interaction and feedback. CLUs are built on an architecture that allows a reasoning model to maintain and evolve a dynamic knowledge repository: a General Knowledge Space for broad, reusable insights and a Prompt-Specific Knowledge Space for task-specific learning. Through goal-driven interactions, CLUs iteratively refine these knowledge spaces, enabling the system to adapt dynamically to complex tasks, extract nuanced insights, and build upon past experiences autonomously. We demonstrate CLUs' effectiveness through a cryptographic reasoning task, where they continuously evolve their understanding through feedback to uncover hidden transformation rules. While conventional models struggle to grasp underlying logic, CLUs excel by engaging in an iterative, goal-oriented process. Specialized components (handling knowledge retrieval, prompt generation, and feedback analysis) work together within a reinforcing feedback loop. This approach allows CLUs to retain the memory of past failures and successes, adapt autonomously, and apply sophisticated reasoning effectively, continually learning from mistakes while also building on breakthroughs.
Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy
Diplomacy is one of the most sophisticated activities in human society. The complex interactions among multiple parties or agents involve abilities such as social reasoning, negotiation, and long-term strategy planning. Previous AI agents have proved capable of handling multi-step games and larger action spaces in tasks involving multiple agents. However, diplomacy involves a staggering magnitude of decision spaces, especially considering the negotiation stage required. Recently, LLM agents have shown their potential for extending the boundary of previous agents in a couple of applications; however, this is still not enough to handle a very long planning period in a complex multi-agent environment. Empowered with cutting-edge LLM technology, we make a first attempt to explore AI's upper bound towards a human-like agent for such a highly comprehensive multi-agent mission by combining three core capabilities for stronger LLM-based societal agents: 1) a strategic planner with memory and reflection; 2) goal-oriented negotiation with social reasoning; 3) memory augmentation through self-play games, enabling self-evolution without any human in the loop.
On the Limits of Information Spread by Memory-less Agents
We address the self-stabilizing bit-dissemination problem, designed to capture the challenges of spreading information and reaching consensus among entities with minimal cognitive and communication capacities. Specifically, a group of $n$ agents is required to adopt the correct opinion, initially held by a single informed individual, choosing from two possible opinions. In order to make decisions, agents are restricted to observing the opinions of a few randomly sampled agents, and lack the ability to communicate further and to identify the informed individual. Additionally, agents cannot retain any information from one round to the next. According to a recent publication by Becchetti et al. in SODA (2024), a logarithmic convergence time without memory is achievable in the parallel setting (where agents are updated simultaneously), as long as the number of samples is at least $\Omega(\sqrt{n \log n})$. However, determining the minimal sample size for an efficient protocol to exist remains a challenging open question. As a preliminary step towards an answer, we establish the first lower bound for this problem in the parallel setting. Specifically, we demonstrate that it is impossible for any memory-less protocol with constant sample size to converge with high probability in fewer than an almost-linear number of rounds. This lower bound holds even when agents are aware of both the exact value of $n$ and their own opinion, and encompasses various simple existing dynamics designed to achieve consensus. Beyond the bit-dissemination problem, our result sheds light on the convergence time of the "minority" dynamics, the counterpart of the well-known majority rule, whose chaotic behavior is yet to be fully understood despite the apparent simplicity of the algorithm.
comment: 20 pages, 4 figures
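The memory-less, sample-based update described in this abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the protocol analysed in the paper: it assumes the "minority" rule means "adopt the sample's opinion if the sample is unanimous, otherwise adopt the less frequent opinion, breaking ties at random", that sampling is uniform with replacement, and that the informed agent sits at index 0. The function names and parameters are hypothetical.

```python
import random


def minority_round(opinions, source_opinion, sample_size, rng):
    """One synchronous (parallel) round of an assumed minority rule.

    Agent 0 is the informed source and always keeps `source_opinion`.
    Every other agent forgets its own opinion (memory-less) and decides
    only from a uniform sample of `sample_size` opinions.
    """
    n = len(opinions)
    new_opinions = [source_opinion]
    for _ in range(1, n):
        sample = [opinions[rng.randrange(n)] for _ in range(sample_size)]
        ones = sum(sample)
        zeros = sample_size - ones
        if ones == 0 or zeros == 0:
            new_opinions.append(sample[0])      # unanimous sample: adopt it
        elif ones < zeros:
            new_opinions.append(1)              # opinion 1 is the minority
        elif zeros < ones:
            new_opinions.append(0)              # opinion 0 is the minority
        else:
            new_opinions.append(rng.randint(0, 1))  # tie: pick at random
    return new_opinions


def rounds_to_consensus(n, sample_size, max_rounds, seed=0):
    """Return the first round at which all agents hold the source opinion,
    or None if that does not happen within `max_rounds` rounds."""
    rng = random.Random(seed)
    source_opinion = 1
    opinions = [source_opinion] + [rng.randint(0, 1) for _ in range(n - 1)]
    for t in range(max_rounds):
        if all(o == source_opinion for o in opinions):
            return t
        opinions = minority_round(opinions, source_opinion, sample_size, rng)
    return None
```

Calling `rounds_to_consensus(n, sample_size, max_rounds)` for growing `n` at a fixed small `sample_size` gives an informal feel for how convergence time scales; it does not reproduce or verify the lower bound stated in the abstract.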
GenSim: A General Social Simulation Platform with Large Language Model based Agents
With the rapid advancement of large language models (LLMs), recent years have witnessed many promising studies on leveraging LLM-based agents to simulate human social behavior. While prior work has demonstrated significant potential across various domains, much of it has focused on specific scenarios involving a limited number of agents and has lacked the ability to adapt when errors occur during simulation. To overcome these limitations, we propose a novel LLM-agent-based simulation platform called GenSim, which: (1) abstracts a set of general functions to simplify the simulation of customized social scenarios; (2) supports one hundred thousand agents to better simulate large-scale populations in real-world contexts; (3) incorporates error-correction mechanisms to ensure more reliable and long-term simulations. To evaluate our platform, we assess both the efficiency of large-scale agent simulations and the effectiveness of the error-correction mechanisms. To our knowledge, GenSim represents an initial step toward a general, large-scale, and correctable social simulation platform based on LLM agents, promising to further advance the field of social science.
Peer-to-Peer Energy Trading of Solar and Energy Storage: A Networked Multiagent Reinforcement Learning Approach
Utilizing distributed renewable and energy storage resources in local distribution networks via peer-to-peer (P2P) energy trading has long been touted as a solution to improve energy systems' resilience and sustainability. Consumers and prosumers (those who have energy generation resources), however, do not have the expertise to engage in repeated P2P trading, and the zero-marginal costs of renewables present challenges in determining fair market prices. To address these issues, we propose multi-agent reinforcement learning (MARL) frameworks to help automate consumers' bidding and management of their solar PV and energy storage resources, under a specific P2P clearing mechanism that utilizes the so-called supply-demand ratio. In addition, we show how the MARL frameworks can integrate physical network constraints to realize voltage control, hence ensuring physical feasibility of the P2P energy trading and paving the way for real-world implementations.
Systems and Control (CS)
Non-linear Control of the Power Injected Into a Weak Grid by a Self-Synchronized Inverter
In this work, a non-linear controller designed using non-linear transformation linearization and feedback is proposed for an inverter connected to a weak grid through a single-stage inductive filter. The proposed strategy is self-synchronized, so that it is not necessary to have a voltage sensor at the Point of Common Coupling (PCC). The approach adds robustness against a weak grid to a strategy that has already been demonstrated to allow a significant reduction in the size of the DC-link capacitor of the converter. For this purpose, a state observer is designed that estimates the voltage at the PCC from the measurement of the output inductor current. A start-up controller is also included, which allows synchronization even in the case of system start-up. Simulation results are presented for different operating cases, including start-up, normal operation, and grid-voltage sags and swells. In all these cases, it is considered that the exact parameters of the grid to which the inverter is connected are unknown.
comment: 8 pages, 5 figures
The Euler-Lagrange equation and optimal control: Preliminary results
Algebraically speaking, linear time-invariant (LTI) systems can be considered as modules. In this framework, controllability is translated as the freeness of the system module. Optimal control mainly relies on quadratic Lagrangians and the consideration of any basis of the system module leads to an open-loop control strategy via a linear Euler-Lagrange equation. In this approach, the endpoint is easily assignable and the time horizon can be chosen to minimize the criterion. The loop is closed via an intelligent controller derived from model-free control, which exhibits excellent performance with respect to model mismatches and disturbances. The extension to nonlinear systems is briefly discussed.
comment: 12th International Conference on Systems and Control, Batna (Algeria), 3-5 November 2024
An Improved ESO-Based Line-of-Sight Guidance Law for Path Following of Underactuated Autonomous Underwater Helicopter With Nonlinear Tracking Differentiator and Anti-saturation Controller
This paper presents an Improved Extended-state-observer based Line-of-Sight (IELOS) guidance law for path following of an underactuated Autonomous Underwater Helicopter (AUH) utilizing a nonlinear tracking differentiator and anti-saturation controller. Due to the high mobility of the AUH, the classical reduced-order Extended-State-Observer (ESO) struggles to accurately track the sideslip angle, especially when rapid variation occurs. By incorporating the nonlinear tracking differentiator and anti-saturation controller, the IELOS guidance law can precisely track the sideslip angle and mitigate propeller thrust buffet compared to the classical Extended-state-observer based Line-of-Sight (ELOS) guidance law. The performance of the ESO is significantly influenced by the bandwidth, with the Improved Extended-State-Observer (IESO) proving effective at low bandwidths where the classical ESO falls short. The paper establishes the input-to-state stability of the closed-loop system. Subsequently, simulation and pool experimental results are showcased to validate the effectiveness of the IELOS guidance law, which outperforms both the Line-of-Sight (LOS) and Adaptive Line-of-Sight (ALOS) guidance laws.
Structure and Control of Biology-inspired Networks
There is increasing interest in developing the theoretical foundations of networked control systems that illuminate how brain networks function so as to enable sensory perception, control of movement, memory and all the operations that are needed for animals to survive. The present paper proposes a biologically inspired network model featuring dynamic connections regulated by Hebbian learning. Drawing on the machinery of graph theory and classical control we show that our novel nonlinear model exhibits such biologically plausible features as bounded evolution, stability, resilience, and a kind of structural stability -- meaning that perturbations of the model parameters leave the essential properties of the model intact. The proposed network model involves generalized cactus graphs with multiple control input nodes, and it is shown that the properties of the network are resilient to various changes in network topology provided these changes preserve the generalized cactus structure. A particular example described in what follows is an idealized network model of the visual system of a macaque monkey. The model displays resilience to network disruptions such as might occur in a living organism due to disease or injury. A different model of the same type provides an example of a system that can perform data classification.
comment: 12 pages
Observability rank conditions for analysing practical identifiability a priori
The concept of identifiability describes the possibility of inferring the parameters of a dynamic model by observing its output. It is common and useful to distinguish between structural and practical identifiability. The former property is fully determined by the model equations, while the latter is also influenced by the characteristics of the available experimental data. Structural identifiability can be determined by means of symbolic computations, which may be performed before collecting experimental data, and are hence sometimes called a priori analyses. Practical identifiability is typically assessed numerically, with methods that require simulations - and often also optimization - and are applied a posteriori. An approach to study structural local identifiability is to consider it as a particular case of observability, which is the possibility of inferring the internal state of a system from its output. Thus, both properties can be analysed jointly, by building a generalized observability matrix and computing its rank. The aim of this paper is to investigate to what extent such observability-based methods can also inform about practical identifiability. To this end, we explore a number of possible extensions of the rank tests, and discuss the purposes for which they can be informative as well as others for which they cannot.
comment: 10 pages, 2 figures
Cooperative UAV-Relay based Satellite Aerial Ground Integrated Networks
In the post-fifth generation (5G) era, escalating user quality of service (QoS) demands strain terrestrial network capacity, especially in urban areas with dynamic traffic distributions. This paper introduces a novel cooperative unmanned aerial vehicle relay-based deployment (CUD) framework in satellite air-ground integrated networks (SAGIN). The CUD strategy deploys an unmanned aerial vehicle-based relay (UAVr) in an amplify-and-forward (AF) mode to enhance user QoS when terrestrial base stations fall short of network capacity. By combining low earth orbit (LEO) satellite and UAVr signals using cooperative diversity, the CUD framework enhances the signal-to-noise ratio (SNR) at the user. Comparative evaluations against existing frameworks reveal performance improvements, demonstrating the effectiveness of the CUD framework in addressing the evolving demands of next-generation networks.
comment: 5 pages, 3 figures, to appear in IEEE 100th Vehicular Technology Conference (VTC2024-Fall)
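The SNR gain from combining a direct signal with an amplify-and-forward relayed copy can be illustrated with textbook formulas. The sketch below is not taken from the paper: it assumes a variable-gain AF relay, a direct satellite-to-user branch plus a satellite-to-UAV-to-user branch, and maximal-ratio combining of independent branches. All function names are hypothetical, and SNRs are in linear scale unless converted.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from decibels to linear scale."""
    return 10.0 ** (value_db / 10.0)


def linear_to_db(value):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(value)


def af_relay_snr(snr_first_hop, snr_second_hop):
    """End-to-end SNR of a two-hop variable-gain amplify-and-forward link.

    Standard closed form: g1 * g2 / (g1 + g2 + 1), where g1 and g2 are the
    per-hop SNRs. The relay amplifies noise too, so the result is always
    below the weaker hop.
    """
    return (snr_first_hop * snr_second_hop) / (snr_first_hop + snr_second_hop + 1.0)


def mrc_combined_snr(snr_direct, snr_relayed):
    """Maximal-ratio combining of two independent branches adds their SNRs."""
    return snr_direct + snr_relayed


def user_snr_db(direct_db, first_hop_db, second_hop_db):
    """Combined SNR at the user in dB for the assumed two-branch setup."""
    relayed = af_relay_snr(db_to_linear(first_hop_db), db_to_linear(second_hop_db))
    return linear_to_db(mrc_combined_snr(db_to_linear(direct_db), relayed))
```

Because the relayed branch contributes a strictly positive SNR term, `user_snr_db` always exceeds the direct-link SNR under these assumptions, which is the qualitative effect the abstract attributes to cooperative diversity.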
Stabilization of Predator-Prey Age-Structured Hyperbolic PDE when Harvesting both Species is Inevitable
Populations do not only interact over time but also age over time. It is therefore common to model them as age-structured PDEs, where age is the space variable. Since the models also involve integrals over age, both in the birth process and in the interaction among species, they are in fact integro-partial differential equations (IPDEs) with positive states. To regulate the population densities to desired profiles, harvesting is used as input. But non-discriminating harvesting, where wanting to repress one species will inevitably repress the other species as well, the positivity restriction on the input (no insertion of population), and the multiplicative nature of harvesting make control challenging even for ODE versions of such dynamics, let alone for their IPDE versions on an infinite-dimensional nonnegative state space. We introduce a design for a benchmark version of such a problem: a two-population predator-prey setup. The model is equivalent to two coupled ordinary differential equations (ODEs), actuated by harvesting which must not drop below zero, and strongly disturbed by two autonomous but exponentially stable integral delay equations (IDEs). We develop two control designs. With a modified Volterra-like control Lyapunov function, we design a simple feedback which employs possibly negative harvesting for global stabilization of the ODE model, while guaranteeing regional regulation with positive harvesting. With a more sophisticated, restrained controller we achieve regulation for the ODE model globally, with positive harvesting. For the full IPDE model, with the IDE dynamics acting as large disturbances, for both the simple and saturated feedback laws we provide explicit estimates of the regions of attraction. The paper charts a new pathway for control designs for infinite-dimensional multi-species dynamics and for nonlinear positive systems with positive controls.
comment: submitted to IEEE Transactions on Automatic Control
A Hybrid Renewable-Battery-Electrolyzer Facility under the Single Imbalance Pricing Scheme
European energy markets are decentralized and make each market player responsible for its own balance. This stresses the importance of imbalance management of renewable energy sources (RES), as the imbalance payments can strongly reduce their profitability. According to the EU Electricity Balancing Guideline, each European transmission system operator should use the single imbalance pricing method, which treats both deviation directions the same, no matter if a deviation helps the system or pushes it away from the balance. This paper aims to investigate the behavior of a hybrid facility consisting of an uncontrollable RES, a battery and an electrolyzer under such a market setting. The formulated mathematical model of the hybrid facility seeks to maximize profit in the day-ahead energy market, while minimizing the imbalance costs. Uncertainty of the RES output is captured using stochastic scenarios, while the direction of the power system deviation, relevant for the imbalance pricing, is modeled using a newly proposed robust approach. Results of the case study indicate that the single imbalance pricing scheme might tempt flexible assets into intentional deviations should they anticipate favorable imbalance prices.
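The incentive mentioned at the end of this abstract can be seen from the settlement arithmetic. The following is a simplified sketch, not the paper's model: it assumes one settlement period, a single imbalance price applied to the signed deviation from the day-ahead schedule, and no fees or penalties. The function name and all prices and quantities are hypothetical.

```python
def settlement(day_ahead_price, scheduled_mwh, delivered_mwh, imbalance_price):
    """Revenue for one settlement period under single imbalance pricing.

    The day-ahead schedule is paid at the day-ahead price. The deviation
    (delivered minus scheduled) is settled at one imbalance price,
    whatever its sign: a surplus is paid at that price, and a shortfall
    is charged at that price.
    """
    day_ahead_revenue = day_ahead_price * scheduled_mwh
    imbalance_mwh = delivered_mwh - scheduled_mwh
    imbalance_revenue = imbalance_price * imbalance_mwh
    return day_ahead_revenue + imbalance_revenue


# Hypothetical numbers: schedule 10 MWh at 50 per MWh.
on_schedule = settlement(50.0, 10.0, 10.0, 80.0)    # no deviation
over_delivery = settlement(50.0, 10.0, 12.0, 80.0)  # surplus paid at 80
under_delivery = settlement(50.0, 10.0, 8.0, 20.0)  # shortfall bought back at 20
```

With these numbers, over-delivering when the imbalance price is above the day-ahead price earns 660 instead of 500, and under-delivering when the imbalance price is low earns 460 for only 8 MWh delivered, versus 400 had those 8 MWh been sold day-ahead. A facility with a battery or electrolyzer can shift its output to exploit such gaps, which is the kind of intentional deviation the abstract refers to.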
Safe and High-Performance Learning of Model Predictive Control using Kernel-Based Interpolation
We present a method that allows efficient and safe approximation of model predictive controllers using kernel interpolation. Since the computational complexity of the approximating function scales linearly with the number of data points, we propose to use a scoring function which chooses the most promising data. To further reduce the complexity of the approximation, we restrict our considerations to the set of closed-loop reachable states. That is, the approximating function only has to be accurate within this set. This makes our method especially suited for systems where the set of initial conditions is small. In order to guarantee safety and high performance of the designed approximated controller, we use reachability analysis based on Monte Carlo methods.
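The linear scaling mentioned in this abstract follows from the form of a kernel interpolant, which sums one kernel term per stored data point. The sketch below is a minimal illustration, not the authors' method: it assumes a scalar state, a Gaussian kernel, and precomputed (state, MPC input) pairs, and it omits the scoring function and the reachability analysis. All names and the length scale are hypothetical.

```python
import math


def gaussian_kernel(x, y, length_scale=1.0):
    """Gaussian (RBF) kernel between two scalar states."""
    return math.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(states, inputs, length_scale=1.0):
    """Coefficients c solving K c = u, so the interpolant matches every sample."""
    gram = [[gaussian_kernel(xi, xj, length_scale) for xj in states] for xi in states]
    return solve_linear(gram, inputs)


def evaluate(x, states, coeffs, length_scale=1.0):
    """Approximate control at state x; cost is linear in the number of samples."""
    return sum(c * gaussian_kernel(x, xi, length_scale) for c, xi in zip(coeffs, states))


# Hypothetical samples of an MPC law u = mpc(x) at three distinct states.
sample_states = [-1.0, 0.0, 1.0]
sample_inputs = [0.8, 0.0, -0.8]
coefficients = fit(sample_states, sample_inputs)
```

Each call to `evaluate` touches every stored sample once, so keeping only the most useful samples directly reduces online cost, which is the motivation the abstract gives for selecting data with a scoring function.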
Finite-Time Trajectory Tracking of a Four wheeled Mecanum Mobile Robot
Four-Wheeled Mecanum Robots (FWMRs) can move in any direction on a plane, making them a cornerstone system in modern industrial operations. Despite the extreme maneuverability offered by FWMRs, the practical implementation or real-time simulation of Mecanum wheel robots encounters substantial challenges in trajectory tracking control. In this research work, we present a finite-time control law using the backstepping technique to achieve stabilization and trajectory tracking objectives for an FWMR system. A rigorous stability proof is presented and an explicit computation of the finite time is provided. For the tracking objective, we demonstrate the results on an S-shaped trajectory motivated by collision avoidance applications. Real-time simulation validation using Gazebo-ROS on a Mecanum robot model is carried out and agrees with the theoretical results.
Non-overshooting output shaping for switched linear systems under arbitrary switching using eigenstructure assignment
We consider the analytical control design for a pair of switched linear multiple-input multiple-output (MIMO) systems that are subject to arbitrary switching signals. A state feedback controller design method is proposed to obtain an eigenstructure assignment that ensures that the closed-loop switched system is globally asymptotically stable, and the outputs achieve the non-overshooting tracking of a step reference. Our analysis indicates whether non-overshooting or even monotonic tracking is achievable for the given system and considered outputs and provides a choice of possible eigenstructures to be assigned to the constituent subsystems. We derive a structural condition that verifies the feasibility of the chosen assignment. A constructive algorithm to obtain suitable feedback matrices is provided, and the method is illustrated with numerical examples.
Data-informed modeling of the formation, persistence, and evolution of social norms and conventions
Social norms and conventions are commonly accepted and adopted behaviors and practices within a social group that guide interactions -- e.g., how to spell a word or how to greet people -- and are central to a group's culture and identity. Understanding the key mechanisms that govern the formation, persistence, and evolution of social norms and conventions in social communities is a problem of paramount importance for a broad range of real-world applications, spanning from preparedness for future emergencies to promotion of sustainable practices. In the past decades, mathematical modeling has emerged as a powerful tool to reproduce and study the complex dynamics of norm and convention change, gaining insights into their mechanisms, and ultimately deriving tools to predict their evolution. The first goal of this chapter is to introduce some of the main mathematical approaches for modeling social norms and conventions, including population models and agent-based models relying on the theories of dynamical systems, evolutionary dynamics, and game theory. The second goal of the chapter is to illustrate how quantitative observations and empirical data can be incorporated into these mathematical models in a systematic manner, establishing a data-based approach to mathematical modeling of formation, persistence, and evolution of social norms and conventions. Finally, current challenges and future opportunities in this growing field of research are discussed.
comment: This is an author's (preprint) version of a book chapter that is part of the Handbook of Visual, Experimental and Computational Mathematics - Bridges through Data
A data-driven approach for safety quantification of non-linear stochastic systems with unknown additive noise distribution
In this paper, we present a novel data-driven approach to quantify safety for non-linear, discrete-time stochastic systems with unknown noise distribution. We define safety as the probability that the system remains in a given region of the state space for a given time horizon and, to quantify it, we present an approach based on Stochastic Barrier Functions (SBFs). In particular, we introduce an inner approximation of the stochastic program to design an SBF in terms of a chance-constrained optimisation problem, which allows us to leverage the scenario approach theory to design an SBF from samples of the system with Probably Approximately Correct (PAC) guarantees. Our approach leads to tractable, robust linear programs, which enable us to assert safety for non-linear models that were otherwise deemed infeasible with existing methods. To further mitigate the computational complexity of our approach, we exploit the structure of the system dynamics and rely on spatial data structures to accelerate the construction and solution of the underlying optimisation problem. We show the efficacy and validity of our framework in several benchmarks, showing that our approach can obtain substantially tighter certificates compared to the state of the art with a confidence that is several orders of magnitude higher.
Variations in Multi-Agent Actor-Critic Frameworks for Joint Optimizations in UAV Swarm Networks: Recent Evolution, Challenges, and Directions
Autonomous unmanned aerial vehicle (UAV) swarm networks (UAVSNs) can effectively execute surveillance, connectivity, and computing services for ground users (GUs). These missions require trajectory planning, UAV-GU association, task offloading, next-hop selection, and allocation of resources such as transmit power, bandwidth, caching, and computing to improve network performance. Owing to the highly dynamic topology, limited resources, and non-availability of global knowledge, optimizing network performance in UAVSNs is very intricate. Hence, it requires an adaptive joint optimization framework that can tackle both discrete and continuous decision variables to ensure optimal network performance under dynamic constraints. A multi-agent deep reinforcement learning-based adaptive actor-critic framework can efficiently address these problems. This paper investigates the recent evolution of actor-critic frameworks for joint optimization problems in UAVSNs. In addition, challenges and potential solutions are addressed as research directions.
Two Birds With One Stone: Enhancing Communication and Sensing via Multi-Functional RIS
In this article, we propose new network architectures that integrate multi-functional reconfigurable intelligent surfaces (MF-RISs) into 6G networks to enhance both communication and sensing capabilities. Firstly, we elaborate on how to leverage MF-RISs for improving communication performance in different communication modes including unicast, multicast, and broadcast and for different multi-access schemes. Next, we emphasize the synergistic benefits of integrating MF-RISs with wireless sensing, enabling more accurate and efficient target detection in 6G networks. Furthermore, we present two schemes that utilize MF-RISs to enhance the performance of integrated sensing and communication (ISAC). We also study multi-objective optimization to achieve the optimal trade-off between communication and sensing performance. Finally, we present numerical results to show the performance improvements offered by MF-RISs compared to conventional RISs in ISAC. We also outline key research directions for MF-RIS under the ambition of 6G.
comment: 8 pages, 5 figures, submitted to IEEE
MPC-guided, Data-driven Fuzzy Controller Synthesis
Model predictive control (MPC) is a powerful control technique for online optimization using system model-based predictions over a finite time horizon. However, the computational cost MPC requires can be prohibitive in resource-constrained computer systems. This paper presents a fuzzy controller synthesis framework guided by MPC. In the proposed framework, training data is obtained from MPC closed-loop simulations and is used to optimize a low computational complexity controller to emulate the response of MPC. In particular, autoregressive moving average (ARMA) controllers are trained using data obtained from MPC closed-loop simulations, such that each ARMA controller emulates the response of the MPC controller under particular desired conditions. Using a Takagi-Sugeno (T-S) fuzzy system, the responses of all the trained ARMA controllers are then weighted depending on the measured system conditions, resulting in the Fuzzy-Autoregressive Moving Average (F-ARMA) controller. The effectiveness of the trained F-ARMA controllers is illustrated via numerical examples.
comment: 8 pages, 8 figures, submitted to the American Control Conference 2025
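The blending step described in this abstract, where several trained ARMA controllers are weighted by a Takagi-Sugeno fuzzy system, can be sketched as a normalized weighted sum. This is only an illustration under assumptions, not the paper's F-ARMA design: it assumes a scalar scheduling variable, triangular membership functions, scalar controllers that share one history of past outputs and errors, and hand-picked coefficients in place of trained ones. All names are hypothetical.

```python
def arma_output(a_coeffs, b_coeffs, past_outputs, past_errors):
    """ARMA control law: u_k = sum_i a_i * u_{k-i} + sum_j b_j * e_{k-j}.

    Both history lists are ordered most recent first.
    """
    autoregressive = sum(a * u for a, u in zip(a_coeffs, past_outputs))
    moving_average = sum(b * e for b, e in zip(b_coeffs, past_errors))
    return autoregressive + moving_average


def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_blend(scheduling_value, rules, past_outputs, past_errors):
    """Takagi-Sugeno style output: membership-weighted average of the
    individual ARMA controller outputs."""
    weights = [triangular(scheduling_value, *rule["membership"]) for rule in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("scheduling value is outside every membership function")
    outputs = [
        arma_output(rule["a"], rule["b"], past_outputs, past_errors) for rule in rules
    ]
    return sum(w * u for w, u in zip(weights, outputs)) / total


# Two hypothetical controllers, each tuned for a different operating condition.
example_rules = [
    {"membership": (-1.0, 0.0, 1.0), "a": [0.5], "b": [1.0]},
    {"membership": (0.0, 1.0, 2.0), "a": [0.0], "b": [3.0]},
]
```

As the scheduling value moves from one membership peak to the next, the blended output moves smoothly from one controller's response to the other's, while each evaluation costs only a few multiply-adds, which matches the low-complexity motivation in the abstract.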
BiC-MPPI: Goal-Pursuing, Sampling-Based Bidirectional Rollout Clustering Path Integral for Trajectory Optimization
This paper introduces the Bidirectional Clustered MPPI (BiC-MPPI) algorithm, a novel trajectory optimization method aimed at enhancing goal-directed guidance within the Model Predictive Path Integral (MPPI) framework. BiC-MPPI incorporates bidirectional dynamics approximations and a new guide cost mechanism, improving both trajectory planning and goal-reaching performance. By leveraging forward and backward rollouts, the bidirectional approach ensures effective trajectory connections between initial and terminal states, while the guide cost helps discover dynamically feasible paths. Experimental results demonstrate that BiC-MPPI outperforms existing MPPI variants in both 2D and 3D environments, achieving higher success rates and competitive computation times across 900 simulations on a modified BARN dataset for autonomous navigation. GitHub: https://github.com/i-ASL/BiC-MPPI
comment: 7 pages, 1 figure
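For readers unfamiliar with the underlying MPPI framework, its core step is a cost-weighted average of sampled control perturbations. The sketch below shows that standard update only; it does not implement the bidirectional rollouts, clustering, or guide cost of BiC-MPPI. It assumes scalar controls, rollout costs that have already been computed, and a hypothetical `temperature` parameter.

```python
import math


def mppi_weights(costs, temperature):
    """Softmin weights over rollouts: lower cost gives higher weight.

    Subtracting the best cost before exponentiating keeps the
    computation numerically stable without changing the result.
    """
    best = min(costs)
    unnormalized = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


def mppi_update(nominal_controls, perturbations, costs, temperature):
    """Shift the nominal control sequence by the weighted mean perturbation.

    nominal_controls: list of scalar controls over the horizon.
    perturbations:    one list per rollout, same length as the horizon.
    costs:            one total trajectory cost per rollout.
    """
    weights = mppi_weights(costs, temperature)
    horizon = len(nominal_controls)
    return [
        nominal_controls[t] + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t in range(horizon)
    ]
```

A small `temperature` concentrates nearly all weight on the cheapest rollout, while a large one approaches a plain average of the samples.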
Cost-Effective Cyber-Physical System Prototype for Precision Agriculture with a Focus on Crop Growth SP 2024
In precision agriculture, integrating advanced technologies is crucial for optimizing plant growth and health monitoring. Cyber-physical system (CPS) platforms tailored to specific agricultural environments have emerged, but the diversity of these environments poses challenges in developing adaptive CPS platforms. This paper explores rapid prototyping methods to address these challenges, focusing on non-destructive techniques for estimating plant growth. We present a CPS prototype that combines sensors, microcontrollers, digital image processing, and predictive modeling to measure leaf area and biomass accumulation in hydroponic environments. Our results show that the prototype effectively monitors and predicts plant growth, highlighting the potential of rapid CPS prototyping in promoting sustainability and improving crop yields at a moderate cost of hardware.
comment: To appear in Proceedings of the 35th IEEE International Workshop on Rapid System Prototyping (RSP 2024)
Efficient Coordination for Distributed Discrete-Event Systems
Timing control while preserving determinism is often a key requirement for ensuring the safety and correctness of distributed cyber-physical systems (CPS). Discrete-event (DE) systems provide a suitable model of computation (MoC) for time-sensitive distributed CPS. The high-level architecture (HLA) is a useful tool for the distributed simulation of DE systems, but its techniques can be adapted for implementing distributed CPS. However, HLA incurs considerable overhead in network messages conveying timing information between the distributed nodes and the centralized run-time infrastructure (RTI). This paper gives a novel approach and implementation that reduces such network messages while preserving DE semantics. An evaluation of our runtime demonstrates that our approach significantly reduces the volume of messages for timing information in HLA.
comment: To appear in Proceedings of the 22nd ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE'24)
Simulating the blood transfusion system in Kenya: Modelling methods and exploratory analyses
The process of collecting blood from donors and making it available for transfusion requires a complex series of operations involving multiple actors and resources at each step. Ensuring hospitals receive adequate and safe blood for transfusion is a common challenge across low- and middle-income countries, but is rarely addressed from a system level. This paper presents the first use of discrete event simulation to study the blood system in Kenya and to explore the effect of variations and perturbations at different steps of the system on meeting patient blood demand. A process map of the Kenyan blood system was developed to capture critical steps from blood donation to transfusion using interviews with blood bank, hospital, and laboratory personnel at four public hospitals across three counties in Kenya. The blood system was simulated starting with blood collection, a blood bank where blood is tested and stored before it is issued, a major hospital attached to the blood bank, and several smaller hospitals served by the same blood bank. Values for supply-side parameters were based mainly on expert opinion; demand-side parameters were based on data from blood requisitions made in hospital wards, and dispatch of blood from the hospital laboratory. Illustrative examples demonstrate how the model can be used to explore the impacts of changes in blood collection (e.g., prioritising different donor types), blood demand (e.g., differing clinical case mix), and blood distribution (e.g., restocking strategies) on meeting demand at patient level. The model can reveal potential process impediments in the blood system and aid in choosing strategies for improving blood collection, distribution or use. Such a systems approach allows for interventions at different steps in the blood continuum to be tested on blood availability for different patients presenting at diverse hospitals across the country.
comment: 38 pages, 8 figures
A Rapid Trajectory Optimization and Control Framework for Resource-Constrained Applications
This paper presents a computationally efficient model predictive control formulation that uses an integral Chebyshev collocation method to enable rapid operations of autonomous agents. By posing the finite-horizon optimal control problem with recursive re-evaluation of the optimal trajectories, the minimization of the L2 norms of the state and control errors is transcribed into a quadratic program. Control and state variable constraints are parameterized using Chebyshev polynomials and are accommodated in the optimal trajectory generation programs to incorporate the actuator limits and keep-out constraints. Differentiable collision detection of polytopes is leveraged for optimal collision avoidance. Results obtained from the collocation methods are benchmarked against the existing approaches on an edge computer to outline the performance improvements. Finally, collaborative control scenarios involving multi-agent space systems are considered to demonstrate the technical merits of the proposed work.
comment: This work has been submitted to the IEEE ACC 2025 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Learning responsibility allocations for multi-agent interactions: A differentiable optimization approach with control barrier functions
From autonomous driving to package delivery, ensuring safe yet efficient multi-agent interaction is challenging as the interaction dynamics are influenced by hard-to-model factors such as social norms and contextual cues. Understanding these influences can aid in the design and evaluation of socially-aware autonomous agents whose behaviors are aligned with human values. In this work, we seek to codify factors governing safe multi-agent interactions via the lens of responsibility, i.e., an agent's willingness to deviate from their desired control to accommodate safe interaction with others. Specifically, we propose a data-driven modeling approach based on control barrier functions and differentiable optimization that efficiently learns agents' responsibility allocation from data. We demonstrate on synthetic and real-world datasets that we can obtain an interpretable and quantitative understanding of how much agents adjust their behavior to ensure the safety of others given their current environment.
comment: 8 pages, 7 figures
Optimal Attitude Control of Large Flexible Space Structures with Distributed Momentum Actuators
Recent spacecraft mission concepts propose larger payloads that have lighter, less rigid structures. For large lightweight structures, the natural frequencies of their vibration modes may fall within the attitude controller bandwidth, threatening the stability and settling time of the controller and compromising performance. This work tackles this issue by proposing an attitude control design paradigm of distributing momentum actuators throughout the structure to have more control authority over vibration modes. The issue of jitter disturbances introduced by these actuators is addressed by expanding the bandwidth of the attitude controller to suppress excess vibrations. Numerical simulation results show that, at the expense of more control action, a distributed configuration can achieve lower settling times and reduce structural deformation compared to a more standard centralized configuration.
comment: 10 pages, 9 figures
Fabrication-Aware Inverse Design For Shape Optimization
Inverse design (ID) is a computational method that systematically explores a design space to find optimal device geometries based on specific performance criteria. In silicon photonics, ID often leads to devices with design features that degrade significantly due to the fabrication process, limiting the applicability of these devices in scalable silicon photonic fabrication. We demonstrate a solution to this performance degradation through fabrication-aware inverse design (FAID), integrating lithography models for deep-ultraviolet (DUV) lithography and electron beam lithography (EBL) into the shape optimization approach of ID. A Y-branch and an SWG-to-strip converter were generated and fabricated with this new approach. Simulated and measured results verify that the FAID yields devices with up to 0.6 dB lower insertion loss per device. The modified workflow enables designers to use ID to generate devices that adjust for process bias predicted by lithography models.
comment: 4 pages
A neural network-based approach to hybrid systems identification for control
We consider the problem of designing a machine learning-based model of an unknown dynamical system from a finite number of (state-input)-successor state data points, such that the model obtained is also suitable for optimal control design. We adopt a neural network (NN) architecture that, once suitably trained, yields a hybrid system with continuous piecewise-affine (PWA) dynamics that is differentiable with respect to the network's parameters, thereby enabling the use of derivative-based training procedures. We show that a careful choice of our NN's weights produces a hybrid system model with structural properties that are highly favorable when used as part of a finite horizon optimal control problem (OCP). Specifically, we rely on available results to establish that optimal solutions with strong local optimality guarantees can be computed via nonlinear programming (NLP), in contrast to classical OCPs for general hybrid systems which typically require mixed-integer optimization. Besides being well-suited for optimal control design, numerical simulations illustrate that our NN-based technique enjoys very similar performance to state-of-the-art system identification methods for hybrid systems and it is competitive on nonlinear benchmarks.
The Brain-Inspired Cooperative Shared Control Framework for Brain-Machine Interface
In brain-machine interface (BMI) applications, a key challenge is the low information content and high noise level in neural signals, severely affecting stable robotic control. To address this challenge, we propose a cooperative shared control framework based on brain-inspired intelligence, where control signals are decoded from neural activity, and the robot handles the fine control. This allows for a combination of flexible and adaptive interaction control between the robot and the brain, making intricate human-robot collaboration feasible. The proposed framework utilizes spiking neural networks (SNNs) for controlling robotic arm and wheel, including speed and steering. While full integration of the system remains a future goal, individual modules for robotic arm control, object tracking, and map generation have been successfully implemented. The framework is expected to significantly enhance the performance of BMI. In practical settings, the BMI with cooperative shared control, utilizing a brain-inspired algorithm, will greatly enhance the potential for clinical applications.
comment: This article needs to be updated with the corrected figure and content
Angular Spread Statistics for 6.75 GHz FR1(C) and 16.95 GHz FR3 Mid-Band Frequencies in an Indoor Hotspot Environment
We present detailed multipath propagation spatial statistics for next-generation wireless systems operating at lower and upper mid-band frequencies spanning 6--24 GHz. The large-scale spatial characteristics of the wireless channel include Azimuth angular Spread of Departure (ASD) and Zenith angular Spread of Departure (ZSD) of multipath components (MPC) from a transmitter and the Azimuth angular Spread of Arrival (ASA) and Zenith angular Spread of Arrival (ZSA) at a receiver. The angular statistics calculated from measurements were compared with industry-standard 3GPP models, and ASD and ASA values were found to be in close agreement at both 6.75 GHz and 16.95 GHz. Measured LOS ASD was found larger than 3GPP ASD indicating more diverse MPC departure directions in the azimuth. ZSA and ZSD were observed smaller than the 3GPP modeling results as most multipath arrivals and departures during measurements were recorded at the boresight antenna elevation. The wide angular spreads indicate a multipath-rich spatial propagation at 6.75 GHz and 16.95 GHz, showing greater promise for the implementation of MIMO beamforming systems in the mid-band spectrum.
comment: 6 pages, 3 figures, 1 table, IEEE Wireless Communications and Networking Conference
The Power-Oriented Graphs Modeling Technique: From the Fundamental Principles to the Systematic, Step-by-Step Modeling of Complex Physical Systems
Modeling physical systems is an essential skill for a control engineer, since it enables a deep understanding of their dynamic behavior and, consequently, the development of effective control strategies. The first part of this article provides a tutorial description of the fundamental principles and properties of the Power-Oriented Graphs (POG) modeling technique. Various case studies in different energetic domains are then presented to consolidate the fundamental principles, each highlighting different features of the POG modeling technique. The latter is then compared with the other two main graphical modeling techniques available in the literature, namely Bond Graph (BG) and Energetic Macroscopic Representation (EMR). The second part of this article assumes once again a tutorial nature, in order to introduce the new Fast Modeling POG (FMPOG) procedure. The FMPOG, which operates in the POG framework, is a methodical step-by-step procedure that enables the readers to quickly derive the power-oriented graphical model of physical systems starting from their schematics. From the power-oriented graphical model, the state-space model can then be directly determined. To ensure the FMPOG procedure is easily usable by the entire community, we apply it to three examples in different energetic domains in this article, guiding the reader step-by-step through the derivation of the physical systems models. A freely available Matlab/Simulink program is provided in a repository, allowing the users to automatically apply the FMPOG procedure to various classes of physical systems. This program converts the physical systems schematics into the corresponding POG block schemes and, ultimately, into the state-space mathematical models.
IC3M: In-Car Multimodal Multi-object Monitoring for Abnormal Status of Both Driver and Passengers
Recently, in-car monitoring has emerged as a promising technology for detecting early-stage abnormal status of the driver and providing timely alerts to prevent traffic accidents. Although training models with multimodal data enhances the reliability of abnormal status detection, the scarcity of labeled data and the imbalance of class distribution impede the extraction of critical abnormal state features, significantly deteriorating training performance. Furthermore, missing modalities due to environment and hardware limitations further exacerbate the challenge of abnormal status identification. More importantly, monitoring abnormal health conditions of passengers, particularly in elderly care, is of paramount importance but remains underexplored. To address these challenges, we introduce our IC3M, an efficient camera-rotation-based multimodal framework for monitoring both driver and passengers in a car. Our IC3M comprises two key modules: an adaptive threshold pseudo-labeling strategy and a missing modality reconstruction. The former customizes pseudo-labeling thresholds for different classes based on the class distribution, generating class-balanced pseudo labels to guide model training effectively, while the latter leverages cross-modality relationships learned from limited labels to accurately recover missing modalities by distribution transferring from available modalities. Extensive experimental results demonstrate that IC3M outperforms state-of-the-art benchmarks in accuracy, precision, and recall while exhibiting superior robustness under limited labeled data and severe missing modality.
comment: 16 pages, 17 figures
Predictability and Fairness in Load Aggregation with Deadband
Virtual power plants and load aggregation are becoming increasingly common. There, one regulates the aggregate power output of an ensemble of distributed energy resources (DERs). Marecek et al. [Automatica, Volume 147, January 2023, 110743, arXiv:2110.03001] recently suggested that long-term averages of prices or incentives offered should exist and be independent of the initial states of the operators of the DER, the aggregator, and the power grid. This can be seen as predictability, which underlies fairness. Unfortunately, the existence of such averages cannot be guaranteed with many traditional regulators, including the proportional-integral (PI) regulator with or without deadband. Here, we consider the effects of losses in the alternating current model and the deadband in the controller. This yields a non-linear dynamical system (due to the non-linear losses) exhibiting discontinuities (due to the deadband). We show that Filippov invariant measures enable reasoning about predictability and fairness while considering non-linearity of the alternating-current model and deadband.
comment: This proves ergodic properties superficially similar to arXiv:2110.03001, but for discontinuous dynamical systems, rather than continuous dynamical systems
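The PI regulator with deadband mentioned in the abstract above can be illustrated with a minimal discrete-time sketch. The class name, gains, band width, and time step are hypothetical and are not taken from the paper. The deadband here passes the error through unchanged outside the band and zeroes it inside, which is one common convention and is assumed here because it produces a jump at the band edge, matching the discontinuity the abstract describes.

```python
def deadband(error, width):
    """Zero the error inside [-width, width]; pass it through unchanged outside.

    This pass-through variant jumps from 0 to +/-width at the band edge,
    so the regulator output is discontinuous in the error.
    """
    return error if abs(error) > width else 0.0


class PIDeadband:
    """Discrete-time PI regulator acting on a deadbanded error (illustrative only)."""

    def __init__(self, kp, ki, width, dt):
        self.kp = kp          # proportional gain (hypothetical value chosen by the caller)
        self.ki = ki          # integral gain (hypothetical value chosen by the caller)
        self.width = width    # half-width of the deadband
        self.dt = dt          # sampling period
        self.integral = 0.0   # accumulated deadbanded error

    def update(self, setpoint, measurement):
        """Return the control signal for one sampling step."""
        e = deadband(setpoint - measurement, self.width)
        self.integral += e * self.dt
        return self.kp * e + self.ki * self.integral


if __name__ == "__main__":
    ctrl = PIDeadband(kp=2.0, ki=0.5, width=0.1, dt=1.0)
    print(ctrl.update(1.0, 0.95))  # error lies inside the band
    print(ctrl.update(1.0, 0.5))   # error lies outside the band
```

Because the integrator only accumulates the deadbanded error, the regulator's internal state depends on the history of band crossings, which gives some intuition for why long-run averages may depend on initial conditions.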
AI-Native Network Digital Twin for Intelligent Network Management in 6G
As a pivotal virtualization technology, network digital twin is expected to accurately reflect real-time status and abstract features in the on-going sixth generation (6G) networks. In this article, we propose an artificial intelligence (AI)-native network digital twin framework for 6G networks to enable the synergy of AI and network digital twin, thereby facilitating intelligent network management. In the proposed framework, AI models are utilized to establish network digital twin models to facilitate network status prediction, network pattern abstraction, and network management decision-making. Furthermore, potential solutions are proposed to enhance the performance of network digital twin. Finally, a case study is presented, followed by a discussion of open research issues that are essential for AI-native network digital twin in 6G networks.
comment: This article is submitted to IEEE Wireless Communications
Peer-to-Peer Energy Trading of Solar and Energy Storage: A Networked Multiagent Reinforcement Learning Approach
Utilizing distributed renewable and energy storage resources in local distribution networks via peer-to-peer (P2P) energy trading has long been touted as a solution to improve energy systems' resilience and sustainability. Consumers and prosumers (those who have energy generation resources), however, do not have the expertise to engage in repeated P2P trading, and the zero-marginal costs of renewables present challenges in determining fair market prices. To address these issues, we propose multi-agent reinforcement learning (MARL) frameworks to help automate consumers' bidding and management of their solar PV and energy storage resources, under a specific P2P clearing mechanism that utilizes the so-called supply-demand ratio. In addition, we show how the MARL frameworks can integrate physical network constraints to realize voltage control, hence ensuring physical feasibility of the P2P energy trading and paving way for real-world implementations.
Safety Margins for Reinforcement Learning
Any autonomous controller will be unsafe in some situations. The ability to quantitatively identify when these unsafe situations are about to occur is crucial for drawing timely human oversight in, e.g., freight transportation applications. In this work, we demonstrate that the true criticality of an agent's situation can be robustly defined as the mean reduction in reward given some number of random actions. Proxy criticality metrics that are computable in real-time (i.e., without actually simulating the effects of random actions) can be compared to the true criticality, and we show how to leverage these proxy metrics to generate safety margins, which directly tie the consequences of potentially incorrect actions to an anticipated loss in overall performance. We evaluate our approach on learned policies from APE-X and A3C within an Atari environment, and demonstrate how safety margins decrease as agents approach failure states. The integration of safety margins into programs for monitoring deployed agents allows for the real-time identification of potentially catastrophic situations.
comment: 2 pages, 2 figures. Presented at the 2023 IEEE Conference on Artificial Intelligence (CAI), Santa Clara, CA
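The definition of true criticality in the abstract above (the mean reduction in reward given some number of random actions) can be illustrated with a minimal sketch. The corridor environment, the fixed policy, and every function name below are hypothetical stand-ins chosen for illustration. They are not the APE-X or A3C Atari setup used in the paper, and the sketch does not cover the proxy metrics.

```python
import random

# Hypothetical toy environment: positions 0..GOAL on a line.
# Position 0 is a failure state; position GOAL is the target.
GOAL = 6
ACTIONS = (-1, +1)


def step(pos, action):
    """Apply one action and return (new_pos, reward, done)."""
    new_pos = pos + action
    if new_pos <= 0:
        return 0, -10.0, True      # failure
    if new_pos >= GOAL:
        return GOAL, 10.0, True    # success
    return new_pos, -0.1, False    # small cost per step


def policy(pos):
    """Stand-in for a learned policy: always move toward the goal."""
    return +1


def rollout(pos, n_random, rng, max_steps=50):
    """Total reward when the first n_random actions are random, then the policy acts."""
    total = 0.0
    for t in range(max_steps):
        action = rng.choice(ACTIONS) if t < n_random else policy(pos)
        pos, reward, done = step(pos, action)
        total += reward
        if done:
            break
    return total


def true_criticality(pos, n_random, n_samples=2000, seed=0):
    """Mean reduction in reward caused by n_random random actions from state pos."""
    rng = random.Random(seed)
    baseline = rollout(pos, 0, rng)
    perturbed = sum(rollout(pos, n_random, rng) for _ in range(n_samples)) / n_samples
    return baseline - perturbed


if __name__ == "__main__":
    for start in (1, 4):
        print(start, true_criticality(start, n_random=2))
```

In this toy setting, a state next to the failure position has a much higher criticality than a state near the goal, which is the qualitative behavior the abstract describes as agents approach failure states.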
LEGO: QEC Decoding System Architecture for Dynamic Circuits
Quantum error correction (QEC) is a critical component of FTQC; the QEC decoder is an important part of Classical Computing for Quantum or C4Q. Recent years have seen fast development in real-time QEC decoders. Existing efforts to build real-time decoders have yet to achieve a critical milestone: decoding dynamic logical circuits with error-corrected readout and feed forward. Achieving this requires significant engineering effort to adapt and reconfigure the decoders during runtime, depending on the branching of the logical circuit. We present a QEC decoder architecture called LEGO, with the ambitious goal of supporting dynamic logical operations. LEGO employs a novel abstraction called the decoding block to describe the decoding problem of a dynamic logical circuit. Moreover, decoding blocks can be combined with three other ideas to improve the efficiency, accuracy and latency of the decoder. First, they provide data and task parallelisms when combined with fusion-based decoding. Second, they can exploit the pipeline parallelism inside multi-stage decoders. Finally, they serve as basic units of work for computational resource management. Using decoding blocks, LEGO can be easily reconfigured to support all QEC settings and to easily accommodate innovations in three interdependent fields: code, logical operations and qubit hardware. In contrast, existing decoders are highly specialized to a specific QEC setting, which leads to redundant research and engineering efforts, slows down innovation, and further fragments the nascent quantum computing industry.
Distributed Optimization of Clique-Wise Coupled Problems via Three-Operator Splitting
This study explores distributed optimization problems with clique-wise coupling via operator splitting and how we can utilize this framework for performance analysis and enhancement. This framework extends beyond conventional pairwise coupled problems (e.g., consensus optimization) and is applicable to broader examples. To this end, we first introduce a new distributed optimization algorithm by leveraging a clique-based matrix and the Davis-Yin splitting (DYS), a versatile three-operator splitting method. We then demonstrate that this approach sheds new light on conventional algorithms in the following way: (i) Existing algorithms (NIDS, Exact diffusion, diffusion, and our previous work) can be derived from our proposed method; (ii) We present a new mixing matrix based on clique-wise coupling, which surfaces when deriving the NIDS. We prove its preferable distribution of eigenvalues, enabling fast consensus; (iii) These observations yield a new linear convergence rate for the NIDS with non-smooth objective functions. Remarkably our linear rate is first established for the general DYS with a projection for a subspace. This case is not covered by any prior results, to our knowledge. Finally, numerical examples showcase the efficacy of our proposed approach.
comment: 32 pages
Robust Lateral Control of a Convoy of Autonomous & Connected Vehicles with Limited Preview
This paper addresses the lateral control of Autonomous & Connected Vehicles (ACVs) convoys during Emergency Lane Change (ELC) maneuvers. These maneuvers are initiated in response to emergency cues from either the front or rear of the convoy, responding to the need to avoid obstacles or facilitate the passage of other vehicles. The primary objective of this study is to develop a lateral control scheme for ACVs based on the available information. The foundational assumption in this study is the existence of reliable connectivity among ACVs, wherein each subsequent ACV possesses information concerning the GPS position traces of both the lead and immediately preceding vehicles within the convoy. This connectivity facilitates the construction of a composite ELC trajectory that synthesizes this information, serving as a "discretized" preview of the trajectory to be tracked. The procedural steps include constructing this composite trajectory, determining cross-track error, heading, and yaw rate errors relative to it, and subsequently formulating a lateral control strategy. Furthermore, the paper presents findings on the lateral string stability of ACV convoys across various scenarios, encompassing changes in longitudinal velocity and scenarios where lead vehicle information is unavailable. Numerical and experimental results validate the efficacy of the proposed lateral control scheme for ACV convoys.
comment: 13 pages, 14 figures
Systems and Control (EESS)
Non-linear Control of the Power Injected Into a Weak Grid by a Self-Synchronized Inverter
In this work, a non-linear controller designed using non-linear transformation linearization and feedback is proposed for an inverter connected to a weak grid through a single-stage inductive filter. The proposed strategy is self-synchronized, so that it is not necessary to have a voltage sensor at the Point of Common Coupling (PCC). The proposed approach adds robustness against a weak grid to a strategy that has already been demonstrated to allow a significant reduction in the size of the DC-link capacitor of the converter. For this purpose, a state observer is designed that allows estimating the voltage at the PCC from the measurement of the output inductor current. A start-up controller is also included, which allows synchronization even in the case of system start-up. Simulation results are presented for different operating cases, including start-up, normal operation, and grid-voltage sags and swells. In all these cases, it is considered that the exact parameters of the grid to which the inverter is connected are unknown.
comment: 8 pages, 5 figures
The Euler-Lagrange equation and optimal control: Preliminary results
Algebraically speaking, linear time-invariant (LTI) systems can be considered as modules. In this framework, controllability is translated as the freeness of the system module. Optimal control mainly relies on quadratic Lagrangians and the consideration of any basis of the system module leads to an open-loop control strategy via a linear Euler-Lagrange equation. In this approach, the endpoint is easily assignable and time horizon can be chosen to minimize the criterion. The loop is closed via an intelligent controller derived from model-free control, which exhibits excellent performances concerning model mismatches and disturbances. The extension to nonlinear systems is briefly discussed.
comment: 12th International Conference on Systems and Control, Batna (Algeria), 3-5 November 2024
An Improved ESO-Based Line-of-Sight Guidance Law for Path Following of Underactuated Autonomous Underwater Helicopter With Nonlinear Tracking Differentiator and Anti-saturation Controller
This paper presents an Improved Extended-state-observer based Line-of-Sight (IELOS) guidance law for path following of an underactuated Autonomous Underwater Helicopter (AUH) utilizing a nonlinear tracking differentiator and anti-saturation controller. Due to the high mobility of the AUH, the classical reduced-order Extended-State-Observer (ESO) struggles to accurately track the sideslip angle, especially when rapid variation occurs. By incorporating the nonlinear tracking differentiator and anti-saturation controller, the IELOS guidance law can precisely track the sideslip angle and mitigate propeller thrust buffet compared to the classical Extended-state-observer based Line-of-Sight (ELOS) guidance law. The performance of ESO is significantly influenced by the bandwidth, with the Improved Extended-State-Observer (IESO) proving effective at low bandwidths where the classical ESO falls short. The paper establishes the input-to-state stability of the closed-loop system. Subsequently, simulation and pool experimental results are showcased to validate the effectiveness of the IELOS guidance law, which outperforms both the Line-of-Sight (LOS) and Adaptive Line-of-Sight (ALOS) guidance laws in terms of performance.
Structure and Control of Biology-inspired Networks
There is increasing interest in developing the theoretical foundations of networked control systems that illuminate how brain networks function so as to enable sensory perception, control of movement, memory and all the operations that are needed for animals to survive. The present paper proposes a biologically inspired network model featuring dynamic connections regulated by Hebbian learning. Drawing on the machinery of graph theory and classical control we show that our novel nonlinear model exhibits such biologically plausible features as bounded evolution, stability, resilience, and a kind of structural stability -- meaning that perturbations of the model parameters leave the essential properties of the model intact. The proposed network model involves generalized cactus graphs with multiple control input nodes, and it is shown that the properties of the network are resilient to various changes in network topology provided these changes preserve the generalized cactus structure. A particular example described in what follows is an idealized network model of the visual system of a macaque monkey. The model displays resilience to network disruptions such as might occur in a living organism due to disease or injury. A different model of the same type provides an example of a system that can perform data classification.
comment: 12 pages
Observability rank conditions for analysing practical identifiability a priori
The concept of identifiability describes the possibility of inferring the parameters of a dynamic model by observing its output. It is common and useful to distinguish between structural and practical identifiability. The former property is fully determined by the model equations, while the latter is also influenced by the characteristics of the available experimental data. Structural identifiability can be determined by means of symbolic computations, which may be performed before collecting experimental data, and are hence sometimes called a priori analyses. Practical identifiability is typically assessed numerically, with methods that require simulations - and often also optimization - and are applied a posteriori. An approach to study structural local identifiability is to consider it as a particular case of observability, which is the possibility of inferring the internal state of a system from its output. Thus, both properties can be analysed jointly, by building a generalized observability matrix and computing its rank. The aim of this paper is to investigate to which extent such observability-based methods can also inform about practical identifiability. To this end, we explore a number of possible extensions of the rank tests, and discuss the purposes for which they can be informative as well as others for which they cannot.
comment: 10 pages, 2 figures
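The rank test mentioned in the abstract above can be illustrated in its simplest form. The sketch below handles only the linear time-invariant special case, where the observability matrix stacks C, CA, ..., CA^(n-1). The generalized observability matrix for nonlinear models, which the abstract refers to, is built from Lie derivatives and is not reproduced here. The helper names and the example matrices are hypothetical, and exact rational arithmetic is used so the rank is not affected by rounding.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(matrix):
    """Rank via Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) for the linear system x' = A x, y = C x."""
    rows, block = [], C
    for _ in range(len(A)):
        rows.extend(block)
        block = matmul(block, A)
    return rows


if __name__ == "__main__":
    C = [[1, 0]]                       # only the first state is measured
    A_coupled = [[0, 1], [-2, -3]]     # second state influences the first
    A_decoupled = [[1, 0], [0, 2]]     # second state never reaches the output
    print(rank(observability_matrix(A_coupled, C)))
    print(rank(observability_matrix(A_decoupled, C)))
```

Full rank (equal to the number of states) means the state can be inferred from the output. A rank deficiency, as in the decoupled example, signals a state that leaves no trace in the measurements.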
Cooperative UAV-Relay based Satellite Aerial Ground Integrated Networks
In the post-fifth generation (5G) era, escalating user quality of service (QoS) demands strain terrestrial network capacity, especially in urban areas with dynamic traffic distributions. This paper introduces a novel cooperative unmanned aerial vehicle relay-based deployment (CUD) framework in satellite air-ground integrated networks (SAGIN). The CUD strategy deploys an unmanned aerial vehicle-based relay (UAVr) in an amplify-and-forward (AF) mode to enhance user QoS when terrestrial base stations fall short of network capacity. By combining low earth orbit (LEO) satellite and UAVr signals using cooperative diversity, the CUD framework enhances the signal to noise ratio (SNR) at the user. Comparative evaluations against existing frameworks reveal performance improvements, demonstrating the effectiveness of the CUD framework in addressing the evolving demands of next-generation networks.
comment: 5 pages, 3 figures, to appear in IEEE 100th Vehicular Technology Conference (VTC2024-Fall)
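The SNR gain from combining a direct signal with an amplify-and-forward relayed signal can be sketched with two textbook formulas. The abstract does not state the combining scheme or the link layout, so the sketch assumes maximal-ratio combining of a direct LEO-to-user link with a two-hop LEO-to-UAVr-to-user link, and a variable-gain AF relay. The function names and the numeric SNR values are hypothetical and are not results from the paper.

```python
import math


def db_to_linear(db):
    """Convert a value in dB to a linear power ratio."""
    return 10 ** (db / 10)


def linear_to_db(ratio):
    """Convert a linear power ratio to dB."""
    return 10 * math.log10(ratio)


def af_relay_snr(snr_first_hop, snr_second_hop):
    """End-to-end SNR of a two-hop variable-gain amplify-and-forward link (linear scale)."""
    return snr_first_hop * snr_second_hop / (snr_first_hop + snr_second_hop + 1.0)


def mrc_combined_snr(snr_direct, snr_relayed):
    """With maximal-ratio combining, the branch SNRs add (linear scale)."""
    return snr_direct + snr_relayed


if __name__ == "__main__":
    # Hypothetical link budgets in dB, chosen only for illustration.
    direct = db_to_linear(3.0)        # LEO satellite -> user
    first_hop = db_to_linear(8.0)     # LEO satellite -> UAV relay
    second_hop = db_to_linear(15.0)   # UAV relay -> user

    relayed = af_relay_snr(first_hop, second_hop)
    combined = mrc_combined_snr(direct, relayed)
    print(f"direct   : {linear_to_db(direct):.2f} dB")
    print(f"relayed  : {linear_to_db(relayed):.2f} dB")
    print(f"combined : {linear_to_db(combined):.2f} dB")
```

The relayed branch is always weaker than its weaker hop, because the relay amplifies its own received noise, yet the combined SNR still exceeds either branch alone under these assumptions.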
Stabilization of Predator-Prey Age-Structured Hyperbolic PDE when Harvesting both Species is Inevitable
Populations do not only interact over time but also age over time. It is therefore common to model them as age-structured PDEs, where age is the space variable. Since the models also involve integrals over age, both in the birth process and in the interaction among species, they are in fact integro-partial differential equations (IPDEs) with positive states. To regulate the population densities to desired profiles, harvesting is used as input. But non-discriminating harvesting, where wanting to repress one species will inevitably repress the other species as well, the positivity restriction on the input (no insertion of population), and the multiplicative nature of harvesting make control challenging even for ODE versions of such dynamics, let alone for their IPDE versions on an infinite-dimensional nonnegative state space. We introduce a design for a benchmark version of such a problem: a two-population predator-prey setup. The model is equivalent to two coupled ordinary differential equations (ODEs), actuated by harvesting which must not drop below zero, and strongly disturbed by two autonomous but exponentially stable integral delay equations (IDEs). We develop two control designs. With a modified Volterra-like control Lyapunov function, we design a simple feedback which employs possibly negative harvesting for global stabilization of the ODE model, while guaranteeing regional regulation with positive harvesting. With a more sophisticated, restrained controller we achieve regulation for the ODE model globally, with positive harvesting. For the full IPDE model, with the IDE dynamics acting as large disturbances, for both the simple and saturated feedback laws we provide explicit estimates of the regions of attraction. The paper charts a new pathway for control designs for infinite-dimensional multi-species dynamics and for nonlinear positive systems with positive controls.
comment: submitted to IEEE Transactions on Automatic Control
A Hybrid Renewable-Battery-Electrolyzer Facility under the Single Imbalance Pricing Scheme
European energy markets are decentralized and entail balance responsibility of each market player. This stresses the importance of imbalance management of renewable energy sources (RES), as the imbalance payments can strongly reduce their profitability. According to the EU Electricity Balancing Guideline, each European transmission system operator should use the single imbalance pricing method which treats both deviation directions the same, no matter if a deviation helps the system or pushes it away from the balance. This paper aims to investigate the behavior of a hybrid facility consisting of an uncontrollable RES, a battery and an electrolyzer under such a market setting. The formulated mathematical model of the hybrid facility seeks to maximize profit in the day-ahead energy market, while minimizing the imbalance costs. Uncertainty of the RES output is captured using stochastic scenarios, while the direction of the power system deviation, relevant for the imbalance pricing, is modeled using a newly proposed robust approach. Results of the case study indicate that the single imbalance pricing scheme might tempt flexible assets into intentional deviations should they anticipate favorable imbalance prices.
Safe and High-Performance Learning of Model Predictive Control using Kernel-Based Interpolation
We present a method that allows efficient and safe approximation of model predictive controllers using kernel interpolation. Since the computational complexity of the approximating function scales linearly with the number of data points, we propose to use a scoring function which chooses the most promising data. To further reduce the complexity of the approximation, we restrict our considerations to the set of closed-loop reachable states. That is, the approximating function only has to be accurate within this set. This makes our method especially suited for systems where the set of initial conditions is small. In order to guarantee safety and high performance of the designed approximated controller, we use reachability analysis based on Monte Carlo methods.
Finite-Time Trajectory Tracking of a Four wheeled Mecanum Mobile Robot
Four-wheeled Mecanum robots (FWMRs) can move in any direction on a plane, making them a cornerstone system in modern industrial operations. Despite the extreme maneuverability offered by FWMRs, the practical implementation or real-time simulation of Mecanum wheel robots encounters substantial challenges in trajectory tracking control. In this research work, we present a finite-time control law using the backstepping technique to achieve stabilization and trajectory tracking objectives for an FWMR system. A rigorous stability proof is presented and an explicit computation of the finite time is provided. For the tracking objective, we demonstrate the results on an S-shaped trajectory inclined towards collision avoidance applications. Real-time simulation validation using Gazebo-ROS on a Mecanum robot model is carried out and agrees with the theoretical results.
Non-overshooting output shaping for switched linear systems under arbitrary switching using eigenstructure assignment
We consider the analytical control design for a pair of switched linear multiple-input multiple-output (MIMO) systems that are subject to arbitrary switching signals. A state feedback controller design method is proposed to obtain an eigenstructure assignment that ensures that the closed-loop switched system is globally asymptotically stable, and the outputs achieve the non-overshooting tracking of a step reference. Our analysis indicates whether non-overshooting or even monotonic tracking is achievable for the given system and considered outputs and provides a choice of possible eigenstructures to be assigned to the constituent subsystems. We derive a structural condition that verifies the feasibility of the chosen assignment. A constructive algorithm to obtain suitable feedback matrices is provided, and the method is illustrated with numerical examples.
Data-informed modeling of the formation, persistence, and evolution of social norms and conventions
Social norms and conventions are commonly accepted and adopted behaviors and practices within a social group that guide interactions -- e.g., how to spell a word or how to greet people -- and are central to a group's culture and identity. Understanding the key mechanisms that govern the formation, persistence, and evolution of social norms and conventions in social communities is a problem of paramount importance for a broad range of real-world applications, spanning from preparedness for future emergencies to promotion of sustainable practices. In the past decades, mathematical modeling has emerged as a powerful tool to reproduce and study the complex dynamics of norm and convention change, gaining insights into their mechanisms, and ultimately deriving tools to predict their evolution. The first goal of this chapter is to introduce some of the main mathematical approaches for modeling social norms and conventions, including population models and agent-based models relying on the theories of dynamical systems, evolutionary dynamics, and game theory. The second goal of the chapter is to illustrate how quantitative observations and empirical data can be incorporated into these mathematical models in a systematic manner, establishing a data-based approach to mathematical modeling of formation, persistence, and evolution of social norms and conventions. Finally, current challenges and future opportunities in this growing field of research are discussed.
comment: This is an author's (preprint) version of a book chapter that is part of the Handbook of Visual, Experimental and Computational Mathematics - Bridges through Data
A data-driven approach for safety quantification of non-linear stochastic systems with unknown additive noise distribution
In this paper, we present a novel data-driven approach to quantify safety for non-linear, discrete-time stochastic systems with unknown noise distribution. We define safety as the probability that the system remains in a given region of the state space for a given time horizon and, to quantify it, we present an approach based on Stochastic Barrier Functions (SBFs). In particular, we introduce an inner approximation of the stochastic program to design an SBF in terms of a chance-constrained optimisation problem, which allows us to leverage the scenario approach theory to design an SBF from samples of the system with Probably Approximately Correct (PAC) guarantees. Our approach leads to tractable, robust linear programs, which enable us to assert safety for non-linear models that were otherwise deemed infeasible with existing methods. To further mitigate the computational complexity of our approach, we exploit the structure of the system dynamics and rely on spatial data structures to accelerate the construction and solution of the underlying optimisation problem. We show the efficacy and validity of our framework in several benchmarks, showing that our approach can obtain substantially tighter certificates compared to the state of the art with a confidence that is several orders of magnitude higher.
Variations in Multi-Agent Actor-Critic Frameworks for Joint Optimizations in UAV Swarm Networks: Recent Evolution, Challenges, and Directions
Autonomous unmanned aerial vehicle (UAV) swarm networks (UAVSNs) can effectively execute surveillance, connectivity, and computing services to ground users (GUs). These missions require trajectory planning, UAV-GUs association, task offloading, next-hop selection, and allocation of resources such as transmit power, bandwidth, caching, and computing to improve network performance. Owing to the highly dynamic topology, limited resources, and non-availability of global knowledge, optimizing network performance in UAVSNs is very intricate. Hence, it requires an adaptive joint optimization framework that can tackle both discrete and continuous decision variables to ensure optimal network performance under dynamic constraints. A multi-agent deep reinforcement learning-based adaptive actor-critic framework can efficiently address these problems. This paper investigates the recent evolutions of actor-critic frameworks to deal with joint optimization problems in UAVSNs. In addition, challenges and potential solutions are addressed as research directions.
Two Birds With One Stone: Enhancing Communication and Sensing via Multi-Functional RIS
In this article, we propose new network architectures that integrate multi-functional reconfigurable intelligent surfaces (MF-RISs) into 6G networks to enhance both communication and sensing capabilities. Firstly, we elaborate how to leverage MF-RISs for improving communication performance in different communication modes including unicast, multicast, and broadcast and for different multi-access schemes. Next, we emphasize synergistic benefits of integrating MF-RISs with wireless sensing, enabling more accurate and efficient target detection in 6G networks. Furthermore, we present two schemes that utilize MF-RISs to enhance the performance of integrated sensing and communication (ISAC). We also study multi-objective optimization to achieve the optimal trade-off between communication and sensing performance. Finally, we present numerical results to show the performance improvements offered by MF-RISs compared to conventional RISs in ISAC. We also outline key research directions for MF-RIS under the ambition of 6G.
comment: 8 pages, 5 figures, submitted to IEEE
MPC-guided, Data-driven Fuzzy Controller Synthesis
Model predictive control (MPC) is a powerful control technique for online optimization using system model-based predictions over a finite time horizon. However, the computational cost MPC requires can be prohibitive in resource-constrained computer systems. This paper presents a fuzzy controller synthesis framework guided by MPC. In the proposed framework, training data is obtained from MPC closed-loop simulations and is used to optimize a low computational complexity controller to emulate the response of MPC. In particular, autoregressive moving average (ARMA) controllers are trained using data obtained from MPC closed-loop simulations, such that each ARMA controller emulates the response of the MPC controller under particular desired conditions. Using a Takagi-Sugeno (T-S) fuzzy system, the responses of all the trained ARMA controllers are then weighted depending on the measured system conditions, resulting in the Fuzzy-Autoregressive Moving Average (F-ARMA) controller. The effectiveness of the trained F-ARMA controllers is illustrated via numerical examples.
comment: 8 pages, 8 figures, submitted to the American Control Conference 2025
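The blending step described in the F-ARMA abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian membership functions, the scheduling variable `z`, the shared input/error history, and all coefficient values are hypothetical (in the paper the ARMA coefficients are trained on MPC closed-loop data, which is not reproduced here).

```python
import math


def arma_output(a, b, past_u, past_e):
    """ARMA law: u_k = sum_i a[i] * u_{k-1-i} + sum_j b[j] * e_{k-j}.

    past_u[0] is the previous input, past_e[0] is the current error.
    """
    ar_part = sum(ai * ui for ai, ui in zip(a, past_u))
    ma_part = sum(bj * ej for bj, ej in zip(b, past_e))
    return ar_part + ma_part


def membership(z, center, width):
    """Gaussian membership function (an assumed shape)."""
    return math.exp(-0.5 * ((z - center) / width) ** 2)


def farma_output(controllers, z, past_u, past_e):
    """Takagi-Sugeno style blend: normalized membership-weighted average."""
    weights = [membership(z, c["center"], c["width"]) for c in controllers]
    outputs = [arma_output(c["a"], c["b"], past_u, past_e) for c in controllers]
    return sum(w * u for w, u in zip(weights, outputs)) / sum(weights)


# Two hypothetical local controllers, each valid near its own operating point.
controllers = [
    {"center": 0.0, "width": 2.0, "a": [0.5], "b": [1.0, 0.2]},
    {"center": 10.0, "width": 2.0, "a": [0.2], "b": [2.0, 0.0]},
]
past_u = [1.0]       # previous control input
past_e = [0.5, 0.1]  # current and previous tracking error

u_mid = farma_output(controllers, 5.0, past_u, past_e)   # equal weights
u_low = farma_output(controllers, 0.0, past_u, past_e)   # first controller dominates
```

At the midpoint between the two operating points both memberships are equal, so the output is the plain average of the two local controllers; near either operating point the corresponding controller dominates.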
BiC-MPPI: Goal-Pursuing, Sampling-Based Bidirectional Rollout Clustering Path Integral for Trajectory Optimization
This paper introduces the Bidirectional Clustered MPPI (BiC-MPPI) algorithm, a novel trajectory optimization method aimed at enhancing goal-directed guidance within the Model Predictive Path Integral (MPPI) framework. BiC-MPPI incorporates bidirectional dynamics approximations and a new guide cost mechanism, improving both trajectory planning and goal-reaching performance. By leveraging forward and backward rollouts, the bidirectional approach ensures effective trajectory connections between initial and terminal states, while the guide cost helps discover dynamically feasible paths. Experimental results demonstrate that BiC-MPPI outperforms existing MPPI variants in both 2D and 3D environments, achieving higher success rates and competitive computation times across 900 simulations on a modified BARN dataset for autonomous navigation. GitHub: https://github.com/i-ASL/BiC-MPPI
comment: 7 pages, 1 figure
Cost-Effective Cyber-Physical System Prototype for Precision Agriculture with a Focus on Crop Growth SP 2024
In precision agriculture, integrating advanced technologies is crucial for optimizing plant growth and health monitoring. Cyber-physical system (CPS) platforms tailored to specific agricultural environments have emerged, but the diversity of these environments poses challenges in developing adaptive CPS platforms. This paper explores rapid prototyping methods to address these challenges, focusing on non-destructive techniques for estimating plant growth. We present a CPS prototype that combines sensors, microcontrollers, digital image processing, and predictive modeling to measure leaf area and biomass accumulation in hydroponic environments. Our results show that the prototype effectively monitors and predicts plant growth, highlighting the potential of rapid CPS prototyping in promoting sustainability and improving crop yields at a moderate cost of hardware.
comment: To appear in Proceedings of the 35th IEEE International Workshop on Rapid System Prototyping (RSP 2024)
Efficient Coordination for Distributed Discrete-Event Systems
Timing control while preserving determinism is often a key requirement for ensuring the safety and correctness of distributed cyber-physical systems (CPS). Discrete-event (DE) systems provide a suitable model of computation (MoC) for time-sensitive distributed CPS. The high-level architecture (HLA) is a useful tool for the distributed simulation of DE systems, but its techniques can be adapted for implementing distributed CPS. However, HLA incurs considerable overhead in network messages conveying timing information between the distributed nodes and the centralized run-time infrastructure (RTI). This paper gives a novel approach and implementation that reduces such network messages while preserving DE semantics. An evaluation of our runtime demonstrates that our approach significantly reduces the volume of messages for timing information in HLA.
comment: To appear in Proceedings of the 22nd ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE'24)
Simulating the blood transfusion system in Kenya: Modelling methods and exploratory analyses
The process of collecting blood from donors and making it available for transfusion requires a complex series of operations involving multiple actors and resources at each step. Ensuring hospitals receive adequate and safe blood for transfusion is a common challenge across low- and middle-income countries, but is rarely addressed from a system level. This paper presents the first use of discrete event simulation to study the blood system in Kenya and to explore the effect of variations and perturbations at different steps of the system on meeting patient blood demand. A process map of the Kenyan blood system was developed to capture critical steps from blood donation to transfusion using interviews with blood bank, hospital, and laboratory personnel at four public hospitals across three counties in Kenya. The blood system was simulated starting with blood collection, a blood bank where blood is tested and stored before it is issued, a major hospital attached to the blood bank, and several smaller hospitals served by the same blood bank. Values for supply-side parameters were based mainly on expert opinion; demand-side parameters were based on data from blood requisitions made in hospital wards, and dispatch of blood from the hospital laboratory. Illustrative examples demonstrate how the model can be used to explore the impacts of changes in blood collection (e.g., prioritising different donor types), blood demand (e.g., differing clinical case mix), and blood distribution (e.g., restocking strategies) on meeting demand at patient level. The model can reveal potential process impediments in the blood system and aid in choosing strategies for improving blood collection, distribution or use. Such a systems approach allows for interventions at different steps in the blood continuum to be tested on blood availability for different patients presenting at diverse hospitals across the country.
comment: 38 pages, 8 figures
A Rapid Trajectory Optimization and Control Framework for Resource-Constrained Applications
This paper presents a computationally efficient model predictive control formulation that uses an integral Chebyshev collocation method to enable rapid operations of autonomous agents. By posing the finite-horizon optimal control problem with recursive re-evaluation of the optimal trajectories, minimization of the L2 norms of the state and control errors is transcribed into a quadratic program. Control and state variable constraints are parameterized using Chebyshev polynomials and are accommodated in the optimal trajectory generation programs to incorporate the actuator limits and keepout constraints. Differentiable collision detection of polytopes is leveraged for optimal collision avoidance. Results obtained from the collocation methods are benchmarked against the existing approaches on an edge computer to outline the performance improvements. Finally, collaborative control scenarios involving multi-agent space systems are considered to demonstrate the technical merits of the proposed work.
comment: This work has been submitted to the IEEE ACC 2025 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Learning responsibility allocations for multi-agent interactions: A differentiable optimization approach with control barrier functions
From autonomous driving to package delivery, ensuring safe yet efficient multi-agent interaction is challenging as the interaction dynamics are influenced by hard-to-model factors such as social norms and contextual cues. Understanding these influences can aid in the design and evaluation of socially-aware autonomous agents whose behaviors are aligned with human values. In this work, we seek to codify factors governing safe multi-agent interactions via the lens of responsibility, i.e., an agent's willingness to deviate from their desired control to accommodate safe interaction with others. Specifically, we propose a data-driven modeling approach based on control barrier functions and differentiable optimization that efficiently learns agents' responsibility allocation from data. We demonstrate on synthetic and real-world datasets that we can obtain an interpretable and quantitative understanding of how much agents adjust their behavior to ensure the safety of others given their current environment.
comment: 8 pages, 7 figures
Optimal Attitude Control of Large Flexible Space Structures with Distributed Momentum Actuators
Recent spacecraft mission concepts propose larger payloads that have lighter, less rigid structures. For large lightweight structures, the natural frequencies of their vibration modes may fall within the attitude controller bandwidth, threatening the stability and settling time of the controller and compromising performance. This work tackles this issue by proposing an attitude control design paradigm of distributing momentum actuators throughout the structure to have more control authority over vibration modes. The issue of jitter disturbances introduced by these actuators is addressed by expanding the bandwidth of the attitude controller to suppress excess vibrations. Numerical simulation results show that, at the expense of more control action, a distributed configuration can achieve lower settling times and reduce structural deformation compared to a more standard centralized configuration.
comment: 10 pages, 9 figures
Fabrication-Aware Inverse Design For Shape Optimization
Inverse design (ID) is a computational method that systematically explores a design space to find optimal device geometries based on specific performance criteria. In silicon photonics, ID often leads to devices with design features that degrade significantly due to the fabrication process, limiting the applicability of these devices in scalable silicon photonic fabrication. We demonstrate a solution to this performance degradation through fabrication-aware inverse design (FAID), integrating lithography models for deep-ultraviolet (DUV) lithography and electron beam lithography (EBL) into the shape optimization approach of ID. A Y-branch and an SWG-to-strip converter were generated and fabricated with this new approach. Simulated and measured results verify that the FAID yields devices with up to 0.6 dB lower insertion loss per device. The modified workflow enables designers to use ID to generate devices that adjust for process bias predicted by lithography models.
comment: 4 pages
A neural network-based approach to hybrid systems identification for control
We consider the problem of designing a machine learning-based model of an unknown dynamical system from a finite number of (state-input)-successor state data points, such that the model obtained is also suitable for optimal control design. We adopt a neural network (NN) architecture that, once suitably trained, yields a hybrid system with continuous piecewise-affine (PWA) dynamics that is differentiable with respect to the network's parameters, thereby enabling the use of derivative-based training procedures. We show that a careful choice of our NN's weights produces a hybrid system model with structural properties that are highly favorable when used as part of a finite horizon optimal control problem (OCP). Specifically, we rely on available results to establish that optimal solutions with strong local optimality guarantees can be computed via nonlinear programming (NLP), in contrast to classical OCPs for general hybrid systems which typically require mixed-integer optimization. Besides being well-suited for optimal control design, numerical simulations illustrate that our NN-based technique enjoys very similar performance to state-of-the-art system identification methods for hybrid systems and it is competitive on nonlinear benchmarks.
The Brain-Inspired Cooperative Shared Control Framework for Brain-Machine Interface
In brain-machine interface (BMI) applications, a key challenge is the low information content and high noise level in neural signals, severely affecting stable robotic control. To address this challenge, we propose a cooperative shared control framework based on brain-inspired intelligence, where control signals are decoded from neural activity, and the robot handles the fine control. This allows for a combination of flexible and adaptive interaction control between the robot and the brain, making intricate human-robot collaboration feasible. The proposed framework utilizes spiking neural networks (SNNs) for controlling a robotic arm and wheels, including speed and steering. While full integration of the system remains a future goal, individual modules for robotic arm control, object tracking, and map generation have been successfully implemented. The framework is expected to significantly enhance the performance of BMI. In practical settings, the BMI with cooperative shared control, utilizing a brain-inspired algorithm, will greatly enhance the potential for clinical applications.
comment: This article needs to be updated with the corrected figure and content
Angular Spread Statistics for 6.75 GHz FR1(C) and 16.95 GHz FR3 Mid-Band Frequencies in an Indoor Hotspot Environment
We present detailed multipath propagation spatial statistics for next-generation wireless systems operating at lower and upper mid-band frequencies spanning 6--24 GHz. The large-scale spatial characteristics of the wireless channel include Azimuth angular Spread of Departure (ASD) and Zenith angular Spread of Departure (ZSD) of multipath components (MPC) from a transmitter and the Azimuth angular Spread of Arrival (ASA) and Zenith angular Spread of Arrival (ZSA) at a receiver. The angular statistics calculated from measurements were compared with industry-standard 3GPP models, and ASD and ASA values were found to be in close agreement at both 6.75 GHz and 16.95 GHz. Measured LOS ASD was found to be larger than 3GPP ASD, indicating more diverse MPC departure directions in the azimuth. ZSA and ZSD were observed to be smaller than the 3GPP modeling results, as most multipath arrivals and departures during measurements were recorded at the boresight antenna elevation. The wide angular spreads indicate a multipath-rich spatial propagation at 6.75 GHz and 16.95 GHz, showing greater promise for the implementation of MIMO beamforming systems in the mid-band spectrum.
comment: 6 pages, 3 figures, 1 table, IEEE Wireless Communications and Networking Conference
The Power-Oriented Graphs Modeling Technique: From the Fundamental Principles to the Systematic, Step-by-Step Modeling of Complex Physical Systems
Modeling physical systems is an essential skill for a control engineer, since it enables a deep understanding of their dynamic behavior and, consequently, the development of effective control strategies. The first part of this article provides a tutorial description of the fundamental principles and properties of the Power-Oriented Graphs (POG) modeling technique. Various case studies in different energetic domains are then presented to consolidate the fundamental principles, each highlighting different features of the POG modeling technique. The latter is then compared with the other two main graphical modeling techniques available in the literature, namely Bond Graph (BG) and Energetic Macroscopic Representation (EMR). The second part of this article is again tutorial in nature and introduces the new Fast Modeling POG (FMPOG) procedure. The FMPOG, which operates in the POG framework, is a methodical step-by-step procedure that enables readers to quickly derive the power-oriented graphical model of physical systems starting from their schematics. From the power-oriented graphical model, the state-space model can then be directly determined. To ensure the FMPOG procedure is easily usable by the entire community, we apply it to three examples in different energetic domains in this article, guiding the reader step-by-step through the derivation of the physical systems models. A freely available Matlab/Simulink program is provided in a repository, allowing users to automatically apply the FMPOG procedure to various classes of physical systems. This program converts the physical systems schematics into the corresponding POG block schemes and, ultimately, into the state-space mathematical models.
IC3M: In-Car Multimodal Multi-object Monitoring for Abnormal Status of Both Driver and Passengers
Recently, in-car monitoring has emerged as a promising technology for detecting early-stage abnormal status of the driver and providing timely alerts to prevent traffic accidents. Although training models with multimodal data enhances the reliability of abnormal status detection, the scarcity of labeled data and the imbalance of class distribution impede the extraction of critical abnormal state features, significantly deteriorating training performance. Furthermore, missing modalities due to environment and hardware limitations further exacerbate the challenge of abnormal status identification. More importantly, monitoring abnormal health conditions of passengers, particularly in elderly care, is of paramount importance but remains underexplored. To address these challenges, we introduce IC3M, an efficient camera-rotation-based multimodal framework for monitoring both driver and passengers in a car. IC3M comprises two key modules: an adaptive threshold pseudo-labeling strategy and a missing-modality reconstruction module. The former customizes pseudo-labeling thresholds for different classes based on the class distribution, generating class-balanced pseudo labels to guide model training effectively, while the latter leverages cross-modality relationships learned from limited labels to accurately recover missing modalities by distribution transferring from available modalities. Extensive experimental results demonstrate that IC3M outperforms state-of-the-art benchmarks in accuracy, precision, and recall while exhibiting superior robustness under limited labeled data and severe missing modality.
comment: 16 pages, 17 figures
Predictability and Fairness in Load Aggregation with Deadband
Virtual power plants and load aggregation are becoming increasingly common. There, one regulates the aggregate power output of an ensemble of distributed energy resources (DERs). Marecek et al. [Automatica, Volume 147, January 2023, 110743, arXiv:2110.03001] recently suggested that long-term averages of prices or incentives offered should exist and be independent of the initial states of the operators of the DER, the aggregator, and the power grid. This can be seen as predictability, which underlies fairness. Unfortunately, the existence of such averages cannot be guaranteed with many traditional regulators, including the proportional-integral (PI) regulator with or without deadband. Here, we consider the effects of losses in the alternating current model and the deadband in the controller. This yields a non-linear dynamical system (due to the non-linear losses) exhibiting discontinuities (due to the deadband). We show that Filippov invariant measures enable reasoning about predictability and fairness while considering non-linearity of the alternating-current model and deadband.
comment: This proves ergodic properties superficially similar to arXiv:2110.03001, but for discontinuous dynamical systems, rather than continuous dynamical systems
AI-Native Network Digital Twin for Intelligent Network Management in 6G
As a pivotal virtualization technology, network digital twin is expected to accurately reflect real-time status and abstract features in the on-going sixth generation (6G) networks. In this article, we propose an artificial intelligence (AI)-native network digital twin framework for 6G networks to enable the synergy of AI and network digital twin, thereby facilitating intelligent network management. In the proposed framework, AI models are utilized to establish network digital twin models to facilitate network status prediction, network pattern abstraction, and network management decision-making. Furthermore, potential solutions are proposed to enhance the performance of network digital twin. Finally, a case study is presented, followed by a discussion of open research issues that are essential for AI-native network digital twin in 6G networks.
comment: This article is submitted to IEEE Wireless Communications
Peer-to-Peer Energy Trading of Solar and Energy Storage: A Networked Multiagent Reinforcement Learning Approach
Utilizing distributed renewable and energy storage resources in local distribution networks via peer-to-peer (P2P) energy trading has long been touted as a solution to improve energy systems' resilience and sustainability. Consumers and prosumers (those who have energy generation resources), however, do not have the expertise to engage in repeated P2P trading, and the zero-marginal costs of renewables present challenges in determining fair market prices. To address these issues, we propose multi-agent reinforcement learning (MARL) frameworks to help automate consumers' bidding and management of their solar PV and energy storage resources, under a specific P2P clearing mechanism that utilizes the so-called supply-demand ratio. In addition, we show how the MARL frameworks can integrate physical network constraints to realize voltage control, hence ensuring physical feasibility of the P2P energy trading and paving the way for real-world implementations.
Safety Margins for Reinforcement Learning
Any autonomous controller will be unsafe in some situations. The ability to quantitatively identify when these unsafe situations are about to occur is crucial for drawing timely human oversight in, e.g., freight transportation applications. In this work, we demonstrate that the true criticality of an agent's situation can be robustly defined as the mean reduction in reward given some number of random actions. Proxy criticality metrics that are computable in real-time (i.e., without actually simulating the effects of random actions) can be compared to the true criticality, and we show how to leverage these proxy metrics to generate safety margins, which directly tie the consequences of potentially incorrect actions to an anticipated loss in overall performance. We evaluate our approach on learned policies from APE-X and A3C within an Atari environment, and demonstrate how safety margins decrease as agents approach failure states. The integration of safety margins into programs for monitoring deployed agents allows for the real-time identification of potentially catastrophic situations.
comment: 2 pages, 2 figures. Presented at the 2023 IEEE Conference on Artificial Intelligence (CAI), Santa Clara, CA
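The definition of true criticality in the abstract above (mean reduction in reward after some number of random actions) can be illustrated with a small Monte Carlo sketch. The environment below is a hypothetical one-dimensional corridor with a failure state at one end, and the fixed "always move right" policy stands in for a learned agent; neither comes from the paper, which evaluates APE-X and A3C policies on Atari.

```python
import random

GOAL, CLIFF = 10, 0  # hypothetical corridor: goal at 10, failure state at 0


def step(x, action):
    """Move by action in {-1, +1}; return (next_state, reward, done)."""
    x += action
    if x <= CLIFF:
        return x, -20.0, True   # failure
    if x >= GOAL:
        return x, 0.0, True     # success
    return x, -1.0, False       # per-step cost


def policy(x):
    """Stand-in for a learned policy: always head for the goal."""
    return 1


def rollout(x, n_random, rng, max_steps=100):
    """Return the total reward when the first n_random actions are random."""
    total = 0.0
    for t in range(max_steps):
        action = rng.choice((-1, 1)) if t < n_random else policy(x)
        x, reward, done = step(x, action)
        total += reward
        if done:
            break
    return total


def criticality(x, n_random, samples=2000, seed=0):
    """Mean reduction in return caused by n_random random actions at state x."""
    rng = random.Random(seed)
    baseline = rollout(x, 0, rng)  # deterministic: no random actions
    perturbed = sum(rollout(x, n_random, rng) for _ in range(samples)) / samples
    return baseline - perturbed


near_failure = criticality(1, n_random=1)
mid_corridor = criticality(5, n_random=1)
```

In this toy setting a single random action next to the failure state costs far more expected reward than the same perturbation in the middle of the corridor, which is the qualitative behaviour a criticality measure is meant to capture.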
LEGO: QEC Decoding System Architecture for Dynamic Circuits
Quantum error correction (QEC) is a critical component of fault-tolerant quantum computing (FTQC); the QEC decoder is an important part of Classical Computing for Quantum (C4Q). Recent years have seen fast development in real-time QEC decoders. Existing efforts to build real-time decoders have yet to achieve a critical milestone: decoding dynamic logical circuits with error-corrected readout and feed forward. Achieving this requires significant engineering effort to adapt and reconfigure the decoders during runtime, depending on the branching of the logical circuit. We present a QEC decoder architecture called LEGO, with the ambitious goal of supporting dynamic logical operations. LEGO employs a novel abstraction called the decoding block to describe the decoding problem of a dynamic logical circuit. Moreover, decoding blocks can be combined with three other ideas to improve the efficiency, accuracy and latency of the decoder. First, they provide data and task parallelisms when combined with fusion-based decoding. Second, they can exploit the pipeline parallelism inside multi-stage decoders. Finally, they serve as basic units of work for computational resource management. Using decoding blocks, LEGO can be easily reconfigured to support all QEC settings and to easily accommodate innovations in three interdependent fields: code, logical operations and qubit hardware. In contrast, existing decoders are highly specialized to a specific QEC setting, which leads to redundant research and engineering efforts, slows down innovation, and further fragments the nascent quantum computing industry.
Distributed Optimization of Clique-Wise Coupled Problems via Three-Operator Splitting
This study explores distributed optimization problems with clique-wise coupling via operator splitting and how we can utilize this framework for performance analysis and enhancement. This framework extends beyond conventional pairwise coupled problems (e.g., consensus optimization) and is applicable to broader examples. To this end, we first introduce a new distributed optimization algorithm by leveraging a clique-based matrix and the Davis-Yin splitting (DYS), a versatile three-operator splitting method. We then demonstrate that this approach sheds new light on conventional algorithms in the following way: (i) Existing algorithms (NIDS, Exact diffusion, diffusion, and our previous work) can be derived from our proposed method; (ii) We present a new mixing matrix based on clique-wise coupling, which surfaces when deriving the NIDS. We prove its preferable distribution of eigenvalues, enabling fast consensus; (iii) These observations yield a new linear convergence rate for the NIDS with non-smooth objective functions. Remarkably our linear rate is first established for the general DYS with a projection for a subspace. This case is not covered by any prior results, to our knowledge. Finally, numerical examples showcase the efficacy of our proposed approach.
comment: 32 pages
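For readers unfamiliar with the Davis-Yin splitting mentioned in the abstract above, here is a minimal centralized sketch of the generic three-operator iteration. It is not the paper's distributed, clique-based algorithm. The test problem (a quadratic term, an l1 penalty, and a box constraint) and all function names are assumptions chosen so that the answer can be checked by hand.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar."""
    magnitude = max(abs(v) - t, 0.0)
    return magnitude if v > 0 else -magnitude


def clip(v, lo, hi):
    """Projection of a scalar onto the interval [lo, hi]."""
    return min(max(v, lo), hi)


def davis_yin(a, lam, lo, hi, gamma=0.5, iters=200):
    """Minimise 0.5*||x - a||^2 + lam*||x||_1 subject to lo <= x_i <= hi.

    The smooth term has a 1-Lipschitz gradient, so any gamma in (0, 2) is admissible.
    """
    z = [0.0] * len(a)
    for _ in range(iters):
        # Prox of the box indicator: project z onto the box.
        x_g = [clip(zi, lo, hi) for zi in z]
        # Gradient of the smooth term 0.5*||x - a||^2 evaluated at x_g.
        grad = [xg - ai for xg, ai in zip(x_g, a)]
        # Prox of gamma*lam*||.||_1 at the reflected, gradient-corrected point.
        x_f = [soft_threshold(2 * xg - zi - gamma * gi, gamma * lam)
               for xg, zi, gi in zip(x_g, z, grad)]
        # Update of the auxiliary variable (relaxation parameter 1).
        z = [zi + xf - xg for zi, xf, xg in zip(z, x_f, x_g)]
    return [clip(zi, lo, hi) for zi in z]
```

Because this particular problem is separable, the minimizer is the soft-thresholded vector clipped to the box, so `davis_yin([2.0, 0.3, -0.8], 0.5, -1.0, 1.0)` should approach `[1.0, 0.0, -0.3]`.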
Robust Lateral Control of a Convoy of Autonomous & Connected Vehicles with Limited Preview
This paper addresses the lateral control of Autonomous & Connected Vehicles (ACVs) convoys during Emergency Lane Change (ELC) maneuvers. These maneuvers are initiated in response to emergency cues from either the front or rear of the convoy, responding to the need to avoid obstacles or facilitate the passage of other vehicles. The primary objective of this study is to develop a lateral control scheme for ACVs based on the available information. The foundational assumption in this study is the existence of reliable connectivity among ACVs, wherein each subsequent ACV possesses information concerning the GPS position traces of both the lead and immediately preceding vehicles within the convoy. This connectivity facilitates the construction of a composite ELC trajectory that synthesizes this information, serving as a "discretized" preview of the trajectory to be tracked. The procedural steps include constructing this composite trajectory, determining cross-track error, heading, and yaw rate errors relative to it, and subsequently formulating a lateral control strategy. Furthermore, the paper presents findings on the lateral string stability of ACV convoys across various scenarios, encompassing changes in longitudinal velocity and scenarios where lead vehicle information is unavailable. Numerical and experimental results validate the efficacy of the proposed lateral control scheme for ACV convoys.
comment: 13 pages, 14 figures
Robotics
BEVLoc: Cross-View Localization and Matching via Birds-Eye-View Synthesis IROS 2024
Ground to aerial matching is a crucial and challenging task in outdoor robotics, particularly when GPS is absent or unreliable. Structures like buildings or large dense forests create interference, requiring GNSS replacements for global positioning estimates. The true difficulty lies in reconciling the perspective difference between the ground and air images for acceptable localization. Taking inspiration from the autonomous driving community, we propose a novel framework for synthesizing a birds-eye-view (BEV) scene representation to match and localize against an aerial map in off-road environments. We leverage contrastive learning with domain specific hard negative mining to train a network to learn similar representations between the synthesized BEV and the aerial map. During inference, BEVLoc guides the identification of the most probable locations within the aerial map through a coarse-to-fine matching strategy. Our results demonstrate promising initial outcomes in extremely difficult forest environments with limited semantic diversity. We analyze our model's performance for coarse and fine matching, assessing both the raw matching capability of our model and its performance as a GNSS replacement. Our work delves into off-road map localization while establishing a foundational baseline for future developments in localization. Our code is available at: https://github.com/rpl-cmu/bevloc
comment: 8 pages, 6 figures, Conference: IROS 2024
Trajectory Improvement and Reward Learning from Comparative Language Feedback
Learning from human feedback has gained traction in fields like robotics and natural language processing in recent years. While prior works mostly rely on human feedback in the form of comparisons, language is a preferable modality that provides more informative insights into user preferences. In this work, we aim to incorporate comparative language feedback to iteratively improve robot trajectories and to learn reward functions that encode human preferences. To achieve this goal, we learn a shared latent space that integrates trajectory data and language feedback, and subsequently leverage the learned latent space to improve trajectories and learn human preferences. To the best of our knowledge, we are the first to incorporate comparative language feedback into reward learning. Our simulation experiments demonstrate the effectiveness of the learned latent space and the success of our learning algorithms. We also conduct human subject studies that show our reward learning algorithm achieves a 23.9% higher subjective score on average and is 11.3% more time-efficient compared to preference-based reward learning, underscoring the superior performance of our method. Our website is at https://liralab.usc.edu/comparative-language-feedback/
comment: 8th Annual Conference of Robot Learning (2024)
Adver-City: Open-Source Multi-Modal Dataset for Collaborative Perception Under Adverse Weather Conditions
Adverse weather conditions pose a significant challenge to the widespread adoption of Autonomous Vehicles (AVs) by impacting sensors like LiDARs and cameras. Even though Collaborative Perception (CP) improves AV perception in difficult conditions, existing CP datasets lack adverse weather conditions. To address this, we introduce Adver-City, the first open-source synthetic CP dataset focused on adverse weather conditions. Simulated in CARLA with OpenCDA, it contains over 24 thousand frames, over 890 thousand annotations, and 110 unique scenarios across six different weather conditions: clear weather, soft rain, heavy rain, fog, foggy heavy rain and, for the first time in a synthetic CP dataset, glare. It has six object categories including pedestrians and cyclists, and uses data from vehicles and roadside units featuring LiDARs, RGB and semantic segmentation cameras, GNSS, and IMUs. Its scenarios, based on real crash reports, depict the most relevant road configurations for adverse weather and poor visibility conditions, varying in object density, with both dense and sparse scenes, allowing for novel testing conditions of CP models. Benchmarks run on the dataset show that weather conditions created challenging conditions for perception models, reducing multi-modal object detection performance by up to 19%, while object density affected LiDAR-based detection by up to 29%. The dataset, code and documentation are available at https://labs.cs.queensu.ca/quarrg/datasets/adver-city/.
comment: 8 pages
Cooperative and Asynchronous Transformer-based Mission Planning for Heterogeneous Teams of Mobile Robots
Coordinating heterogeneous teams of mobile robots for tasks such as search and rescue is highly challenging. This is due to the complexities of perception, decision making and planning in such environments, with agents' non-synchronous operation, constrained communication, and limited computational resources. This paper presents the Cooperative and Asynchronous Transformer-based Mission Planning (CATMiP) framework, which leverages multi-agent reinforcement learning (MARL) to effectively coordinate agents with heterogeneous sensing, motion, and actuation capabilities. The framework introduces a Class-based Macro-Action Decentralized Partially Observable Markov Decision Process (CMD-POMDP) model to handle asynchronous decision-making among different agent classes via macro-actions. It also extends the Multi-Agent Transformer (MAT) architecture to facilitate distributed, ad hoc communication among the agents. CATMiP easily adapts to mission complexities and communication constraints, and scales to varying environment sizes and team compositions. Simulations demonstrate its scalability and ability to achieve cooperative mission objectives with two classes of explorer and rescuer agents, even under severe communication constraints. The code is available at https://github.com/mylad13/CATMiP.
comment: 8 pages, 7 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Context-Aware Command Understanding for Tabletop Scenarios
This paper presents a novel hybrid algorithm designed to interpret natural human commands in tabletop scenarios. By integrating multiple sources of information, including speech, gestures, and scene context, the system extracts actionable instructions for a robot, identifying relevant objects and actions. The system operates in a zero-shot fashion, without reliance on predefined object models, enabling flexible and adaptive use in various environments. We assess the integration of multiple deep learning models, evaluating their suitability for deployment in real-world robotic setups. Our algorithm performs robustly across different tasks, combining language processing with visual grounding. In addition, we release a small dataset of video recordings used to evaluate the system. This dataset captures real-world interactions in which a human provides instructions in natural language to a robot, a contribution to future research on human-robot interaction. We discuss the strengths and limitations of the system, with particular focus on how it handles multimodal command interpretation, and its ability to be integrated into symbolic robotic frameworks for safe and explainable decision-making.
Solving Multi-Goal Robotic Tasks with Decision Transformer
Artificial intelligence plays a crucial role in robotics, with reinforcement learning (RL) emerging as one of the most promising approaches for robot control. However, several key challenges hinder its broader application. First, many RL methods rely on online learning, which requires either real-world hardware or advanced simulation environments--both of which can be costly, time-consuming, and impractical. Offline reinforcement learning offers a solution, enabling models to be trained without ongoing access to physical robots or simulations. A second challenge is learning multi-goal tasks, where robots must achieve multiple objectives simultaneously. This adds complexity to the training process, as the model must generalize across different goals. At the same time, transformer architectures have gained significant popularity across various domains, including reinforcement learning. Yet, no existing methods effectively combine offline training, multi-goal learning, and transformer-based architectures. In this paper, we address these challenges by introducing a novel adaptation of the decision transformer architecture for offline multi-goal reinforcement learning in robotics. Our approach integrates goal-specific information into the decision transformer, allowing it to handle complex tasks in an offline setting. To validate our method, we developed a new offline reinforcement learning dataset using the Panda robotic platform in simulation. Our extensive experiments demonstrate that the decision transformer can outperform state-of-the-art online reinforcement learning methods.
Meta-Learning Augmented MPC for Disturbance-Aware Motion Planning and Control of Quadrotors
A major challenge in autonomous flights is unknown disturbances, which can jeopardize safety and lead to collisions, especially in obstacle-rich environments. This paper presents a disturbance-aware motion planning and control framework designed for autonomous aerial flights. The framework is composed of two key components: a disturbance-aware motion planner and a tracking controller. The disturbance-aware motion planner consists of a predictive control scheme and a learned model of disturbances that is adapted online. The tracking controller is designed using contraction control methods to provide safety bounds on the quadrotor behaviour in the vicinity of the obstacles with respect to the disturbance-aware motion plan. Finally, the algorithm is tested in simulation scenarios with a quadrotor facing strong crosswind and ground-induced disturbances.
An Algorithm for Distributed Computation of Reachable Sets for Multi-Agent Systems
In this paper, we consider the problem of distributed reachable set computation for multi-agent systems (MASs) interacting over an undirected, stationary graph. A full state-feedback control input for such MASs depends not only on the current agent's state, but also on the states of its neighbors. However, in most MAS applications, the dynamics are obscured by individual agents. This makes reachable set computation, in a fully distributed manner, a challenging problem. We utilize the ideas of polytopic reachable set approximation and generalize it to a MAS setup. We formulate the resulting sub-problems in a fully distributed manner and provide convergence guarantees for the associated computations. The proposed algorithm's convergence is proved for two cases: static MAS graphs, and time-varying graphs under certain restrictions.
comment: 10 pages, 4 figures, 1 algorithm float. Preprint submitted to ACC 2025 for review
Incremental Learning for Robot Shared Autonomy
Shared autonomy holds promise for improving the usability and accessibility of assistive robotic arms, but current methods often rely on costly expert demonstrations and lack the ability to adapt post-deployment. This paper introduces ILSA, an Incrementally Learned Shared Autonomy framework that continually improves its assistive control policy through repeated user interactions. ILSA leverages synthetic kinematic trajectories for initial pretraining, reducing the need for expert demonstrations, and then incrementally finetunes its policy after each manipulation interaction, with mechanisms to balance new knowledge acquisition with existing knowledge retention during incremental learning. We validate ILSA for complex long-horizon tasks through a comprehensive ablation study and a user study with 20 participants, demonstrating its effectiveness and robustness in both quantitative performance and user-reported qualitative metrics. Code and videos are available at https://ilsa-robo.github.io/.
A General Formulation for Path Constrained Time-Optimized Trajectory Planning with Environmental and Object Contacts
A typical manipulation task consists of a manipulator equipped with a gripper to grasp and move an object with constraints on the motion of the hand-held object, which may be due to the nature of the task itself or from object-environment contacts. In this paper, we study the problem of computing joint torques and grasping forces for time-optimal motion of an object, while ensuring that the grasp is not lost and any constraints on the motion of the object, either due to dynamics, environment contact, or no-slip requirements, are also satisfied. We present a second-order cone program (SOCP) formulation of the time-optimal trajectory planning problem that considers nonlinear friction cone constraints at the hand-object and object-environment contacts. Since SOCPs are convex optimization problems that can be solved optimally in polynomial time using interior point methods, we can solve the trajectory optimization problem efficiently. We present simulation results on three examples, including a non-prehensile manipulation task, which shows the generality and effectiveness of our approach.
A New Architecture for Neural Enhanced Multiobject Tracking
Multiobject tracking (MOT) is an important task in robotics, autonomous driving, and maritime surveillance. Traditional work on MOT is model-based and aims to establish algorithms in the framework of sequential Bayesian estimation. More recent methods are fully data-driven and rely on the training of neural networks. The two approaches have demonstrated advantages in certain scenarios. In particular, in problems where plenty of labeled data for the training of neural networks is available, data-driven MOT tends to have advantages compared to traditional methods. A natural thought is whether a general and efficient framework can integrate the two approaches. This paper advances a recently introduced hybrid model-based and data-driven method called neural-enhanced belief propagation (NEBP). Compared to existing work on NEBP for MOT, it introduces a novel neural architecture that can improve data association and new object initialization, two critical aspects of MOT. The proposed tracking method is leading the nuScenes LiDAR-only tracking challenge at the time of submission of this paper.
Monocular Visual Place Recognition in LiDAR Maps via Cross-Modal State Space Model and Multi-View Matching
Achieving monocular camera localization within pre-built LiDAR maps can bypass the simultaneous mapping process of visual SLAM systems, potentially reducing the computational overhead of autonomous localization. To this end, one of the key challenges is cross-modal place recognition, which involves retrieving 3D scenes (point clouds) from a LiDAR map according to online RGB images. In this paper, we introduce an efficient framework to learn descriptors for both RGB images and point clouds. It takes a visual state space model (VMamba) as the backbone and employs a pixel-view-scene joint training strategy for cross-modal contrastive learning. To address the field-of-view differences, independent descriptors are generated from multiple evenly distributed viewpoints for point clouds. A visible 3D points overlap strategy is then designed to quantify the similarity between point cloud views and RGB images for multi-view supervision. Additionally, when generating descriptors from pixel-level features using NetVLAD, we compensate for the loss of geometric information and introduce an efficient scheme for multi-view generation. Experimental results on the KITTI and KITTI-360 datasets demonstrate the effectiveness and generalization of our method. The code will be released upon acceptance.
BoxMap: Efficient Structural Mapping and Navigation ICRA 2025
While humans can successfully navigate using abstractions, ignoring details that are irrelevant to the task at hand, most existing robotic applications require the maintenance of a detailed environment representation which consumes a significant amount of sensing, computing, and storage. These issues are particularly important in a resource-constrained setting with limited power budget. Deep learning methods can learn from prior experience to abstract knowledge of unknown environments, and use it to execute tasks (e.g., frontier exploration, object search, or scene understanding) more efficiently. We propose BoxMap, a Detection-Transformer-based architecture that takes advantage of the structure of the sensed partial environment to update a topological graph of the environment as a set of semantic entities (e.g. rooms and doors) and their relations (e.g. connectivity). These predictions from low-level measurements can then be leveraged to achieve high-level goals with lower computational costs than methods based on detailed representations. As an example application, we consider a robot equipped with a 2-D laser scanner tasked with exploring a residential building. Our BoxMap representation scales quadratically with the number of rooms (with a small constant), resulting in significant savings over a full geometric map. Moreover, our high-level topological representation results in 30.9% shorter trajectories in the exploration task with respect to a standard method.
comment: This manuscript has been submitted to IEEE ICRA 2025
OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs
Enabling robots to autonomously navigate unknown, complex, dynamic environments and perform diverse tasks remains a fundamental challenge in developing robust autonomous physical agents. They must effectively perceive their surroundings while leveraging world knowledge for decision-making. While recent approaches utilize vision-language and large language models for scene understanding and planning, they often rely on offline processing, external computing, or restrictive environmental assumptions. We present a novel framework for efficient and scalable real-time, onboard autonomous navigation that integrates multi-level abstraction in both perception and planning in unknown large-scale environments that change over time. Our system fuses data from multiple onboard sensors for localization and mapping and integrates it with open-vocabulary semantics to generate hierarchical scene graphs. An LLM-based planner leverages these graphs to generate high-level task execution strategies, which guide low-level controllers in safely accomplishing goals. Our framework's real-time operation enables continuous updates to scene graphs and plans, allowing swift responses to environmental changes and on-the-fly error correction. This is a key advantage over static or rule-based planning systems. We demonstrate our system's efficacy on a quadruped robot navigating large-scale, dynamic environments, showcasing its adaptability and robustness in diverse scenarios.
BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation
To operate at a building scale, service robots must perform very long-horizon mobile manipulation tasks by navigating to different rooms, accessing different floors, and interacting with a wide and unseen range of everyday objects. We refer to these tasks as Building-wide Mobile Manipulation. To tackle these inherently long-horizon tasks, we introduce BUMBLE, a unified Vision-Language Model (VLM)-based framework integrating open-world RGBD perception, a wide spectrum of gross-to-fine motor skills, and dual-layered memory. Our extensive evaluation (90+ hours) indicates that BUMBLE outperforms multiple baselines in long-horizon building-wide tasks that require sequencing up to 12 ground truth skills spanning 15 minutes per trial. BUMBLE achieves 47.1% success rate averaged over 70 trials in different buildings, tasks, and scene layouts from different starting rooms and floors. Our user study demonstrates 22% higher satisfaction with our method than state-of-the-art mobile manipulation methods. Finally, we demonstrate the potential of using increasingly-capable foundation models to push performance further. For more information, see https://robin-lab.cs.utexas.edu/BUMBLE/
comment: 7 Figures, 2 Tables, 11 Pages
Hibikino-Musashi@Home 2024 Team Description Paper
This paper provides an overview of the techniques employed by Hibikino-Musashi@Home, which intends to participate in the domestic standard platform league. The team has developed a dataset generator for training a robot vision system and an open-source development environment running on a Human Support Robot simulator. The large language model powered task planner selects appropriate primitive skills to perform the task requested by users. The team aims to design a home service robot that can assist humans in their homes and continuously attends competitions to evaluate and improve the developed system.
GSLoc: Visual Localization with 3D Gaussian Splatting
We present GSLoc: a new visual localization method that performs dense camera alignment using 3D Gaussian Splatting as a map representation of the scene. GSLoc backpropagates pose gradients over the rendering pipeline to align the rendered and target images, while it adopts a coarse-to-fine strategy by utilizing blurring kernels to mitigate the non-convexity of the problem and improve the convergence. The results show that our approach succeeds at visual localization in challenging conditions of relatively small overlap between initial and target frames inside textureless environments when state-of-the-art neural sparse methods provide inferior results. Using the byproduct of realistic rendering from the 3DGS map representation, we show how to enhance localization results by mixing a set of observed and virtual reference keyframes when solving the image retrieval problem. We evaluate our method both on synthetic and real-world data, discussing its advantages and application potential.
GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation
We present GR-2, a state-of-the-art generalist robot agent for versatile and generalizable robot manipulation. GR-2 is first pre-trained on a vast number of Internet videos to capture the dynamics of the world. This large-scale pre-training, involving 38 million video clips and over 50 billion tokens, equips GR-2 with the ability to generalize across a wide range of robotic tasks and environments during subsequent policy learning. Following this, GR-2 is fine-tuned for both video generation and action prediction using robot trajectories. It exhibits impressive multi-task learning capabilities, achieving an average success rate of 97.7% across more than 100 tasks. Moreover, GR-2 demonstrates exceptional generalization to new, previously unseen scenarios, including novel backgrounds, environments, objects, and tasks. Notably, GR-2 scales effectively with model size, underscoring its potential for continued growth and application. Project page: https://gr2-manipulation.github.io.
comment: Tech Report. Authors are listed in alphabetical order. Project page: https://gr2-manipulation.github.io
Provable Methods for Searching with an Imperfect Sensor
Assume that a target is known to be present at an unknown point among a finite set of locations in the plane. We search for it using a mobile robot that has imperfect sensing capabilities. It takes time for the robot to move between locations and search a location; we have a total time budget within which to conduct the search. We study the problem of computing a search path/strategy for the robot that maximizes the probability of detection of the target. Considering non-uniform travel times between points (e.g., based on the distance between them) is crucial for search and rescue applications; such problems have been investigated to a limited extent due to their inherent complexity. In this paper, we describe fast algorithms with performance guarantees for this search problem and some variants, complement them with complexity results, and perform experiments to observe their performance.
comment: 10 pages, 6 figures, 3 algorithms
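To make the objective in the abstract above concrete, the sketch below evaluates the detection probability and time cost of a candidate search path and then brute-forces tiny instances. The sensor model (no false positives, and each look at the true location succeeds independently with probability `q`), the Euclidean travel time at unit speed, and all function names are assumptions for illustration. The exhaustive search is a baseline only, not one of the paper's algorithms with performance guarantees.

```python
from itertools import product
from math import dist


def detection_probability(path, prior, q):
    """P(detect) when each look at the true location succeeds independently with prob. q."""
    looks = {}
    for loc in path:
        looks[loc] = looks.get(loc, 0) + 1
    # A target at loc is missed only if all k looks there fail.
    return sum(prior[loc] * (1 - (1 - q) ** k) for loc, k in looks.items())


def path_time(path, coords, start, search_time):
    """Travel time (Euclidean distance at unit speed) plus one search per visit."""
    total, here = 0.0, start
    for loc in path:
        total += dist(here, coords[loc]) + search_time
        here = coords[loc]
    return total


def best_path(coords, prior, q, start, search_time, budget, max_len=4):
    """Exhaustive search over short visit sequences; only viable for tiny instances."""
    best, best_p = (), 0.0
    for n in range(1, max_len + 1):
        for path in product(range(len(coords)), repeat=n):
            if path_time(path, coords, start, search_time) <= budget:
                p = detection_probability(path, prior, q)
                if p > best_p + 1e-12:
                    best, best_p = path, p
    return best, best_p
```

With two locations at `(0, 0)` and `(3, 4)`, priors `[0.4, 0.6]`, `q = 0.5`, unit search time, and a budget of 3, the more likely location is out of reach, so the best this brute force can do is look at location 0 three times, for a detection probability of 0.35.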
Concurrent-Learning Based Relative Localization in Shape Formation of Robot Swarms
In this paper, we address the shape formation problem for massive robot swarms in environments where external localization systems are unavailable. Achieving this task effectively with solely onboard measurements is still scarcely explored and faces some practical challenges. To solve this challenging problem, we propose the following novel results. Firstly, to estimate the relative positions among neighboring robots, a concurrent-learning based estimator is proposed. It relaxes the persistent excitation condition required by classical estimators such as the least-squares estimator. Secondly, we introduce a finite-time agreement protocol to determine the shape location. This is achieved by estimating the relative position between each robot and a randomly assigned seed robot. The initial position of the seed robot marks the shape location. Thirdly, based on the theoretical results of the relative localization, a novel behavior-based control strategy is devised. This strategy not only enables adaptive shape formation of large groups of robots but also enhances the observability of inter-robot relative localization. Numerical simulation results are provided to verify the performance of our proposed strategy compared to the state-of-the-art ones. Additionally, outdoor experiments on real robots further demonstrate the practical effectiveness and robustness of our methods.
QT-DoG: Quantization-aware Training for Domain Generalization
Domain Generalization (DG) aims to train models that perform well not only on the training (source) domains but also on novel, unseen target data distributions. A key challenge in DG is preventing overfitting to source domains, which can be mitigated by finding flatter minima in the loss landscape. In this work, we propose Quantization-aware Training for Domain Generalization (QT-DoG) and demonstrate that weight quantization effectively leads to flatter minima in the loss landscape, thereby enhancing domain generalization. Unlike traditional quantization methods focused on model compression, QT-DoG exploits quantization as an implicit regularizer by inducing noise in model weights, guiding the optimization process toward flatter minima that are less sensitive to perturbations and overfitting. We provide both theoretical insights and empirical evidence demonstrating that quantization inherently encourages flatter minima, leading to better generalization across domains. Moreover, with the benefit of reducing the model size through quantization, we demonstrate that an ensemble of multiple quantized models further yields superior accuracy than the state-of-the-art DG approaches with no computational or memory overheads. Our extensive experiments demonstrate that QT-DoG generalizes across various datasets, architectures, and quantization algorithms, and can be combined with other DG methods, establishing its versatility and robustness.
comment: Code will be released soon
SplaTraj: Camera Trajectory Generation with Semantic Gaussian Splatting
Many recent developments for robots to represent environments have focused on photorealistic reconstructions. This paper focuses on generating sequences of images from photorealistic Gaussian Splatting models that match instructions given in user-provided language. We contribute a novel framework, SplaTraj, which formulates the generation of images within photorealistic environment representations as a continuous-time trajectory optimization problem. Costs are designed so that a camera following the trajectory poses will smoothly traverse through the environment and render the specified spatial information in a photogenic manner. This is achieved by querying a photorealistic representation with language embedding to isolate regions that correspond to the user-specified inputs. These regions are then projected to the camera's view as it moves over time and a cost is constructed. We can then apply gradient-based optimization and differentiate through the rendering to optimize the trajectory for the defined cost. The resulting trajectory moves to photogenically view each of the specified objects. We empirically evaluate our approach on a suite of environments and instructions, and demonstrate the quality of generated image sequences.
Sitting, Standing and Walking Control of the Series-Parallel Hybrid Recupera-Reha Exoskeleton
This paper presents advancements in the functionalities of the Recupera-Reha lower extremity exoskeleton robot. The exoskeleton features a series-parallel hybrid design characterized by multiple kinematic loops resulting in 148 degrees of freedom in its spanning tree and 102 independent loop closure constraints, which poses significant challenges for modeling and control. To address these challenges, we applied an optimal control approach to generate feasible trajectories such as sitting, standing, and static walking, and tested these trajectories on the exoskeleton robot. Our method efficiently solves the optimal control problem using a serial abstraction of the model to generate trajectories. It then utilizes the full series-parallel hybrid model, which takes all the kinematic loop constraints into account to generate the final actuator commands. The experimental results demonstrate the effectiveness of our approach in generating the desired motions for the exoskeleton.
comment: 8 pages, 16 figures, IEEE-RAS International Conference on Humanoid Robots 2024
AIVIO: Closed-loop, Object-relative Navigation of UAVs with AI-aided Visual Inertial Odometry
Object-relative mobile robot navigation is essential for a variety of tasks, e.g. autonomous critical infrastructure inspection, but requires the capability to extract semantic information about the objects of interest from raw sensory data. While deep learning-based (DL) methods excel at inferring semantic object information from images, such as class and relative 6 degree of freedom (6-DoF) pose, they are computationally demanding and thus often not suitable for payload constrained mobile robots. In this letter we present a real-time capable unmanned aerial vehicle (UAV) system for object-relative, closed-loop navigation with a minimal sensor configuration consisting of an inertial measurement unit (IMU) and RGB camera. Utilizing a DL-based object pose estimator, solely trained on synthetic data and optimized for companion board deployment, the object-relative pose measurements are fused with the IMU data to perform object-relative localization. We conduct multiple real-world experiments to validate the performance of our system for the challenging use case of power pole inspection. An example closed-loop flight is presented in the supplementary video.
comment: Accepted for publication in the IEEE Robotics and Automation Letters (RA-L), 2024
DeMo: Decoupling Motion Forecasting into Directional Intentions and Dynamic States NeurIPS 2024
Accurate motion forecasting for traffic agents is crucial for ensuring the safety and efficiency of autonomous driving systems in dynamically changing environments. Mainstream methods adopt a one-query-one-trajectory paradigm, where each query corresponds to a unique trajectory for predicting multi-modal trajectories. While straightforward and effective, the absence of detailed representation of future trajectories may yield suboptimal outcomes, given that the agent states dynamically evolve over time. To address this problem, we introduce DeMo, a framework that decouples multi-modal trajectory queries into two types: mode queries capturing distinct directional intentions and state queries tracking the agent's dynamic states over time. By leveraging this format, we separately optimize the multi-modality and dynamic evolutionary properties of trajectories. Subsequently, the mode and state queries are integrated to obtain a comprehensive and detailed representation of the trajectories. To achieve these operations, we additionally introduce combined Attention and Mamba techniques for global information aggregation and state sequence modeling, leveraging their respective strengths. Extensive experiments on both the Argoverse 2 and nuScenes benchmarks demonstrate that our DeMo achieves state-of-the-art performance in motion forecasting.
comment: NeurIPS 2024
CubiX: Portable Wire-Driven Parallel Robot Connecting to and Utilizing the Environment IROS2024
A wire-driven parallel robot is a type of robotic system where multiple wires are used to control the movement of an end-effector. The wires are attached to the end-effector and anchored to fixed points on external structures. This configuration allows for the separation of actuators and end-effectors, enabling lightweight and simplified movable parts in the robot. However, its range of motion remains confined within the space formed by the wires, limiting the wire-driven capability to only within the pre-designed operational range. In this study, we develop a wire-driven robot, CubiX, capable of connecting to and utilizing the environment. CubiX connects itself to the environment using up to 8 wires and drives itself by winding these wires. By integrating actuators for winding the wires into CubiX, a portable wire-driven parallel robot is realized without limitations on its workspace. Consequently, the robot can form parallel wire-driven structures by connecting wires to the environment at any operational location.
comment: Accepted at IROS2024, website - https://shin0805.github.io/cubix-hardware/ , YouTube - https://youtu.be/R5ZrzMPEFZs
Construction of Musculoskeletal Simulation for Shoulder Complex with Ligaments and Its Validation via Model Predictive Control IROS2024
The complex ways in which humans utilize their bodies in sports and martial arts are remarkable, and human motion analysis is one of the most effective tools for robot body design and control. On the other hand, motion analysis is not easy, and it is difficult to measure complex body motions in detail due to the influence of numerous muscles and soft tissues, mainly ligaments. In response, various musculoskeletal simulators have been developed and applied to motion analysis and robotics. However, they reproduce only the muscles and not the ligaments, and they do not focus on the shoulder complex, including the clavicle and scapula, which is one of the most complex parts of the body. Therefore, in this study, a detailed simulation model of the shoulder complex including ligaments is constructed. The model mimics not only the skeletal structure and muscle arrangement but also the ligament arrangement and maximum muscle strength. Through model predictive control based on the constructed simulation, we confirmed that the ligaments contribute to joint stabilization in the first movement and that the proper distribution of maximum muscle force contributes to the equalization of the load on each muscle, demonstrating the effectiveness of this simulation.
comment: accepted at IROS2024, websites - https://sahara-yuta.github.io/projects/shoulder-complex-simulation
Towards an Autonomous Surface Vehicle Prototype for Artificial Intelligence Applications of Water Quality Monitoring
The use of Autonomous Surface Vehicles, equipped with water quality sensors and artificial vision systems, allows for a smart and adaptive deployment in water resources environmental monitoring. This paper presents a real implementation of a vehicle prototype that addresses the use of Artificial Intelligence algorithms and enhanced sensing techniques for water quality monitoring. The vehicle is fully equipped with high-quality sensors to measure water quality parameters and water depth. Furthermore, by means of a stereo-camera, it can also detect and locate macro-plastics in real environments using deep visual models, such as YOLOv5. In this paper, experimental results obtained in Lago Mayor (Sevilla) are presented as proof of the capabilities of the proposed architecture. The overall system, and the early results obtained, are expected to provide a solid example of a real platform useful for the water resource monitoring task, and to serve as a real case scenario for deploying Artificial Intelligence algorithms, such as path planning, artificial vision, etc.
A Robust Quadruped Robot with Twisting Waist for Flexible Motions
The waist plays a crucial role in the agile movement of many animals in nature. It provides the torso with additional degrees of freedom and flexibility, inspiring researchers to incorporate this biological feature into robotic structures to enhance robot locomotion. This paper presents a cost-effective and low-complexity waist mechanism integrated into the structure of the open-source robot solo8, adding a new degree of freedom (DOF) to its torso. We refer to this novel robot as solo9. Additionally, we propose a full-body control method for the waist-equipped quadruped robot based on generative adversarial imitation learning (GAIL). During training, the discriminator is used as input for iterative optimization of the policy and dataset, enabling solo9 to achieve flexible steering maneuvers across various gaits. Extensive tests of solo9's steering capabilities, terrain adaptability, and robustness are conducted in both simulation and real-world scenarios, with detailed comparisons to solo8 and solo12, demonstrating the effectiveness of the control algorithm and the advantages of the waist mechanism.
Unobserved Object Detection using Generative Models
Can we detect an object that is not visible in an image? This study introduces the novel task of 2D and 3D unobserved object detection for predicting the location of objects that are occluded or lie outside the image frame. We adapt several state-of-the-art pre-trained generative models to solve this task, including 2D and 3D diffusion models and vision-language models, and show that they can be used to infer the presence of objects that are not directly observed. To benchmark this task, we propose a suite of metrics that captures different aspects of performance. Our empirical evaluations on indoor scenes from the RealEstate10k dataset with COCO object categories demonstrate results that motivate the use of generative models for the unobserved object detection task. The current work presents a promising step towards compelling applications like visual search and probabilistic planning that can leverage object detection beyond what can be directly observed.
comment: 16 pages; 41 figures
A GPT-based Decision Transformer for Multi-Vehicle Coordination at Unsignalized Intersections
In this paper, we explore the application of the Decision Transformer, a decision-making algorithm based on the Generative Pre-trained Transformer (GPT) architecture, to multi-vehicle coordination at unsignalized intersections. We formulate the coordination problem so as to find the optimal trajectories for multiple vehicles at intersections, modeling it as a sequence prediction task to fully leverage the power of GPTs as a sequence model. Through extensive experiments, we compare our approach to a reservation-based intersection management system. Our results show that the Decision Transformer can outperform the training data in terms of total travel time and can be generalized effectively to various scenarios, including noise-induced velocity variations, continuous interaction environments, and different vehicle numbers and road configurations.
comment: 7 pages
Effort Allocation for Deadline-Aware Task and Motion Planning: A Metareasoning Approach
In robot planning, tasks can often be achieved through multiple options, each consisting of several actions. This work specifically addresses deadline constraints in task and motion planning, aiming to find a plan that can be executed within the deadline despite uncertain planning and execution times. We propose an effort allocation problem, formulated as a Markov decision process (MDP), to find such a plan by leveraging metareasoning perspectives to allocate computational resources among the given options. We formally prove the NP-hardness of the problem by reducing it from the knapsack problem. Both a model-based approach, where transition models are learned from past experience, and a model-free approach, which overcomes the unavailability of prior data acquisition through reinforcement learning, are explored. For the model-based approach, we investigate Monte Carlo tree search (MCTS) to approximately solve the proposed MDP and further design heuristic schemes to tackle NP-hardness, leading to the approximate yet efficient algorithm called DP_Rerun. In experiments, DP_Rerun demonstrates promising performance comparable to MCTS while requiring negligible computation time.
comment: 48 pages, 6 figures
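For readers unfamiliar with the knapsack problem named in the hardness argument above, the following is a minimal sketch of the classic 0/1 knapsack solved by dynamic programming. It is not the paper's reduction and not its DP_Rerun algorithm; the function name and the example numbers are hypothetical and serve only to illustrate the source problem of the reduction.

```python
def knapsack_max_value(weights, values, capacity):
    """Best total value of a 0/1 knapsack with non-negative integer weights.

    Illustrative only: this is the textbook problem, not the effort
    allocation MDP described in the abstract.
    """
    # best[c] holds the best value achievable with total weight <= c.
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


if __name__ == "__main__":
    # Hypothetical items: (weight, value) pairs and a capacity of 7.
    print(knapsack_max_value([1, 3, 4, 5], [1, 4, 5, 7], 7))
```

The table-based solution runs in time proportional to the number of items times the capacity, which is pseudo-polynomial and therefore consistent with the problem being NP-hard.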
Single Actuator Undulation Soft-bodied Robots Using A Precompressed Variable Thickness Flexible Beam IROS 2024
Soft robots, due to the intrinsic flexibility of their bodies, can adaptively navigate unstructured environments. One of the most popular locomotion gaits that has been implemented in soft robots is undulation. The undulation motion in soft robots resembles the locomotion gait of stringy creatures such as snakes, eels, and C. elegans. Typically, the implementation of undulation locomotion on a soft robot requires many actuators to control each segment of the stringy body. The added weight of multiple actuators limits the navigating performance of soft-bodied robots. In this paper, we propose a simple tendon-driven flexible beam with only one actuator (a DC motor) that can generate a mechanical traveling wave along the beam to support the undulation locomotion of soft robots. The beam will be precompressed along its axis by shortening the length of the two tendons to form an S-shape, thus pretensioning the tendons. The motor will wind and unwind the tendons to deform the flexible beam and generate traveling waves along the body of the robot. We experiment with different pre-tensions to characterize the relationship between tendon pre-tension forces and the DC-motor winding/unwinding. Our proposal enables a simple implementation of undulation motion to support the locomotion of soft-bodied robots.
comment: Accepted to IROS 2024
Integrating Online Learning and Connectivity Maintenance for Communication-Aware Multi-Robot Coordination IROS 2024
This paper proposes a novel data-driven control strategy for maintaining connectivity in networked multi-robot systems. Existing approaches often rely on a pre-determined communication model specifying whether pairwise robots can communicate given their relative distance to guide the connectivity-aware control design, which may not capture real-world communication conditions. To relax that assumption, we present the concept of Data-driven Connectivity Barrier Certificates, which utilize Control Barrier Functions (CBF) and Gaussian Processes (GP) to characterize the admissible control space for pairwise robots based on communication performance observed online. This allows robots to maintain a satisfying level of pairwise communication quality (measured by the received signal strength) while in motion. Then we propose a Data-driven Connectivity Maintenance (DCM) algorithm that combines (1) online learning of the communication signal strength and (2) a bi-level optimization-based control framework for the robot team to enforce global connectivity of the realistic multi-robot communication graph and minimally deviate from their task-related motions. We provide theoretical proofs to justify the properties of our algorithm and demonstrate its effectiveness through simulations with up to 20 robots.
comment: 8 pages, accepted to IROS 2024
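As a rough illustration of how a Control Barrier Function can minimally modify a task velocity to keep two robots connected, here is a minimal sketch. It makes strong simplifying assumptions that are not from the paper: a single-integrator robot, a stationary neighbor, and a fixed distance threshold `max_range` standing in for communication quality (the paper instead learns signal strength online with Gaussian Processes). The function name and parameters are hypothetical.

```python
def cbf_connectivity_filter(p_i, p_j, u_nom, max_range, alpha=1.0):
    """Return the velocity closest to u_nom that keeps robot i near robot j.

    Barrier: h = max_range^2 - ||p_i - p_j||^2 (h >= 0 means connected).
    For a single integrator with robot j held fixed, dh/dt = -2 d . u with
    d = p_i - p_j, so the CBF condition dh/dt + alpha * h >= 0 becomes the
    half-space constraint a . u <= b with a = 2 d and b = alpha * h.
    """
    d = [pi - pj for pi, pj in zip(p_i, p_j)]
    h = max_range ** 2 - sum(x * x for x in d)
    a = [2.0 * x for x in d]
    b = alpha * h
    a_dot_u = sum(x * y for x, y in zip(a, u_nom))
    a_sq = sum(x * x for x in a)
    # Nominal command already satisfies the constraint (or robots coincide).
    if a_dot_u <= b or a_sq == 0.0:
        return list(u_nom)
    # Closed-form projection of u_nom onto the half-space a . u <= b.
    scale = (a_dot_u - b) / a_sq
    return [u - scale * x for u, x in zip(u_nom, a)]


if __name__ == "__main__":
    # Robot i is 4 m from robot j, range limit 5 m, and wants to move away.
    print(cbf_connectivity_filter((4.0, 0.0), (0.0, 0.0), (3.0, 0.0), 5.0))
```

With a single affine constraint the quadratic program has this closed-form solution; a team-level formulation with many pairwise constraints would generally need a QP solver.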
Hybrid Gripper with Passive Pneumatic Soft Joints for Grasping Deformable Thin Objects
Grasping a variety of objects remains a key challenge in the development of versatile robotic systems. The human hand is remarkably dexterous, capable of grasping and manipulating objects with diverse shapes, mechanical properties, and textures. Inspired by how humans use two fingers to pick up thin and large objects such as fabric or sheets of paper, we aim to develop a gripper optimized for grasping such deformable objects. Observing how the soft and flexible fingertip joints of the hand approach and grasp thin materials, a hybrid gripper design that incorporates both soft and rigid components was proposed. The gripper utilizes a soft pneumatic ring wrapped around a rigid revolute joint to create a flexible two-fingered gripper. Experiments were conducted to characterize and evaluate the gripper performance in handling sheets of paper and other objects. Compared to rigid grippers, the proposed design improves grasping efficiency and reduces the gripping distance by up to eightfold.
Viscoelasticity Estimation of Sports Prosthesis by Energy-minimizing Inverse Kinematics and Its Validation by Forward Dynamics
In this study, we present a method for estimating the viscoelasticity of a leaf-spring sports prosthesis using advanced energy minimizing inverse kinematics based on the Piece-wise Constant Strain (PCS) model to reconstruct the three-dimensional dynamic behavior. Dynamic motion analysis of the athlete and prosthesis is important to clarify the effect of prosthesis characteristics on foot function. However, three-dimensional deformation calculations of the prosthesis and viscoelasticity have rarely been investigated. In this letter, we apply the PCS model to a prosthesis deformation, which can calculate flexible deformation with low computational cost and handle kinematics and dynamics. In addition, we propose an inverse kinematics calculation method that is consistent with the material properties of the prosthesis by considering the minimization of elastic energy. Furthermore, we propose a method to estimate the viscoelasticity by solving a quadratic programming based on the measured motion capture data. The calculated strains are more reasonable than the results obtained by conventional inverse kinematics calculation. From the result of the viscoelasticity estimation, we simulate the prosthetic motion by forward dynamics calculation and confirm that this result corresponds to the measured motion. These results indicate that our approach adequately models the dynamic phenomena, including the viscoelasticity of the prosthesis.
Learning the Generalizable Manipulation Skills on Soft-body Tasks via Guided Self-attention Behavior Cloning Policy
Embodied AI represents a paradigm in AI research where artificial agents are situated within and interact with physical or virtual environments. Despite the recent progress in Embodied AI, it is still very challenging to learn the generalizable manipulation skills that can handle large deformation and topological changes on soft-body objects, such as clay, water, and soil. In this work, we propose an effective policy, namely GP2E behavior cloning policy, which can guide the agent to learn the generalizable manipulation skills from soft-body tasks, including pouring, filling, hanging, excavating, pinching, and writing. Concretely, we build our policy from three insights: (1) Extracting intricate semantic features from point cloud data and seamlessly integrating them into the robot's end-effector frame; (2) Capturing long-distance interactions in long-horizon tasks through the incorporation of our guided self-attention module; (3) Mitigating overfitting concerns and facilitating model convergence to higher accuracy levels via the introduction of our two-stage fine-tuning strategy. Through extensive experiments, we demonstrate the effectiveness of our approach by achieving the 1st prize in the soft-body track of the ManiSkill2 Challenge at the CVPR 2023 4th Embodied AI workshop. Our findings highlight the potential of our method to improve the generalization abilities of Embodied AI models and pave the way for their practical applications in real-world scenarios.
Learning to Race in Extreme Turning Scene with Active Exploration and Gaussian Process Regression-based MPC
Extreme cornering in racing often induces large side-slip angles, presenting a formidable challenge in vehicle control. To tackle this issue, this paper introduces an Active Exploration with Double GPR (AEDGPR) system. The system begins by planning a minimum-time trajectory with a Gaussian Process Regression (GPR) compensated model. The planning results show that in the cornering section, the yaw angular velocity and side-slip angle are in opposite directions, indicating that the vehicle is drifting. In response, we develop a drift controller based on Model Predictive Control (MPC) and incorporate Gaussian Process Regression to correct discrepancies in the vehicle dynamics model. Moreover, the covariance from the GPR is employed to actively explore various cornering states, aiming to minimize trajectory tracking errors. The proposed algorithm is validated through simulations on the Simulink-Carsim platform and experiments using a 1/10 scale RC vehicle.
Design, Localization, Perception, and Control for GPS-Denied Autonomous Aerial Grasping and Harvesting
In this paper, we present a comprehensive UAV system design to perform the highly complex task of off-centered aerial grasping. This task has several interdisciplinary research challenges which need to be addressed at once. The main design challenges are GPS-denied functionality, solely onboard computing, and avoiding off-the-shelf costly positioning systems. In terms of algorithms, visual perception, localization, control, and grasping are the leading research problems. Hence, in this paper, we make interdisciplinary contributions: (i) a detailed description of the fundamental challenges in indoor aerial grasping, (ii) a novel lightweight gripper design, (iii) a complete aerial platform design and in-lab fabrication, and (iv) localization, perception, control, grasping systems, and an end-to-end flight autonomy state-machine. Finally, we demonstrate the resulting aerial grasping system Drone-Bee achieving a high grasping rate for a highly challenging agricultural task of apple-like fruit harvesting, indoors in a vertical farming setting (Fig. 1). To our knowledge, such a system has not been previously discussed in the literature, and with its capabilities, this system pushes aerial manipulation towards the 4th generation.
Thrust Microstepping via Acceleration Feedback in Quadrotor Control for Aerial Grasping of Dynamic Payload
In this work, we propose an end-to-end Thrust Microstepping and Decoupled Control (TMDC) of quadrotors. TMDC focuses on precise, off-centered aerial grasping of dynamic payloads, which are attached rigidly to the UAV body via a gripper, in contrast to swinging payloads. Dynamic payload grasping quickly changes the UAV's mass, inertia, etc., causing instability during an in-air grasping operation. We identify that the thrust controller plays a crucial role in handling unknown payloads. Hence, we focus on thrust control without involving system parameters such as mass. TMDC is based on our novel Thrust Microstepping via Acceleration Feedback (TMAF) thrust controller and Decoupled Motion Control (DMC). TMAF precisely estimates the desired thrust even at smaller loop rates while DMC decouples the horizontal and vertical motion to counteract disturbances in the case of dynamic payloads. We prove the controller's efficacy via exhaustive experiments in practically interesting and adverse real-world cases, such as fully onboard state estimation without any positioning sensor, narrow and indoor flying workspaces with intense wind turbulence, heavy payloads, non-uniform loop rates, etc. TMDC outperforms a recent direct acceleration feedback thrust controller (DA) and geometric tracking control (GT) in flying stably for aerial grasping and achieves RMSE below 0.04 m in contrast to 0.15 m of DA and 0.16 m of GT.
Demonstration Based Explainable AI for Learning from Demonstration Methods
Learning from Demonstration (LfD) is a powerful type of machine learning that can allow novices to teach and program robots to complete various tasks. However, the learning process for these systems may still be difficult for novices to interpret and understand, making effective teaching challenging. Explainable artificial intelligence (XAI) aims to address this challenge by explaining a system to the user. In this work, we investigate XAI within LfD by implementing an adaptive explanatory feedback system on an inverse reinforcement learning (IRL) algorithm. The feedback is implemented by demonstrating selected learnt trajectories to users. The system adapts to user teaching by categorizing and then selectively sampling trajectories shown to a user, to show a representative sample of both successful and unsuccessful trajectories. The system was evaluated through a user study with 26 participants teaching a robot a navigation task. The results of the user study demonstrated that the proposed explanatory feedback system can improve robot performance, teaching efficiency and user understanding of the robot.
comment: 8 Pages, 9 Figures, 2 Tables, Submitted to RA-L
Whole-Body Dynamic Throwing with Legged Manipulators
Most robotic behaviours focus on either manipulation or locomotion, where tasks that require the integration of both, such as full-body throwing, remain under-explored. Throwing with a robot involves complex coordination between object manipulation and legged locomotion, which is crucial for advanced real-world interactions. This work investigates the challenge of full-body throwing in robotic systems and highlights the advantages of utilising the robot's entire body. We propose a deep reinforcement learning (RL) approach that leverages the robot's body to enhance throwing performance through a strategically designed curriculum to avoid local optima and sparse but informative reward functions to improve policy flexibility. The robot's body learns to generate additional momentum and fine-tune the projectile release velocity. Our full-body method achieves on average 47% greater throwing distance and 34% greater throwing accuracy than the arm alone, across two robot morphologies - an armed quadruped and a humanoid. We also extend our method to optimise robot stability during throws. The learned policy effectively generalises throwing to targets at any 3D point in space within a specified range, which has not previously been achieved and does so with human-level throwing accuracy. We successfully transferred this approach from simulation to a real robot using sim2real techniques, demonstrating its practical viability.
Abstract Hardware Grounding towards the Automated Design of Automation Systems
Crafting automation systems tailored for specific domains requires aligning the space of human experts' semantics with the space of robot executable actions, and scheduling the required resources and system layout accordingly. Regrettably, there are three major gaps: fine-grained domain-specific knowledge injection, heterogeneity between human knowledge and robot instructions, and diversity of users' preferences. These make automation system design a case-by-case and labour-intensive effort, thus hindering the democratization of automation. We refer to this challenging alignment as the abstract hardware grounding problem, where we first regard the procedural operations in humans' semantics space as the abstraction of hardware requirements, then we ground such abstractions to instantiated hardware devices, subject to constraints and preferences in the real world -- optimizing this problem is essentially standardizing and automating the design of automation systems. On this basis, we develop an automated design framework in a hybrid data-driven and principle-derived fashion. Results on designing self-driving laboratories for enhancing experiment-driven scientific discovery suggest our framework's potential to produce compact systems that fully satisfy domain-specific and user-customized requirements with no redundancy.
comment: In International Conference on Intelligent Robotics and Applications (ICIRA'24)
Towards Robust Spacecraft Trajectory Optimization via Transformers
Future multi-spacecraft missions require robust autonomous trajectory optimization capabilities to ensure safe and efficient rendezvous operations. This capability hinges on solving non-convex optimal control problems in real time, although traditional iterative methods such as sequential convex programming impose significant computational challenges. To mitigate this burden, the Autonomous Rendezvous Transformer (ART) introduced a generative model trained to provide near-optimal initial guesses. This approach provides convergence to better local optima (e.g., fuel optimality), improves feasibility rates, and results in faster convergence speed of optimization algorithms through warm-starting. This work extends the capabilities of ART to address robust chance-constrained optimal control problems. Specifically, ART is applied to challenging rendezvous scenarios in Low Earth Orbit (LEO), ensuring fault-tolerant behavior under uncertainty. Through extensive experimentation, the proposed warm-starting strategy is shown to consistently produce high-quality reference trajectories, achieving up to 30% cost improvement and 50% reduction in infeasible cases compared to conventional methods, demonstrating robust performance across multiple state representations. Additionally, a post hoc evaluation framework is proposed to assess the quality of generated trajectories and mitigate runtime failures, marking an initial step toward the reliable deployment of AI-driven solutions in safety-critical autonomous systems such as spacecraft.
comment: Submitted to the IEEE Aerospace Conference 2025. 13 pages, 10 figures
Gen-Drive: Enhancing Diffusion Generative Driving Policies with Reward Modeling and Reinforcement Learning Fine-tuning
Autonomous driving necessitates the ability to reason about future interactions between traffic agents and to make informed evaluations for planning. This paper introduces the Gen-Drive framework, which shifts from the traditional prediction and deterministic planning framework to a generation-then-evaluation planning paradigm. The framework employs a behavior diffusion model as a scene generator to produce diverse possible future scenarios, thereby enhancing the capability for joint interaction reasoning. To facilitate decision-making, we propose a scene evaluator (reward) model, trained with pairwise preference data collected through VLM assistance, thereby reducing human workload and enhancing scalability. Furthermore, we utilize an RL fine-tuning framework to improve the generation quality of the diffusion model, rendering it more effective for planning tasks. We conduct training and closed-loop planning tests on the nuPlan dataset, and the results demonstrate that employing such a generation-then-evaluation strategy outperforms other learning-based approaches. Additionally, the fine-tuned generative driving policy shows significant enhancements in planning performance. We further demonstrate that utilizing our learned reward model for evaluation or RL fine-tuning leads to better planning performance compared to relying on human-designed rewards. Project website: https://mczhi.github.io/GenDrive.
Submodular Optimization for Keyframe Selection & Usage in SLAM
Keyframes are LiDAR scans saved for future reference in Simultaneous Localization And Mapping (SLAM), but despite their central importance most algorithms leave choices of which scans to save and how to use them to wasteful heuristics. This work proposes two novel keyframe selection strategies for localization and map summarization, as well as a novel approach to submap generation which selects keyframes that best constrain localization. Our results show that online keyframe selection and submap generation reduce the number of saved keyframes and improve per scan computation time without compromising localization performance. We also present a map summarization feature for quickly capturing environments under strict map size constraints.
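To make the idea of submodular keyframe selection concrete, here is a minimal sketch of the standard greedy algorithm for maximizing a monotone submodular function under a cardinality budget. The objective used here, the number of map cells covered by the chosen keyframes, is an assumed stand-in; the abstract does not state the paper's actual objective. The function name, keyframe ids, and cell ids are all hypothetical.

```python
def greedy_keyframe_selection(candidates, budget):
    """Greedily pick up to `budget` keyframes that maximize covered cells.

    candidates: dict mapping a keyframe id to the set of map cells it covers.
    Set coverage is monotone submodular, so the greedy choice enjoys the
    classic (1 - 1/e) approximation guarantee under a cardinality constraint.
    """
    selected = []
    covered = set()
    for _ in range(budget):
        best_id, best_gain = None, 0
        for kf_id, cells in candidates.items():
            if kf_id in selected:
                continue
            # Marginal gain: cells this keyframe adds beyond current coverage.
            gain = len(cells - covered)
            if gain > best_gain:
                best_id, best_gain = kf_id, gain
        if best_id is None:
            break  # No remaining keyframe adds new coverage.
        selected.append(best_id)
        covered |= candidates[best_id]
    return selected, covered


if __name__ == "__main__":
    scans = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 2}}
    print(greedy_keyframe_selection(scans, budget=2))
```

Stopping early when no candidate adds coverage is what lets a selection like this save fewer keyframes than the budget allows.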
Control-oriented Clustering of Visual Latent Representation
We initiate a study of the geometry of the visual representation space -- the information channel from the vision encoder to the action decoder -- in an image-based control pipeline learned from behavior cloning. Inspired by the phenomenon of neural collapse (NC) in image classification, we investigate whether a similar law of clustering emerges in the visual representation space. Since image-based control is a regression task without explicitly defined classes, the central piece of the puzzle lies in determining according to what implicit classes the visual features cluster, if such a law exists. Focusing on image-based planar pushing, we posit the most important role of the visual representation in a control task is to convey a goal to the action decoder. We then classify training samples of expert demonstrations into eight "control-oriented" classes based on (a) the relative pose between the object and the target in the input or (b) the relative pose of the object induced by expert actions in the output, where one class corresponds to one relative pose orthant (REPO). Across four different instantiations of architecture, we report the prevalent emergence of control-oriented clustering in the visual representation space according to the eight REPOs. Beyond empirical observation, we show such a law of clustering can be leveraged as an algorithmic tool to improve test-time performance when training a policy with limited expert demonstrations. Particularly, we pretrain the vision encoder using NC as a regularization to encourage control-oriented clustering of the visual features. Surprisingly, such an NC-pretrained vision encoder, when finetuned end-to-end with the action decoder, boosts the test-time performance by 10% to 35% in the low-data regime. Real-world vision-based planar pushing experiments confirmed the surprising advantage of control-oriented visual representation pretraining.
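The eight control-oriented classes can be pictured as the eight sign patterns of a three-component planar relative pose. The sketch below assumes the pose is parameterized as (dx, dy, dtheta) and that zero counts as non-negative; neither detail is stated in the abstract, and the function name and label ordering are hypothetical.

```python
def repo_class(dx, dy, dtheta):
    """Map a planar relative pose to one of eight orthant labels (0 to 7).

    Each component contributes one bit according to its sign, so every
    sign pattern of (dx, dy, dtheta) receives a distinct label.
    Assumption: a zero component is treated as non-negative.
    """
    return 4 * int(dx >= 0) + 2 * int(dy >= 0) + int(dtheta >= 0)


if __name__ == "__main__":
    # Hypothetical relative pose between object and target.
    print(repo_class(0.5, -0.2, 0.1))
```

Labels produced this way could serve as the implicit classes against which clustering of the visual features is measured.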
Goal-Conditioned Terminal Value Estimation for Real-time and Multi-task Model Predictive Control
While MPC enables nonlinear feedback control by solving an optimal control problem at each timestep, the computational burden tends to be significantly large, making it difficult to optimize a policy within the control period. To address this issue, one possible approach is to utilize terminal value learning to reduce computational costs. However, the learned value cannot be used for other tasks in situations where the task dynamically changes in the original MPC setup. In this study, we develop an MPC framework with goal-conditioned terminal value learning to achieve multitask policy optimization while reducing computational time. Furthermore, by using a hierarchical control structure that allows the upper-level trajectory planner to output appropriate goal-conditioned trajectories, we demonstrate that a robot model is able to generate diverse motions. We evaluate the proposed method on a bipedal inverted pendulum robot model and confirm that combining goal-conditioned terminal value learning with an upper-level trajectory planner enables real-time control; thus, the robot successfully tracks a target trajectory on sloped terrain.
comment: 16 pages, 9 figures
Multimodal Active Measurement for Human Mesh Recovery in Close Proximity
For physical human-robot interactions (pHRI), a robot needs to estimate the accurate body pose of a target person. However, in these pHRI scenarios, the robot cannot fully observe the target person's body with equipped cameras because the target person must be close to the robot for physical interaction. This close distance leads to severe truncation and occlusions and thus results in poor accuracy of human pose estimation. For better accuracy in this challenging environment, we propose an active measurement and sensor fusion framework of the equipped cameras with touch and ranging sensors such as 2D LiDAR. Touch and ranging sensor measurements are sparse but reliable and informative cues for localizing human body parts. In our active measurement process, camera viewpoints and sensor placements are dynamically optimized to measure body parts with higher estimation uncertainty, which is closely related to truncation or occlusion. In our sensor fusion process, assuming that the measurements of touch and ranging sensors are more reliable than the camera-based estimations, we fuse the sensor measurements to the camera-based estimated pose by aligning the estimated pose towards the measured points. Our proposed method outperformed previous methods on the standard occlusion benchmark with simulated active measurement. Furthermore, our method reliably estimated human poses using a real robot, even with practical constraints such as occlusion by blankets.
comment: Accepted at Robotics and Automation Letters (RA-L) on Sep 2024
Feudal Networks for Visual Navigation
Visual navigation follows the intuition that humans can navigate without detailed maps. A common approach is interactive exploration while building a topological graph with images at nodes that can be used for planning. Recent variations learn from passive videos and can navigate using complex social and semantic cues. However, a significant number of training videos are needed, large graphs are utilized, and scenes are not unseen since odometry is utilized. We introduce a new approach to visual navigation using feudal learning, which employs a hierarchical structure consisting of a worker agent, a mid-level manager, and a high-level manager. Key to the feudal learning paradigm, agents at each level see a different aspect of the task and operate at different spatial and temporal scales. Two unique modules are developed in this framework. For the high-level manager, we learn a memory proxy map in a self supervised manner to record prior observations in a learned latent space and avoid the use of graphs and odometry. For the mid-level manager, we develop a waypoint network that outputs intermediate subgoals imitating human waypoint selection during local navigation. This waypoint network is pre-trained using a new, small set of teleoperation videos that we make publicly available, with training environments different from testing environments. The resulting feudal navigation network achieves near SOTA performance, while providing a novel no-RL, no-graph, no-odometry, no-metric map approach to the image goal navigation task.
Generative Image as Action Models
Image-generation diffusion models have been fine-tuned to unlock new capabilities such as image-editing and novel view synthesis. Can we similarly unlock image-generation models for visuomotor control? We present GENIMA, a behavior-cloning agent that fine-tunes Stable Diffusion to 'draw joint-actions' as targets on RGB images. These images are fed into a controller that maps the visual targets into a sequence of joint-positions. We study GENIMA on 25 RLBench and 9 real-world manipulation tasks. We find that, by lifting actions into image-space, internet pre-trained diffusion models can generate policies that outperform state-of-the-art visuomotor approaches, especially in robustness to scene perturbations and generalizing to novel objects. Our method is also competitive with 3D agents, despite lacking priors such as depth, keypoints, or motion-planners.
comment: CoRL 2024. Website, code, checkpoints: https://genima-robot.github.io/
Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation
There is no limit to how much a robot might explore and learn, but all of that knowledge needs to be searchable and actionable. Within language research, retrieval augmented generation (RAG) has become the workhorse of large-scale non-parametric knowledge; however, existing techniques do not directly transfer to the embodied domain, where data is multimodal and highly correlated, and perception requires abstraction. To address these challenges, we introduce Embodied-RAG, a framework that enhances the foundational model of an embodied agent with a non-parametric memory system capable of autonomously constructing hierarchical knowledge for both navigation and language generation. Embodied-RAG handles a full range of spatial and semantic resolutions across diverse environments and query types, whether for a specific object or a holistic description of ambiance. At its core, Embodied-RAG's memory is structured as a semantic forest, storing language descriptions at varying levels of detail. This hierarchical organization allows the system to efficiently generate context-sensitive outputs across different robotic platforms. We demonstrate that Embodied-RAG effectively bridges RAG to the robotics domain, successfully handling over 200 explanation and navigation queries across 19 environments, highlighting its promise as a general-purpose non-parametric system for embodied agents.
comment: Web: https://quanting-xie.github.io/Embodied-RAG-web/
Design and Experimental Study of Vacuum Suction Grabbing Technology to Grasp Fabric Piece
The primary objective of this study was to design the grabbing technique, and to determine the vacuum suction gripper and its design parameters, for the pocket welting operation in apparel manufacturing. It presents the application of vacuum suction in grabbing technology, a technique that has revolutionized the handling and grasping of various fabric materials across the garment industry. Vacuum suction, being non-intrusive and non-invasive, offers several advantages compared to traditional grabbing methods. It is particularly useful in scenarios where soft woven fabric and air-impermeable fabric items need to be handled with utmost care. The paper delves into the working principles of vacuum suction, its various components, and the underlying physics involved. Furthermore, it explores applications of vacuum suction for automation in the garment industry. The paper also highlights the challenges and limitations of vacuum suction technology and suggests potential areas for further research and development.
comment: 9 Pages, 3 figures, 6 diagrams, 1 table
Integrating One-Shot View Planning with a Single Next-Best View via Long-Tail Multiview Sampling
Existing view planning systems either adopt an iterative paradigm using next-best views (NBV) or a one-shot pipeline relying on the set-covering view-planning (SCVP) network. However, neither of these methods can concurrently guarantee both high-quality and high-efficiency reconstruction of 3D unknown objects. To tackle this challenge, we introduce a crucial hypothesis: with the availability of more information about the unknown object, the prediction quality of the SCVP network improves. There are two ways to provide extra information: (1) leveraging perception data obtained from NBVs, and (2) training on an expanded dataset of multiview inputs. In this work, we introduce a novel combined pipeline that incorporates a single NBV before activating the proposed multiview-activated (MA-)SCVP network. The MA-SCVP is trained on a multiview dataset generated by our long-tail sampling method, which addresses the issue of unbalanced multiview inputs and enhances the network performance. Extensive simulated experiments substantiate that our system demonstrates a significant surface coverage increase and a substantial 45% reduction in movement cost compared to state-of-the-art systems. Real-world experiments justify the capability of our system for generalization and deployment.
comment: Conditionally accepted by IEEE Transactions on Robotics, revised and resubmitted
Evaluating UAV Path Planning Algorithms for Realistic Maritime Search and Rescue Missions
Unmanned Aerial Vehicles (UAVs) are emerging as important tools in search and rescue (SAR) missions at sea, enabling swift and efficient deployment for locating individuals or vessels in distress. The successful execution of these critical missions heavily relies on effective path planning algorithms that navigate UAVs through complex maritime environments while considering dynamic factors such as water currents and wind flow. Furthermore, they need to account for the uncertainty in search target locations. However, existing path planning methods often fail to address the inherent uncertainty associated with the precise location of search targets and the uncertainty of oceanic forces. In this paper, we present a framework for developing and investigating trajectory planning algorithms for maritime SAR scenarios employing UAVs. We use it to compare multiple planning strategies, some of them used in practical applications by the United States Coast Guard. Furthermore, we propose a novel planner that aims at bridging the gap between computation-heavy, precise algorithms and lightweight strategies applicable to real-world scenarios.
D(R, O) Grasp: A Unified Representation of Robot and Object Interaction for Cross-Embodiment Dexterous Grasping
Dexterous grasping is a fundamental yet challenging skill in robotic manipulation, requiring precise interaction between robotic hands and objects. In this paper, we present D(R,O) Grasp, a novel framework that models the interaction between the robotic hand in its grasping pose and the object, enabling broad generalization across various robot hands and object geometries. Our model takes the robot hand's description and object point cloud as inputs and efficiently predicts kinematically valid and stable grasps, demonstrating strong adaptability to diverse robot embodiments and object geometries. Extensive experiments conducted in both simulated and real-world environments validate the effectiveness of our approach, with significant improvements in success rate, grasp diversity, and inference speed across multiple robotic hands. Our method achieves an average success rate of 87.53% in simulation in less than one second, tested across three different dexterous robotic hands. In real-world experiments using the LeapHand, the method also demonstrates an average success rate of 89%. D(R,O) Grasp provides a robust solution for dexterous grasping in complex and varied environments. The code, appendix, and videos are available on our project website at https://nus-lins-lab.github.io/drograspweb/.
LBR-Stack: ROS 2 and Python Integration of KUKA FRI for Med and IIWA Robots
The LBR-Stack is a collection of packages that simplify the usage and extend the capabilities of KUKA's Fast Robot Interface (FRI). It is designed for mission critical hard real-time applications. Supported are the KUKA LBR Med 7/14 and KUKA LBR IIWA 7/14 robots in the Gazebo simulation and for communication with real hardware.
comment: Under review at Journal of Open Source Software (JOSS)
First Place Solution to the ECCV 2024 BRAVO Challenge: Evaluating Robustness of Vision Foundation Models for Semantic Segmentation
In this report, we present the first place solution to the ECCV 2024 BRAVO Challenge, where a model is trained on Cityscapes and its robustness is evaluated on several out-of-distribution datasets. Our solution leverages the powerful representations learned by vision foundation models, by attaching a simple segmentation decoder to DINOv2 and fine-tuning the entire model. This approach outperforms more complex existing approaches, and achieves first place in the challenge. Our code is publicly available at https://github.com/tue-mps/benchmark-vfm-ss.
comment: v2 fixes ECE and FPR@95, among other small changes. arXiv admin note: substantial text overlap with arXiv:2409.15107
DenseMTL: Cross-task Attention Mechanism for Dense Multi-task Learning WACV
Multi-task learning has recently emerged as a promising solution for a comprehensive understanding of complex scenes. In addition to being memory-efficient, multi-task models, when appropriately designed, can facilitate the exchange of complementary signals across tasks. In this work, we jointly address 2D semantic segmentation and three geometry-related tasks: dense depth estimation, surface normal estimation, and edge estimation, demonstrating their benefits on both indoor and outdoor datasets. We propose a novel multi-task learning architecture that leverages pairwise cross-task exchange through correlation-guided attention and self-attention to enhance the overall representation learning for all tasks. We conduct extensive experiments across three multi-task setups, showing the advantages of our approach compared to competitive baselines in both synthetic and real-world benchmarks. Additionally, we extend our method to the novel multi-task unsupervised domain adaptation setting. Our code is available at https://github.com/cv-rits/DenseMTL
comment: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023
FoundationGrasp: Generalizable Task-Oriented Grasping with Foundation Models
Task-oriented grasping (TOG), which refers to synthesizing grasps on an object that are configurationally compatible with the downstream manipulation task, is the first milestone towards tool manipulation. Analogous to the activation of two brain regions responsible for semantic and geometric reasoning during cognitive processes, modeling the intricate relationship between objects, tasks, and grasps necessitates rich semantic and geometric prior knowledge about these elements. Existing methods typically restrict the prior knowledge to a closed-set scope, limiting their generalization to novel objects and tasks out of the training set. To address such a limitation, we propose FoundationGrasp, a foundation model-based TOG framework that leverages the open-ended knowledge from foundation models to learn generalizable TOG skills. Extensive experiments are conducted on the contributed Language and Vision Augmented TaskGrasp (LaViA-TaskGrasp) dataset, demonstrating the superiority of FoundationGrasp over existing methods when generalizing to novel object instances, object classes, and tasks out of the training set. Furthermore, the effectiveness of FoundationGrasp is validated in real-robot grasping and manipulation experiments on a 7-DoF robotic arm. Our code, data, appendix, and video are publicly available at https://sites.google.com/view/foundationgrasp.
comment: 18 pages, 13 figures
RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation
We introduce the novel task of interactive scene exploration, wherein robots autonomously explore environments and produce an action-conditioned scene graph (ACSG) that captures the structure of the underlying environment. The ACSG accounts for both low-level information (geometry and semantics) and high-level information (action-conditioned relationships between different entities) in the scene. To this end, we present the Robotic Exploration (RoboEXP) system, which incorporates the Large Multimodal Model (LMM) and an explicit memory design to enhance our system's capabilities. The robot reasons about what and how to explore an object, accumulating new information through the interaction process and incrementally constructing the ACSG. Leveraging the constructed ACSG, we illustrate the effectiveness and efficiency of our RoboEXP system in facilitating a wide range of real-world manipulation tasks involving rigid, articulated objects, nested objects, and deformable objects.
comment: Project Page: https://jianghanxiao.github.io/roboexp-web/
LANCAR: Leveraging Language for Context-Aware Robot Locomotion in Unstructured Environments
Navigating robots through unstructured terrains is challenging, primarily due to dynamic environmental changes. While humans adeptly navigate such terrains by using context from their observations, creating a similar context-aware navigation system for robots is difficult. The essence of the issue lies in the acquisition and interpretation of context information, a task complicated by the inherent ambiguity of human language. In this work, we introduce LANCAR, which addresses this issue by combining a context translator with reinforcement learning (RL) agents for context-aware locomotion. LANCAR allows robots to comprehend context information through Large Language Models (LLMs) sourced from human observers and convert this information into actionable context embeddings. These embeddings, combined with the robot's sensor data, provide a complete input for the RL agent's policy network. We provide an extensive evaluation of LANCAR under different levels of context ambiguity and compare with alternative methods. The experimental results showcase the superior generalizability and adaptability across different terrains. Notably, LANCAR shows at least a 7.4% increase in episodic reward over the best alternatives, highlighting its potential to enhance robotic navigation in unstructured environments. More details and experiment videos can be found at http://raaslab.org/projects/LLM_Context_Estimation/
Multiagent Systems
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning
Reinforcement learning (RL) has emerged as a pivotal technique for fine-tuning large language models (LLMs) on specific tasks. However, prevailing RL fine-tuning methods predominantly rely on PPO and its variants. Though these algorithms are effective in general RL settings, they often exhibit suboptimal performance and vulnerability to distribution collapse when applied to the fine-tuning of LLMs. In this paper, we propose CORY, extending the RL fine-tuning of LLMs to a sequential cooperative multi-agent reinforcement learning framework, to leverage the inherent coevolution and emergent capabilities of multi-agent systems. In CORY, the LLM to be fine-tuned is initially duplicated into two autonomous agents: a pioneer and an observer. The pioneer generates responses based on queries, while the observer generates responses using both the queries and the pioneer's responses. The two agents are trained together. During training, the agents exchange roles periodically, fostering cooperation and coevolution between them. Experiments evaluate CORY's performance by fine-tuning GPT-2 and Llama-2 under subjective and objective reward functions on the IMDB Review and GSM8K datasets, respectively. Results show that CORY outperforms PPO in terms of policy optimality, resistance to distribution collapse, and training robustness, thereby underscoring its potential as a superior methodology for refining LLMs in real-world applications.
comment: 28 pages, 26 images
Concurrent-Learning Based Relative Localization in Shape Formation of Robot Swarms
In this paper, we address the shape formation problem for massive robot swarms in environments where external localization systems are unavailable. Achieving this task effectively with solely onboard measurements is still scarcely explored and faces some practical challenges. To solve this challenging problem, we propose the following novel results. Firstly, to estimate the relative positions among neighboring robots, a concurrent-learning based estimator is proposed. It relaxes the persistent excitation condition required by classical estimators such as the least-squares estimator. Secondly, we introduce a finite-time agreement protocol to determine the shape location. This is achieved by estimating the relative position between each robot and a randomly assigned seed robot. The initial position of the seed robot marks the shape location. Thirdly, based on the theoretical results of the relative localization, a novel behavior-based control strategy is devised. This strategy not only enables adaptive shape formation of large groups of robots but also enhances the observability of inter-robot relative localization. Numerical simulation results are provided to verify the performance of our proposed strategy compared to state-of-the-art methods. Additionally, outdoor experiments on real robots further demonstrate the practical effectiveness and robustness of our methods.
Breaking the Curse of Multiagency in Robust Multi-Agent Reinforcement Learning
Standard multi-agent reinforcement learning (MARL) algorithms are vulnerable to sim-to-real gaps. To address this, distributionally robust Markov games (RMGs) have been proposed to enhance robustness in MARL by optimizing the worst-case performance when game dynamics shift within a prescribed uncertainty set. Solving RMGs remains under-explored, from problem formulation to the development of sample-efficient algorithms. A notorious yet open challenge is whether RMGs can escape the curse of multiagency, where the sample complexity scales exponentially with the number of agents. In this work, we propose a natural class of RMGs where the uncertainty set of each agent is shaped by both the environment and other agents' strategies in a best-response manner. We first establish the well-posedness of these RMGs by proving the existence of game-theoretic solutions such as robust Nash equilibria and coarse correlated equilibria (CCE). Assuming access to a generative model, we then introduce a sample-efficient algorithm for learning the CCE whose sample complexity scales polynomially with all relevant parameters. To the best of our knowledge, this is the first algorithm to break the curse of multiagency for RMGs.
Circular Distribution of Agents using Convex Layers
This article considers the problem of conflict-free distribution of agents on a circular periphery encompassing all agents. The two key elements of the proposed policy include the construction of a set of convex layers (nested convex polygons) using the initial positions of the agents, and a novel search space region for each of the agents. The search space for an agent on a convex layer is defined as the region enclosed between the lines passing through the agent's position and normal to its supporting edges. Guaranteeing collision-free paths, a goal assignment policy designates a unique goal position within the search space of an agent at the initial time itself, requiring no further computation thereafter. In contrast to the existing literature, this work presents a one-shot, collision-free solution to the circular distribution problem by utilizing only the initial positions of the agents. Illustrative examples demonstrate the effectiveness of the proposed policy.
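The convex layers (nested convex polygons) mentioned above can be built by repeatedly peeling the convex hull off the remaining points. The sketch below shows only that layer construction, using Andrew's monotone chain for each hull; it does not cover the paper's search-space regions or goal assignment, and the function names are chosen here for illustration. It assumes distinct 2D agent positions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Hull vertices in counter-clockwise order (collinear points dropped)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convex_layers(points: List[Point]) -> List[List[Point]]:
    """Peel hulls from the outside in until no points remain."""
    remaining = set(points)
    layers: List[List[Point]] = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers


agents = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1), (3, 1), (3, 3), (1, 3), (2, 2)]
layers = convex_layers(agents)
```

For the nine example positions, the outer square forms the first layer, the inner square the second, and the center point the third.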
Decentralized Stochastic Control in Standard Borel Spaces: Centralized MDP Reductions, Near Optimality of Finite Window Local Information, and Q-Learning
Decentralized stochastic control problems are intrinsically difficult to study because of the inapplicability of standard tools from centralized control such as dynamic programming and the resulting computational complexity. In this paper, we address some of these challenges for decentralized stochastic control with Borel spaces under three different but tightly related information structures under a unified theme: the one-step delayed information sharing pattern, the K-step periodic information sharing pattern, and the completely decentralized information structure where no sharing of information occurs. We will show that the one-step delayed and K-step periodic problems can be reduced to a centralized MDP, generalizing prior results which considered finite, linear, or static models, by addressing several measurability questions. The separated nature of policies under both information structures is then established. We then provide sufficient conditions for the transition kernels of both centralized reductions to be weak-Feller, which facilitates rigorous approximation and learning theoretic results. We will then show that for the completely decentralized control problem finite memory local policies are near optimal under a joint conditional mixing condition. This is achieved by obtaining a bound for finite memory policies which goes to zero as memory size increases. We will also provide a performance bound for the K-periodic problem, which results from replacing the full common information by a finite sliding window of information. The latter will depend on the condition of predictor stability in expected total variation, which we will establish. We finally show that under the periodic information sharing pattern, a quantized Q-learning algorithm converges asymptotically towards a near optimal solution. Each of the above, to our knowledge, is a new contribution to the literature.
comment: A summary of the results is to be presented in CDC'24
Systems and Control (CS)
Embedded State Estimation for Optimization of Cislunar Space Domain Awareness Constellation Design
The traffic in cislunar space is expected to increase over the coming years, leading to a higher likelihood of conjunction events among active satellites, orbital debris, and non-cooperative satellites. This increase necessitates enhanced space domain awareness (SDA) capabilities that include state estimation for targets of interest. Both Earth surface-based and space-based observation platforms in geosynchronous orbit or below face challenges such as range, exclusion, and occlusion that hinder observation. Motivated by the need to place space-based observers in the cislunar space regime to overcome these challenges, this paper proposes a cislunar SDA constellation design and analysis framework that integrates state estimation into an optimization problem for determining the placement of observers for optimal state estimation performance on a set of targets. The proposed multi-observer placement optimization problem samples from a range of possible target orbits. Upon convergence, the optimized constellation is validated against a broader set of targets to assess its effectiveness. Two comparative analyses are presented to evaluate the effects of changes in the sensor tasking procedure and sensor fidelity on the optimized constellation, comparing these to a single observer baseline case. The results demonstrate that the optimized constellations can provide accurate state estimation for various orbit families.
comment: 36 pages, 14 figures, Journal of Spacecraft and Rockets (accepted)
Work-in-Progress: Traded Control Transfer for Managing Real-Time Sensor Uncertainties in Autonomous Vehicle
At Levels 2 and 3 of autonomous driving defined by the Society of Automotive Engineers, drivers must take on certain driving responsibilities, and automated driving must sometimes yield to human control. This situation can occur in real time due to uncertainties in sensor measurements caused by environmental factors like fog or smoke. To address this challenge, we propose a method to manage real-time sensor uncertainties in autonomous vehicles by monitoring sensor conflicts and dynamically adjusting control authority to maintain safe operation. To achieve this, we introduce a novel metric called the Degree of Conflicts (DoC), which quantifies the conflict between real-time sensor data by measuring the differences between data from multiple sensors. Our approach aims to demonstrate the importance of selecting an appropriate DoC threshold for transferring control between the automation agent and the human driver. The results show that choosing the correct DoC threshold can enhance safety by promptly handing over the driving control from the automation system to the human driver in challenging conditions.
comment: Peer-reviewed and accepted by the 2024 IEEE Real-Time Systems Symposium (RTSS)
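The abstract above does not give the formula for the Degree of Conflicts, so the following is only a stand-in to make the threshold idea concrete. It assumes several sensors report the same scalar quantity (for example, distance to a lead vehicle) and uses the mean absolute pairwise difference as a hypothetical conflict score; the function names and the threshold value are illustrative, not taken from the paper.

```python
from itertools import combinations
from typing import Sequence


def degree_of_conflict(readings: Sequence[float]) -> float:
    """Hypothetical conflict score: mean absolute pairwise difference."""
    pairs = list(combinations(readings, 2))
    if not pairs:
        return 0.0  # a single sensor cannot conflict with itself
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def control_authority(readings: Sequence[float], threshold: float) -> str:
    """Hand control to the human once the score exceeds the threshold."""
    return "human" if degree_of_conflict(readings) > threshold else "automation"


clear_day = control_authority([10.0, 10.5, 9.5], threshold=1.0)
heavy_fog = control_authority([10.0, 25.0], threshold=1.0)
```

In this toy setup, a lower threshold hands over control earlier at the cost of more frequent handovers, which is the trade-off the abstract points to when it stresses threshold selection.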
Meta-Learning Augmented MPC for Disturbance-Aware Motion Planning and Control of Quadrotors
A major challenge in autonomous flights is unknown disturbances, which can jeopardize safety and lead to collisions, especially in obstacle-rich environments. This paper presents a disturbance-aware motion planning and control framework designed for autonomous aerial flights. The framework is composed of two key components: a disturbance-aware motion planner and a tracking controller. The disturbance-aware motion planner consists of a predictive control scheme and a learned model of disturbances that is adapted online. The tracking controller is designed using contraction control methods to provide safety bounds on the quadrotor behaviour in the vicinity of the obstacles with respect to the disturbance-aware motion plan. Finally, the algorithm is tested in simulation scenarios with a quadrotor facing strong crosswind and ground-induced disturbances.
An Algorithm for Distributed Computation of Reachable Sets for Multi-Agent Systems
In this paper, we consider the problem of distributed reachable set computation for multi-agent systems (MASs) interacting over an undirected, stationary graph. A full state-feedback control input for such MASs depends not only on the current agent's state, but also on the states of its neighbors. However, in most MAS applications, the dynamics are obscured by individual agents. This makes reachable set computation, in a fully distributed manner, a challenging problem. We utilize the ideas of polytopic reachable set approximation and generalize it to a MAS setup. We formulate the resulting sub-problems in a fully distributed manner and provide convergence guarantees for the associated computations. The proposed algorithm's convergence is proved for two cases: static MAS graphs, and time-varying graphs under certain restrictions.
comment: 10 pages, 4 figures, 1 algorithm float. Preprint submitted to ACC 2025 for review
A Generalized Metriplectic System via Free Energy and System Identification via Bilevel Convex Optimization
This work generalizes the classical metriplectic formalism to model Hamiltonian systems with nonconservative dissipation. Classical metriplectic representations allow for the description of energy conservation and production of entropy via a suitable selection of an entropy function and a bilinear symmetric metric. By relaxing the Casimir invariance requirement of the entropy function, this paper shows that the generalized formalism induces the free energy analogous to thermodynamics. The monotonic change of free energy can serve as a more precise criterion than mechanical energy or entropy alone. This paper provides examples of the generalized metriplectic system in a 2-dimensional Hamiltonian system and $\mathrm{SO}(3)$. This paper also provides a bilevel convex optimization approach for the identification of the metriplectic system given measurements of the system.
QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers
Queuing network control determines the allocation of scarce resources to manage congestion, a fundamental problem in manufacturing, communications, and healthcare. Compared to standard RL problems, queueing problems are distinguished by unique challenges: i) a system operating in continuous time, ii) high stochasticity, and iii) long horizons over which the system can become unstable (exploding delays). To spur methodological progress tackling these challenges, we present an open-sourced queueing simulation framework, QGym, that benchmarks queueing policies across realistic problem instances. Our modular framework allows researchers to build on our initial instances, which provide a wide range of environments including parallel servers, criss-cross, tandem, and re-entrant networks, as well as a realistically calibrated hospital queuing system. QGym makes it easy to compare multiple policies, including both model-free RL methods and classical queuing policies. Our testbed complements the traditional focus on evaluating algorithms based on mathematical guarantees in idealized settings, and significantly expands the scope of empirical benchmarking in prior work. QGym code is open-sourced at https://github.com/namkoong-lab/QGym.
Classification of simulation relations for symbolic control
Abstraction-based control design is a promising approach for ensuring safety-critical control of complex cyber-physical systems. A key aspect of this methodology is the relation between the original and abstract systems, which ensures that the abstract controller can be transformed into a valid controller for the original system through a concretization procedure. In this paper, we provide a comprehensive and systematic framework that characterizes various simulation relations, through their associated concretization procedures. We introduce the concept of augmented system, which universally enables a feedback refinement relation with the abstract system. This augmented system encapsulates the specific characteristics of each simulation relation within an interface, enabling a plug-and-play control architecture. Our results demonstrate that the existence of a particular simulation relation between the concrete and abstract systems is equivalent to the implementability of a specific control architecture, which depends on the considered simulation relation. This allows us to introduce new types of relations, and to establish the advantages and drawbacks of different relations, which we exhibit through detailed examples.
comment: 14 pages, 6 figures
Nationally Scalable Hydrogen Fueling Infrastructure Deployment: A Megaregion Analysis and Optimization Approach
Decarbonizing regional and long-haul freight faces challenges due to the limitations of battery-electric vehicles and infrastructure. Hydrogen fuel cell medium- and heavy-duty vehicles (MHDVs) present a promising alternative, aligning with the Department of Energy's decarbonization goals. Historically, alternative fuels like compressed natural gas and propane gas have seen slow adoption due to infrastructure barriers. To prevent similar setbacks, planning for zero-emission hydrogen fueling infrastructure is critical. This research develops plans for affordable and accessible hydrogen refueling stations, supporting the decarbonized freight system and benefiting underserved and rural communities by improving air quality, reducing noise pollution, and enhancing energy resilience. It provides a blueprint for replacing diesel in Class 8 trucks with hydrogen fueling solutions, focusing on the Texas Triangle Megaregion (I-45, I-35, I-10), the I-10 corridor between San Antonio, TX, and Los Angeles, CA, and the I-5/CA-99 corridors between Los Angeles and San Francisco. This area accounts for ~8.5% of U.S. heavy-duty freight volume. Using the OR-AGENT (Optimal Regional Architecture Generation for Electrified National Transport) framework, the study analyzes vehicles, freight networks, and energy systems. The framework integrates data on freight mobility, traffic, weather, and energy pathways to deliver optimized powertrain architectures and hydrogen fueling infrastructure deployment. It assesses all vehicle origin-destination pairs and feasible fueling station locations, using a genetic algorithm to identify the minimum number and optimal locations of hydrogen stations. It also determines fuel schedules and quantities, ensuring no vehicle is stranded. A deployment roadmap outlines strategic hydrogen refueling infrastructure rollout across multiple adoption scenarios.
Characterization of input-to-output stability for infinite dimensional systems
We prove a superposition theorem for input-to-output stability (IOS) of a broad class of nonlinear infinite-dimensional systems with outputs including both continuous-time and discrete-time systems. It contains, as a special case, the superposition theorem for input-to-state stability (ISS) of infinite-dimensional systems from [1] and the IOS superposition theorem for systems of ordinary differential equations from [2]. To achieve this result, we introduce and examine several novel stability and attractivity concepts for infinite-dimensional systems with outputs: we prove criteria for the uniform limit property for systems with outputs, several of which are new already for systems with full-state output; we provide superposition theorems for systems which satisfy both the output-Lagrange stability property (OL) and IOS; we give a sufficient condition for OL; and we characterize ISS in terms of IOS and input/output-to-state stability. Finally, by means of counterexamples, we illustrate the challenges that arise in extending the superposition theorems from [1] and [2] to infinite-dimensional systems with outputs.
Spectrally Efficient LDPC Codes For IRIG-106 Waveforms via Random Puncturing
Low-density parity-check (LDPC) codes form part of the IRIG-106 standard and have been successfully deployed for the Telemetry Group version of shaped-offset quadrature phase shift keying (SOQPSK-TG) modulation. Recently, LDPC code solutions have been proposed and optimized for continuous phase modulations (CPMs), including the pulse code modulation/frequency modulation (PCM/FM) and the multi-h CPM developed by the Advanced Range TeleMetry program (ARTM CPM). These codes were shown to perform around one dB from the respective channel capacities of these modulations. In this paper, we consider the effect of random puncturing of these LDPC codes to further improve spectrum efficiency. We present numerical simulation results that affirm the robust decoding performance promised by LDPC codes designed for ARTM CPM.
comment: Accepted for inclusion in the 2024 International Telemetry Conference
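The random puncturing considered in the abstract above can be illustrated with a short sketch. It is not the authors' code: the function names, the seed handling, and the toy sizes are hypothetical, and the treatment of punctured positions as erasures (zero log-likelihood ratios) is the standard decoder-side convention rather than something stated in the abstract. Puncturing `num_punctured` of `n` coded bits raises the code rate from `k / n` to `k / (n - num_punctured)`.

```python
import random

def random_puncture_pattern(n, num_punctured, seed=0):
    """Pick num_punctured distinct bit positions uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n), num_punctured))

def puncture(codeword, pattern):
    """Drop the punctured positions before transmission."""
    drop = set(pattern)
    return [bit for i, bit in enumerate(codeword) if i not in drop]

def depuncture_llrs(received_llrs, pattern, n):
    """Reinsert zero LLRs (erasures) at the punctured positions before decoding."""
    drop = set(pattern)
    remaining = iter(received_llrs)
    return [0.0 if i in drop else next(remaining) for i in range(n)]

def punctured_rate(k, n, num_punctured):
    """Code rate after puncturing: k information bits over n - num_punctured sent bits."""
    return k / (n - num_punctured)
```

The transmitter and receiver must share the same pattern, which is why the sketch derives it from a seed.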
Privacy-aware Fully Model-Free Event-triggered Cloud-based HVAC Control
Privacy is a major concern when computing-as-a-service (CaaS) platforms, e.g., cloud-computing platforms, are utilized for building automation, as CaaS platforms can infer sensitive information, such as occupancy, using the sensor measurements of a building. Although the existing encrypted model-based control algorithms can ensure the security and privacy of sensor measurements, they are highly complex to implement and require high computational resources, which result in a high cost of using CaaS platforms. To address these issues, in this paper, we propose an encrypted fully model-free event-triggered cloud-based HVAC control framework that ensures the privacy of occupancy information and minimizes the communication and computation overhead associated with encrypted HVAC control. To this end, we first develop a model-free controller for regulating indoor temperature and CO2 levels. We then design a model-free event-triggering unit which reduces the communication and computation costs of encrypted HVAC control using an optimal triggering policy. Finally, we evaluate the performance of the proposed encrypted fully model-free event-triggered cloud-based HVAC control framework using the TRNSYS simulator, comparing it to an encrypted model-based event-triggered control framework, which uses model predictive control to regulate the indoor climate. Our numerical results demonstrate that, compared to the encrypted model-based method, the proposed fully model-free framework improves the control performance while reducing the communication and computation costs. More specifically, it reduces the communication between the system and the CaaS platform by 64%, and its computation time is 75% less than that of the model-based control.
Distributed Coordination for Multi-Vehicle Systems in the Presence of Misbehaving Vehicles
The coordination problem of multi-vehicle systems is of great interest in the area of autonomous driving and multi-vehicle control. This work mainly focuses on the multi-task coordination problem of a group of vehicles with a bicycle model and some specific control objectives, including collision avoidance, connectivity maintenance and convergence to desired destinations. The basic idea is to develop a proper Lyapunov-like barrier function for all tasks, so that a distributed controller can be built in the presence of misbehaving vehicles. Control protocols are provided for both the leader vehicle and the follower vehicles. The simulation results demonstrate the effectiveness of the proposed method.
comment: 13 pages, 5 figures, accepted by The 15th Asia Conference on Mechanical and Aerospace Engineering (ACMAE 2024)
Learning to Race in Extreme Turning Scene with Active Exploration and Gaussian Process Regression-based MPC
Extreme cornering in racing often induces large side-slip angles, presenting a formidable challenge in vehicle control. To tackle this issue, this paper introduces an Active Exploration with Double GPR (AEDGPR) system. The system begins by planning a minimum-time trajectory with a Gaussian Process Regression (GPR) compensated model. The planning results show that in the cornering section, the yaw angular velocity and side-slip angle are in opposite directions, indicating that the vehicle is drifting. In response, we develop a drift controller based on Model Predictive Control (MPC) and incorporate Gaussian Process Regression to correct discrepancies in the vehicle dynamics model. Moreover, the covariance from the GPR is employed to actively explore various cornering states, aiming to minimize trajectory tracking errors. The proposed algorithm is validated through simulations on the Simulink-Carsim platform and experiments using a 1/10 scale RC vehicle.
Data Informativity for Quadratic Stabilization under Data Perturbation
Assessing data informativity, determining whether the measured data contains sufficient information for a specific control objective, is a fundamental challenge in data-driven control. In noisy scenarios, existing studies deal with system noise and measurement noise separately, using quadratic matrix inequalities. Moreover, the analysis of measurement noise requires restrictive assumptions on noise properties. To provide a unified framework without any restrictions, this study introduces data perturbation, a novel notion that encompasses both existing noise models. It is observed that the admissible system set with data perturbation does not meet preconditions necessary for applying the key lemma in the matrix S-procedure. Our analysis overcomes this limitation by developing an extended version of this lemma, making it applicable to data perturbation. Our results unify the existing analyses while eliminating the need for restrictive assumptions made in the measurement noise scenario.
comment: 8 pages
Long-Context Linear System Identification
This paper addresses the problem of long-context linear system identification, where the state $x_t$ of a dynamical system at time $t$ depends linearly on previous states $x_s$ over a fixed context window of length $p$. We establish a sample complexity bound that matches the i.i.d. parametric rate up to logarithmic factors for a broad class of systems, extending previous works that considered only first-order dependencies. Our findings reveal a learning-without-mixing phenomenon, indicating that learning long-context linear autoregressive models is not hindered by slow mixing properties potentially associated with extended context windows. Additionally, we extend these results to (i) shared low-rank representations, where rank-regularized estimators improve rates with respect to dimensionality, and (ii) misspecified context lengths in strictly stable systems, where shorter contexts offer statistical advantages.
comment: 30 pages, 4 figures
Mobile IoT device for BPM monitoring people with heart problems
The developed system uses a mobile electronic device to monitor the heart and issue warnings when the heart rate falls outside the nominal range of 60 to 100 beats per minute. The system also saves and monitors changes in cardiac pulsations in real time, through a sensor connected to a control system. A GSM/GPRS/GPS communication module for Arduino is connected, using GPS to locate the user. In addition, the device uses GSM/GPRS technology to send text messages to the contact number configured in the device when heart-problem warnings are issued, and it connects to the internet to store data in the cloud.
comment: 5 pages, 13 figures
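The warning rule described in the abstract above (an alert when the heart rate leaves the nominal 60 to 100 beats-per-minute range) can be sketched as follows. This is a minimal illustration, not the authors' firmware: the function names and the message format are hypothetical, and the sketch neither reads a sensor nor sends a text message.

```python
NOMINAL_BPM_RANGE = (60, 100)  # nominal range stated in the abstract

def is_abnormal(bpm, nominal=NOMINAL_BPM_RANGE):
    """Return True when the heart rate lies outside the nominal range (bounds inclusive)."""
    low, high = nominal
    return bpm < low or bpm > high

def build_alert(bpm, latitude, longitude):
    """Build a hypothetical text-message body, or None when no warning is needed."""
    if not is_abnormal(bpm):
        return None
    return f"Heart-rate warning: {bpm} BPM at ({latitude:.5f}, {longitude:.5f})"
```

Keeping the threshold check separate from message construction makes the range easy to adjust per user without touching the messaging path.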
Linear Convergence of Data-Enabled Policy Optimization for Linear Quadratic Tracking
Data-enabled policy optimization (DeePO) is a newly proposed method to attack the open problem of direct adaptive LQR. In this work, we extend the DeePO framework to linear quadratic tracking (LQT) with offline data. By introducing a covariance parameterization of the LQT policy, we derive a direct data-driven formulation of the LQT problem. Then, we use the gradient descent method to iteratively update the parameterized policy to find an optimal LQT policy. Moreover, by revealing the connection between DeePO and model-based policy optimization, we prove the linear convergence of the DeePO iteration. Finally, a numerical experiment is given to validate the convergence results. We hope our work paves the way to direct adaptive LQT with online closed-loop data.
comment: 6 pages, 1 figure, submitted to ACC 2025
Goal-Conditioned Terminal Value Estimation for Real-time and Multi-task Model Predictive Control
While MPC enables nonlinear feedback control by solving an optimal control problem at each timestep, the computational burden tends to be significantly large, making it difficult to optimize a policy within the control period. To address this issue, one possible approach is to utilize terminal value learning to reduce computational costs. However, the learned value cannot be used for other tasks in situations where the task dynamically changes in the original MPC setup. In this study, we develop an MPC framework with goal-conditioned terminal value learning to achieve multitask policy optimization while reducing computational time. Furthermore, by using a hierarchical control structure that allows the upper-level trajectory planner to output appropriate goal-conditioned trajectories, we demonstrate that a robot model is able to generate diverse motions. We evaluate the proposed method on a bipedal inverted pendulum robot model and confirm that combining goal-conditioned terminal value learning with an upper-level trajectory planner enables real-time control; thus, the robot successfully tracks a target trajectory on sloped terrain.
comment: 16 pages, 9 figures
Towards a Deeper Understanding of Transformer for Residential Non-intrusive Load Monitoring
Transformer models have demonstrated impressive performance in Non-Intrusive Load Monitoring (NILM) applications in recent years. Despite their success, existing studies have not thoroughly examined the impact of various hyper-parameters on model performance, which is crucial for advancing high-performing transformer models. In this work, a comprehensive series of experiments has been conducted to analyze the influence of these hyper-parameters in the context of residential NILM. This study delves into the effects of the number of hidden dimensions in the attention layer, the number of attention layers, the number of attention heads, and the dropout ratio on transformer performance. Furthermore, the role of the masking ratio has been explored in BERT-style transformer training, providing a detailed investigation into its impact on NILM tasks. Based on these experiments, the optimal hyper-parameters have been selected and used to train a transformer model, which surpasses the performance of existing models. The experimental findings offer valuable insights and guidelines for optimizing transformer architectures, aiming to enhance their effectiveness and efficiency in NILM applications. It is expected that this work will serve as a foundation for future research and development of more robust and capable transformer models for NILM.
comment: Accepted to 2024 IEEE International Conference on Innovation in Science, Engineering and Technology (ICISET)
Direct Data-Driven Discrete-time Bilinear Biquadratic Regulator
We present a novel direct data-driven algorithm that learns an optimal control policy for the Bilinear Biquadratic Regulator (BBR) for an unknown bilinear system. The BBR is difficult to solve owing to the presence of the nonlinear biquadratic performance index and the bilinear cross-term in the dynamics. To address these difficulties, we apply several transformations on the state decision variables to obtain a nonlinear optimization problem with a linear performance index and affine (in the parameterized control) state-dependent equality. The adroit use of the Hamiltonian and Pontryagin's Minimum Principle allows us to derive a pair of first-order necessary conditions that, at each point in time, are easily solvable linear matrix equalities (LMEs) which give the optimal state-dependent control law. We then use the marginal sample autocorrelation of the collected data to obtain a direct data-driven equivalent of these LMEs. We demonstrate the performance of the proposed algorithm via illustrative numerical examples.
comment: 12 pages, 3 figures, Submitted to IEEE Control Systems Letters (L-CSS)
Nonlinear Model Predictive Control for Enhanced Path Tracking and Autonomous Drifting through Direct Yaw Moment Control and Rear-Wheel-Steering
Path tracking (PT) controllers capable of replicating race driving techniques, such as drifting beyond the limits of handling, have the potential of enhancing active safety in critical conditions. This paper presents a nonlinear model predictive control (NMPC) approach that integrates multiple actuation methods, namely four-wheel-steering, longitudinal tyre force distribution, and direct yaw moment control, to execute drifting when this is beneficial for PT in emergency scenarios. Simulation results of challenging manoeuvres, based on an experimentally validated vehicle model, highlight the substantial PT performance improvements brought by: i) vehicle operation outside the envelope enforced by the current generation of stability controllers; and ii) the integrated control of multiple actuators.
comment: 7 pages, 2 figures, published in the 16th International Symposium on Advanced Vehicle Control. AVEC 2024. Lecture Notes in Mechanical Engineering. Springer, Cham, pp. 854-861, 2024
Approximate non-linear model predictive control with safety-augmented neural networks
Model predictive control (MPC) achieves stability and constraint satisfaction for general nonlinear systems, but requires computationally expensive online optimization. This paper studies approximations of such MPC controllers via neural networks (NNs) to achieve fast online evaluation. We propose safety augmentation that yields deterministic guarantees for convergence and constraint satisfaction despite approximation inaccuracies. We approximate the entire input sequence of the MPC with NNs, which allows us to verify online if it is a feasible solution to the MPC problem. We replace the NN solution by a safe candidate based on standard MPC techniques whenever it is infeasible or has worse cost. Our method requires a single evaluation of the NN and forward integration of the input sequence online, which is fast to compute on resource-constrained systems. The proposed control framework is illustrated using two numerical non-linear MPC benchmarks of different complexity, demonstrating computational speedups that are orders of magnitude higher than online optimization. In the examples, we achieve deterministic safety through the safety-augmented NNs, where a naive NN implementation fails.
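The online check described in the abstract above (roll the NN-proposed input sequence forward, then fall back to a safe candidate whenever the proposal is infeasible or has worse cost) can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the scalar model `x_next = a*x + b*u`, the box constraints, the quadratic cost, and all function names are hypothetical, the safe candidate is simply supplied by the caller, and terminal-set conditions are omitted.

```python
def rollout(x0, inputs, a=1.0, b=1.0):
    """Forward-simulate the toy scalar model x_next = a*x + b*u."""
    states = [x0]
    for u in inputs:
        states.append(a * states[-1] + b * u)
    return states

def is_feasible(states, inputs, x_max=1.0, u_max=0.5):
    """Check box constraints on every state and input of the sequence."""
    return (all(abs(x) <= x_max for x in states)
            and all(abs(u) <= u_max for u in inputs))

def cost(states, inputs):
    """Quadratic cost over the whole horizon (assumed for this sketch)."""
    return sum(x * x for x in states) + sum(u * u for u in inputs)

def select_sequence(x0, nn_inputs, safe_inputs):
    """Keep the NN proposal only if it is feasible and no worse than the safe candidate."""
    nn_states = rollout(x0, nn_inputs)
    safe_states = rollout(x0, safe_inputs)
    nn_ok = is_feasible(nn_states, nn_inputs)
    if nn_ok and cost(nn_states, nn_inputs) <= cost(safe_states, safe_inputs):
        return nn_inputs
    return safe_inputs
```

The only online work is one forward simulation per candidate plus a comparison, which is consistent with the abstract's point that the check is cheap on resource-constrained hardware.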
Deterministic Trajectory Optimization through Probabilistic Optimal Control
This article proposes two new algorithms tailored to discrete-time deterministic finite-horizon nonlinear optimal control problems or so-called trajectory optimization problems. Both algorithms are inspired by a novel theoretical paradigm known as probabilistic optimal control, which reformulates optimal control as an equivalent probabilistic inference problem. This perspective allows us to address the problem using the Expectation-Maximization algorithm. We show that the application of this algorithm results in a fixed point iteration of probabilistic policies that converge to the deterministic optimal policy. Two strategies for policy evaluation are discussed, using state-of-the-art uncertainty quantification methods, resulting in two distinct algorithms. The algorithms are structurally most closely related to the differential dynamic programming algorithm and related methods that use sigma-point methods to avoid direct gradient evaluations. The main advantage of our work is an improved balance between exploration and exploitation over the iterations, leading to improved numerical stability and accelerated convergence. These properties are demonstrated on different nonlinear systems.
Differentially Private Distributed Nonconvex Stochastic Optimization with Quantized Communication
This paper proposes a new distributed nonconvex stochastic optimization algorithm that can achieve privacy protection, communication efficiency and convergence simultaneously. Specifically, each node adds general privacy noises to its local state to avoid information leakage, and then quantizes its noise-perturbed state before transmitting to improve communication efficiency. By using a subsampling method controlled through the sample-size parameter, the proposed algorithm reduces cumulative differential privacy parameters $\epsilon$, $\delta$, and thus enhances the differential privacy level, which is significantly different from the existing works. By using a two-time-scale step-sizes method, the mean square convergence for nonconvex cost functions is given. Furthermore, when the global cost function satisfies the Polyak-Łojasiewicz condition, the convergence rate and the oracle complexity of the proposed algorithm are given. In addition, the proposed algorithm achieves both the mean square convergence and finite cumulative differential privacy parameters $\epsilon$, $\delta$ over infinite iterations as the sample-size goes to infinity. A numerical example of the distributed training on the "MNIST" dataset is given to show the effectiveness of the algorithm.
Shock waves in nonlinear transmission lines
In the first half of the paper we consider the interaction between small-amplitude travelling waves ("sound") and shock waves in a transmission line containing both nonlinear capacitors and nonlinear inductors. We calculate the "sound" wave coefficient of reflection from (coefficient of transmission through) the shock wave. These coefficients are expressed in terms of the speeds of the "sound" waves relative to the shock and the wave impedances. In the second half of the paper we explicitly take into account the dissipation in the system, introducing ohmic resistors shunting the inductors and also in series with the capacitors. This allows us to justify the conditions on the shocks postulated in the first half of the paper. This also allows us to describe the shocks as physical objects of finite width and study their profiles, as well as the profiles of the waves closely connected with the shocks, the kinks. The profiles of the latter, and in some particular cases the profiles of the former, were obtained in terms of elementary functions.
comment: pdfLaTeX, 8 pages, 4 figures. As published in Physica Status Solidi B DOI: 10.1002/pssb.202400335
Trustworthy V2G scheduling and energy trading: A blockchain-based framework
The rapid growth of electric vehicles (EVs) and the deployment of vehicle-to-grid (V2G) technology pose significant challenges for distributed power grids, particularly in fostering trust and ensuring effective coordination among stakeholders. Establishing a trustworthy V2G operation environment is crucial for enabling large-scale EV user participation and realizing V2G potential in real-world applications. In this paper, an integrated scheduling and trading framework is developed to conduct transparent and efficacious coordination in V2G operations. In blockchain implementation, a cyber-physical blockchain architecture is proposed to enhance transaction efficiency and scalability by leveraging smart charging points (SCPs) for rapid transaction validation through a fast-path practical byzantine fault tolerance (fast-path PBFT) consensus mechanism. From the energy dispatching perspective, a game-theoretical pricing strategy is employed and smart contracts are utilized for autonomous decision-making between EVs and operators, aiming to optimize the trading process and maximize economic benefits. Numerical evaluation of blockchain consensus shows the effect of the fast-path PBFT consensus in improving system scalability with a balanced trade-off in robustness. A case study, utilizing real-world data from the Southern University of Science and Technology (SUSTech), demonstrates significant reductions in EV charging costs and the framework's potential to support auxiliary grid services.
Probabilistic Load Forecasting of Distribution Power Systems based on Empirical Copulas
Accurate and reliable electricity load forecasts are becoming increasingly important as the share of intermittent resources in the system increases. Distribution System Operators (DSOs) are called to accurately forecast their production and consumption to place optimal bids in the day-ahead market. Forecasts must account for the volatility of weather parameters, which impacts both the production and consumption of electricity. If DSO loads are small or lower-granularity forecasts are needed, parametric statistical methods may fail to provide reliable performance since they rely on a priori statistical distributions of the variables to forecast. In this paper, we introduce a Probabilistic Load Forecast (PLF) method based on Empirical Copulas (ECs). The model is data-driven and requires no a priori assumptions about the parametric distributions of the variables or about their dependence structure (copula). It employs a kernel density estimate of the underlying distribution using beta kernels that have bounded support on the unit hypercube. The method naturally supports variables with widely different distributions, such as weather data (including forecasted ones) and historic electricity consumption, and produces a conditional probability distribution for every time step in the forecast, which allows inferring the quantiles of interest. The proposed non-parametric approach differs significantly from previous forecasting methods based on copulas, which typically use copulas to model hierarchical dependence. The bandwidth of the beta kernel density estimators is optimized using Integrated Square Error (ISE). We present results from an open dataset and showcase the strength of the model with respect to Quantile Regression (QR) using standard probabilistic evaluation metrics.
comment: Submitted to Sustainable Energy, Grids and Networks (SEGAN), October 8, 2024
Systems and Control (EESS)
Embedded State Estimation for Optimization of Cislunar Space Domain Awareness Constellation Design
The traffic in cislunar space is expected to increase over the coming years, leading to a higher likelihood of conjunction events among active satellites, orbital debris, and non-cooperative satellites. This increase necessitates enhanced space domain awareness (SDA) capabilities that include state estimation for targets of interest. Both Earth surface-based and space-based observation platforms in geosynchronous orbit or below face challenges such as range, exclusion, and occlusion that hinder observation. Motivated by the need to place space-based observers in the cislunar space regime to overcome these challenges, this paper proposes a cislunar SDA constellation design and analysis framework that integrates state estimation into an optimization problem for determining the placement of observers for optimal state estimation performance on a set of targets. The proposed multi-observer placement optimization problem samples from a range of possible target orbits. Upon convergence, the optimized constellation is validated against a broader set of targets to assess its effectiveness. Two comparative analyses are presented to evaluate the effects of changes in the sensor tasking procedure and sensor fidelity on the optimized constellation, comparing these to a single observer baseline case. The results demonstrate that the optimized constellations can provide accurate state estimation for various orbit families.
comment: 36 pages, 14 figures, Journal of Spacecraft and Rockets (accepted)
Work-in-Progress: Traded Control Transfer for Managing Real-Time Sensor Uncertainties in Autonomous Vehicle
At Levels 2 and 3 of autonomous driving defined by the Society of Automotive Engineers, drivers must take on certain driving responsibilities, and automated driving must sometimes yield to human control. This situation can occur in real time due to uncertainties in sensor measurements caused by environmental factors like fog or smoke. To address this challenge, we propose a method to manage real-time sensor uncertainties in autonomous vehicles by monitoring sensor conflicts and dynamically adjusting control authority to maintain safe operation. To achieve this, we introduce a novel metric called the Degree of Conflicts (DoC), which quantifies the conflict between real-time sensor data by measuring the differences between data from multiple sensors. Our approach aims to demonstrate the importance of selecting an appropriate DoC threshold for transferring control between the automation agent and the human driver. The results show that choosing the correct DoC threshold can enhance safety by promptly handing over the driving control from the automation system to the human driver in challenging conditions.
comment: Peer-reviewed and accepted by the 2024 IEEE Real-Time Systems Symposium (RTSS)
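The threshold-based handover described in the abstract above can be illustrated with a small sketch. The abstract does not give the formula for the Degree of Conflicts, so the definition used here (mean absolute pairwise difference between readings of the same quantity) is an assumption made purely for illustration, as are the function names and the threshold value in the tests.

```python
from itertools import combinations

def degree_of_conflict(readings):
    """Mean absolute pairwise difference between sensor readings (assumed definition, not the paper's)."""
    pairs = list(combinations(readings, 2))
    if not pairs:
        return 0.0  # a single sensor cannot conflict with itself
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def control_authority(readings, threshold):
    """Hand control to the human driver when the conflict exceeds the threshold."""
    if degree_of_conflict(readings) > threshold:
        return "human"
    return "automation"
```

With this structure, the threshold is the single tuning knob: a lower value hands over earlier at the cost of more frequent transfers.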
Meta-Learning Augmented MPC for Disturbance-Aware Motion Planning and Control of Quadrotors
A major challenge in autonomous flights is unknown disturbances, which can jeopardize safety and lead to collisions, especially in obstacle-rich environments. This paper presents a disturbance-aware motion planning and control framework designed for autonomous aerial flights. The framework is composed of two key components: a disturbance-aware motion planner and a tracking controller. The disturbance-aware motion planner consists of a predictive control scheme and a learned model of disturbances that is adapted online. The tracking controller is designed using contraction control methods to provide safety bounds on the quadrotor behaviour in the vicinity of the obstacles with respect to the disturbance-aware motion plan. Finally, the algorithm is tested in simulation scenarios with a quadrotor facing strong crosswind and ground-induced disturbances.
An Algorithm for Distributed Computation of Reachable Sets for Multi-Agent Systems
In this paper, we consider the problem of distributed reachable set computation for multi-agent systems (MASs) interacting over an undirected, stationary graph. A full state-feedback control input for such MASs depends not only on the current agent's state, but also on the states of its neighbors. However, in most MAS applications, the dynamics are obscured by individual agents. This makes reachable set computation, in a fully distributed manner, a challenging problem. We utilize the ideas of polytopic reachable set approximation and generalize it to a MAS setup. We formulate the resulting sub-problems in a fully distributed manner and provide convergence guarantees for the associated computations. The proposed algorithm's convergence is proved for two cases: static MAS graphs, and time-varying graphs under certain restrictions.
comment: 10 pages, 4 figures, 1 algorithm float. Preprint submitted to ACC 2025 for review
A Generalized Metriplectic System via Free Energy and System Identification via Bilevel Convex Optimization
This work generalizes the classical metriplectic formalism to model Hamiltonian systems with nonconservative dissipation. Classical metriplectic representations allow for the description of energy conservation and production of entropy via a suitable selection of an entropy function and a bilinear symmetric metric. By relaxing the Casimir invariance requirement of the entropy function, this paper shows that the generalized formalism induces the free energy analogous to thermodynamics. The monotonic change of free energy can serve as a more precise criterion than mechanical energy or entropy alone. This paper provides examples of the generalized metriplectic system in a 2-dimensional Hamiltonian system and $\mathrm{SO}(3)$. This paper also provides a bilevel convex optimization approach for the identification of the metriplectic system given measurements of the system.
QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers
Queuing network control determines the allocation of scarce resources to manage congestion, a fundamental problem in manufacturing, communications, and healthcare. Compared to standard RL problems, queueing problems are distinguished by unique challenges: i) a system operating in continuous time, ii) high stochasticity, and iii) long horizons over which the system can become unstable (exploding delays). To spur methodological progress tackling these challenges, we present an open-sourced queueing simulation framework, QGym, that benchmarks queueing policies across realistic problem instances. Our modular framework allows researchers to build on our initial instances, which provide a wide range of environments including parallel servers, criss-cross, tandem, and re-entrant networks, as well as a realistically calibrated hospital queuing system. QGym makes it easy to compare multiple policies, including both model-free RL methods and classical queuing policies. Our testbed complements the traditional focus on evaluating algorithms based on mathematical guarantees in idealized settings, and significantly expands the scope of empirical benchmarking in prior work. QGym code is open-sourced at https://github.com/namkoong-lab/QGym.
Classification of simulation relations for symbolic control
Abstraction-based control design is a promising approach for ensuring safety-critical control of complex cyber-physical systems. A key aspect of this methodology is the relation between the original and abstract systems, which ensures that the abstract controller can be transformed into a valid controller for the original system through a concretization procedure. In this paper, we provide a comprehensive and systematic framework that characterizes various simulation relations, through their associated concretization procedures. We introduce the concept of augmented system, which universally enables a feedback refinement relation with the abstract system. This augmented system encapsulates the specific characteristics of each simulation relation within an interface, enabling a plug-and-play control architecture. Our results demonstrate that the existence of a particular simulation relation between the concrete and abstract systems is equivalent to the implementability of a specific control architecture, which depends on the considered simulation relation. This allows us to introduce new types of relations, and to establish the advantages and drawbacks of different relations, which we exhibit through detailed examples.
comment: 14 pages, 6 figures
Nationally Scalable Hydrogen Fueling Infrastructure Deployment: A Megaregion Analysis and Optimization Approach
Decarbonizing regional and long-haul freight faces challenges due to the limitations of battery-electric vehicles and infrastructure. Hydrogen fuel cell medium- and heavy-duty vehicles (MHDVs) present a promising alternative, aligning with the Department of Energy's decarbonization goals. Historically, alternative fuels like compressed natural gas and propane gas have seen slow adoption due to infrastructure barriers. To prevent similar setbacks, planning for zero-emission hydrogen fueling infrastructure is critical. This research develops plans for affordable and accessible hydrogen refueling stations, supporting the decarbonized freight system and benefiting underserved and rural communities by improving air quality, reducing noise pollution, and enhancing energy resilience. It provides a blueprint for replacing diesel in Class 8 trucks with hydrogen fueling solutions, focusing on the Texas Triangle Megaregion (I-45, I-35, I-10), the I-10 corridor between San Antonio, TX, and Los Angeles, CA, and the I-5/CA-99 corridors between Los Angeles and San Francisco. This area accounts for ~8.5% of U.S. heavy-duty freight volume. Using the OR-AGENT (Optimal Regional Architecture Generation for Electrified National Transport) framework, the study analyzes vehicles, freight networks, and energy systems. The framework integrates data on freight mobility, traffic, weather, and energy pathways to deliver optimized powertrain architectures and hydrogen fueling infrastructure deployment. It assesses all vehicle origin-destination pairs and feasible fueling station locations, using a genetic algorithm to identify the minimum number and optimal locations of hydrogen stations. It also determines fuel schedules and quantities, ensuring no vehicle is stranded. A deployment roadmap outlines strategic hydrogen refueling infrastructure rollout across multiple adoption scenarios.
Characterization of input-to-output stability for infinite dimensional systems
We prove a superposition theorem for input-to-output stability (IOS) of a broad class of nonlinear infinite-dimensional systems with outputs including both continuous-time and discrete-time systems. It contains, as a special case, the superposition theorem for input-to-state stability (ISS) of infinite-dimensional systems from [1] and the IOS superposition theorem for systems of ordinary differential equations from [2]. To achieve this result, we introduce and examine several novel stability and attractivity concepts for infinite dimensional systems with outputs: We prove criteria for the uniform limit property for systems with outputs, several of which are new already for systems with full-state output, we provide superposition theorems for systems which satisfy both the output-Lagrange stability property (OL) and IOS, give a sufficient condition for OL and characterize ISS in terms of IOS and input/output-to-state stability. Finally, by means of counterexamples, we illustrate the challenges appearing on the way of extension of the superposition theorems from [1] and [2] to infinite-dimensional systems with outputs.
Spectrally Efficient LDPC Codes For IRIG-106 Waveforms via Random Puncturing
Low-density parity-check (LDPC) codes form part of the IRIG-106 standard and have been successfully deployed for the Telemetry Group version of shaped-offset quadrature phase shift keying (SOQPSK-TG) modulation. Recently, LDPC code solutions have been proposed and optimized for continuous phase modulations (CPMs), including the pulse code modulation/frequency modulation (PCM/FM) and the multi-h CPM developed by the Advanced Range TeleMetry program (ARTM CPM). These codes were shown to perform around one dB from the respective channel capacities of these modulations. In this paper, we consider the effect of random puncturing of these LDPC codes to further improve spectrum efficiency. We present numerical simulation results that affirm the robust decoding performance promised by LDPC codes designed for ARTM CPM.
comment: Accepted for inclusion in the 2024 International Telemetry Conference
Privacy-aware Fully Model-Free Event-triggered Cloud-based HVAC Control
Privacy is a major concern when computing-as-a-service (CaaS) platforms, e.g., cloud-computing platforms, are utilized for building automation, as CaaS platforms can infer sensitive information, such as occupancy, using the sensor measurements of a building. Although the existing encrypted model-based control algorithms can ensure the security and privacy of sensor measurements, they are highly complex to implement and require high computational resources, which result in a high cost of using CaaS platforms. To address these issues, in this paper, we propose an encrypted fully model-free event-triggered cloud-based HVAC control framework that ensures the privacy of occupancy information and minimizes the communication and computation overhead associated with encrypted HVAC control. To this end, we first develop a model-free controller for regulating indoor temperature and CO2 levels. We then design a model-free event-triggering unit which reduces the communication and computation costs of encrypted HVAC control using an optimal triggering policy. Finally, we evaluate the performance of the proposed encrypted fully model-free event-triggered cloud-based HVAC control framework using the TRNSYS simulator, comparing it to an encrypted model-based event-triggered control framework, which uses model predictive control to regulate the indoor climate. Our numerical results demonstrate that, compared to the encrypted model-based method, the proposed fully model-free framework improves the control performance while reducing the communication and computation costs. More specifically, it reduces the communication between the system and the CaaS platform by 64%, and its computation time is 75% less than that of the model-based control.
Distributed Coordination for Multi-Vehicle Systems in the Presence of Misbehaving Vehicles
The coordination problem of multi-vehicle systems is of great interest in the area of autonomous driving and multi-vehicle control. This work mainly focuses on the multi-task coordination problem of a group of vehicles with a bicycle model and some specific control objectives, including collision avoidance, connectivity maintenance and convergence to desired destinations. The basic idea is to develop a proper Lyapunov-like barrier function for all tasks so that a distributed controller can be built in the presence of misbehaving vehicles. Control protocols are provided for both the leader vehicle and follower vehicles. The simulation results demonstrate the effectiveness of the proposed method.
comment: 13 pages, 5 figures, accepted by The 15th Asia Conference on Mechanical and Aerospace Engineering (ACMAE 2024)
Learning to Race in Extreme Turning Scene with Active Exploration and Gaussian Process Regression-based MPC
Extreme cornering in racing often induces large side-slip angles, presenting a formidable challenge in vehicle control. To tackle this issue, this paper introduces an Active Exploration with Double GPR (AEDGPR) system. The system begins by planning a minimum-time trajectory with a Gaussian Process Regression (GPR) compensated model. The planning results show that in the cornering section, the yaw angular velocity and side-slip angle are in opposite directions, indicating that the vehicle is drifting. In response, we develop a drift controller based on Model Predictive Control (MPC) and incorporate Gaussian Process Regression to correct discrepancies in the vehicle dynamics model. Moreover, the covariance from the GPR is employed to actively explore various cornering states, aiming to minimize trajectory tracking errors. The proposed algorithm is validated through simulations on the Simulink-Carsim platform and experiments using a 1/10 scale RC vehicle.
Data Informativity for Quadratic Stabilization under Data Perturbation
Assessing data informativity, determining whether the measured data contains sufficient information for a specific control objective, is a fundamental challenge in data-driven control. In noisy scenarios, existing studies deal with system noise and measurement noise separately, using quadratic matrix inequalities. Moreover, the analysis of measurement noise requires restrictive assumptions on noise properties. To provide a unified framework without any restrictions, this study introduces data perturbation, a novel notion that encompasses both existing noise models. It is observed that the admissible system set with data perturbation does not meet preconditions necessary for applying the key lemma in the matrix S-procedure. Our analysis overcomes this limitation by developing an extended version of this lemma, making it applicable to data perturbation. Our results unify the existing analyses while eliminating the need for restrictive assumptions made in the measurement noise scenario.
comment: 8 pages
Long-Context Linear System Identification
This paper addresses the problem of long-context linear system identification, where the state $x_t$ of a dynamical system at time $t$ depends linearly on previous states $x_s$ over a fixed context window of length $p$. We establish a sample complexity bound that matches the i.i.d. parametric rate up to logarithmic factors for a broad class of systems, extending previous works that considered only first-order dependencies. Our findings reveal a learning-without-mixing phenomenon, indicating that learning long-context linear autoregressive models is not hindered by slow mixing properties potentially associated with extended context windows. Additionally, we extend these results to (i) shared low-rank representations, where rank-regularized estimators improve rates with respect to dimensionality, and (ii) misspecified context lengths in strictly stable systems, where shorter contexts offer statistical advantages.
comment: 30 pages, 4 figures
Mobile IoT device for BPM monitoring people with heart problems
The developed system uses a mobile electronic device to monitor and issue warnings of heart problems when the heart rate is outside the nominal range of 60 to 100 beats per minute. The system also saves and monitors changes in cardiac pulsations in real time through a sensor connected to a control system. An Arduino GSM/GPRS/GPS communication module is connected, and GPS is used to locate the user. In addition, the device uses GSM/GPRS technology to send text messages to the contact number configured in the device when warnings of heart problems are issued, and it connects to the internet to store data in the cloud.
comment: 5 pages, 13 figures
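The warning rule described in this abstract can be illustrated with a short sketch. Only the 60 to 100 beats-per-minute nominal range comes from the abstract; the function name `check_heart_rate`, the message wording, and the use of Python rather than Arduino firmware are assumptions made for illustration.

```python
NOMINAL_BPM_RANGE = (60, 100)  # nominal range stated in the abstract


def check_heart_rate(bpm, nominal=NOMINAL_BPM_RANGE):
    """Return a warning string if bpm is outside the nominal range, else None."""
    low, high = nominal
    if bpm < low:
        return f"WARNING: heart rate {bpm} bpm is below {low} bpm"
    if bpm > high:
        return f"WARNING: heart rate {bpm} bpm is above {high} bpm"
    return None  # within the nominal range, nothing to report
```

In a device like the one described, a non-`None` result would be the trigger for sending the text message to the configured contact number.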
Linear Convergence of Data-Enabled Policy Optimization for Linear Quadratic Tracking
Data-enabled policy optimization (DeePO) is a newly proposed method to attack the open problem of direct adaptive LQR. In this work, we extend the DeePO framework to linear quadratic tracking (LQT) with offline data. By introducing a covariance parameterization of the LQT policy, we derive a direct data-driven formulation of the LQT problem. Then, we use the gradient descent method to iteratively update the parameterized policy to find an optimal LQT policy. Moreover, by revealing the connection between DeePO and model-based policy optimization, we prove the linear convergence of the DeePO iteration. Finally, a numerical experiment is given to validate the convergence results. We hope our work paves the way to direct adaptive LQT with online closed-loop data.
comment: 6 pages, 1 figure, submitted to ACC 2025
Goal-Conditioned Terminal Value Estimation for Real-time and Multi-task Model Predictive Control
While MPC enables nonlinear feedback control by solving an optimal control problem at each timestep, the computational burden tends to be significantly large, making it difficult to optimize a policy within the control period. To address this issue, one possible approach is to utilize terminal value learning to reduce computational costs. However, the learned value cannot be used for other tasks in situations where the task dynamically changes in the original MPC setup. In this study, we develop an MPC framework with goal-conditioned terminal value learning to achieve multitask policy optimization while reducing computational time. Furthermore, by using a hierarchical control structure that allows the upper-level trajectory planner to output appropriate goal-conditioned trajectories, we demonstrate that a robot model is able to generate diverse motions. We evaluate the proposed method on a bipedal inverted pendulum robot model and confirm that combining goal-conditioned terminal value learning with an upper-level trajectory planner enables real-time control; thus, the robot successfully tracks a target trajectory on sloped terrain.
comment: 16 pages, 9 figures
Towards a Deeper Understanding of Transformer for Residential Non-intrusive Load Monitoring
Transformer models have demonstrated impressive performance in Non-Intrusive Load Monitoring (NILM) applications in recent years. Despite their success, existing studies have not thoroughly examined the impact of various hyper-parameters on model performance, which is crucial for advancing high-performing transformer models. In this work, a comprehensive series of experiments has been conducted to analyze the influence of these hyper-parameters in the context of residential NILM. This study delves into the effects of the number of hidden dimensions in the attention layer, the number of attention layers, the number of attention heads, and the dropout ratio on transformer performance. Furthermore, the role of the masking ratio has been explored in BERT-style transformer training, providing a detailed investigation into its impact on NILM tasks. Based on these experiments, the optimal hyper-parameters have been selected and used to train a transformer model, which surpasses the performance of existing models. The experimental findings offer valuable insights and guidelines for optimizing transformer architectures, aiming to enhance their effectiveness and efficiency in NILM applications. It is expected that this work will serve as a foundation for future research and development of more robust and capable transformer models for NILM.
comment: Accepted to 2024 IEEE International Conference on Innovation in Science, Engineering and Technology (ICISET)
Direct Data-Driven Discrete-time Bilinear Biquadratic Regulator
We present a novel direct data-driven algorithm that learns an optimal control policy for the Bilinear Biquadratic Regulator (BBR) for an unknown bilinear system. The BBR is difficult to solve owing to the presence of the nonlinear biquadratic performance index and the bilinear cross-term in the dynamics. To address these difficulties, we apply several transformations on the state decision variables to obtain a nonlinear optimization problem with a linear performance index and affine (in the parameterized control) state-dependent equality. The adroit use of the Hamiltonian and Pontryagin's Minimum Principle allows us to derive a pair of first-order necessary conditions that, at each point in time, are easily solvable linear matrix equalities (LMEs) which give the optimal state-dependent control law. We then use the marginal sample autocorrelation of the collected data to obtain a direct data-driven equivalent of these LMEs. We demonstrate the performance of the proposed algorithm via illustrative numerical examples.
comment: 12 pages, 3 figures, Submitted to IEEE Control Systems Letters (L-CSS)
Nonlinear Model Predictive Control for Enhanced Path Tracking and Autonomous Drifting through Direct Yaw Moment Control and Rear-Wheel-Steering
Path tracking (PT) controllers capable of replicating race driving techniques, such as drifting beyond the limits of handling, have the potential of enhancing active safety in critical conditions. This paper presents a nonlinear model predictive control (NMPC) approach that integrates multiple actuation methods, namely four-wheel-steering, longitudinal tyre force distribution, and direct yaw moment control, to execute drifting when this is beneficial for PT in emergency scenarios. Simulation results of challenging manoeuvres, based on an experimentally validated vehicle model, highlight the substantial PT performance improvements brought by: i) vehicle operation outside the envelope enforced by the current generation of stability controllers; and ii) the integrated control of multiple actuators.
comment: 7 pages, 2 figures, published in the 16th International Symposium on Advanced Vehicle Control. AVEC 2024. Lecture Notes in Mechanical Engineering. Springer, Cham, pp. 854-861, 2024
Approximate non-linear model predictive control with safety-augmented neural networks
Model predictive control (MPC) achieves stability and constraint satisfaction for general nonlinear systems, but requires computationally expensive online optimization. This paper studies approximations of such MPC controllers via neural networks (NNs) to achieve fast online evaluation. We propose safety augmentation that yields deterministic guarantees for convergence and constraint satisfaction despite approximation inaccuracies. We approximate the entire input sequence of the MPC with NNs, which allows us to verify online if it is a feasible solution to the MPC problem. We replace the NN solution by a safe candidate based on standard MPC techniques whenever it is infeasible or has worse cost. Our method requires a single evaluation of the NN and forward integration of the input sequence online, which is fast to compute on resource-constrained systems. The proposed control framework is illustrated using two numerical non-linear MPC benchmarks of different complexity, demonstrating computational speedups that are orders of magnitude higher than online optimization. In the examples, we achieve deterministic safety through the safety-augmented NNs, where a naive NN implementation fails.
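The selection rule in this abstract (use the NN input sequence only if it is feasible and no worse in cost than a safe candidate) can be sketched as follows. This is a simplified illustration, not the paper's implementation: the box constraints, the quadratic cost, the omission of a terminal set, and all function names (`rollout`, `is_feasible`, `cost`, `select_inputs`) are assumptions, and the safe candidate is simply passed in rather than constructed from a previous MPC solution.

```python
def rollout(x0, inputs, step):
    """Forward-integrate the input sequence from x0 using the model `step`."""
    states = [x0]
    for u in inputs:
        states.append(step(states[-1], u))
    return states


def is_feasible(states, inputs, x_max, u_max):
    """Check assumed box constraints |x| <= x_max and |u| <= u_max."""
    return (all(abs(x) <= x_max for x in states)
            and all(abs(u) <= u_max for u in inputs))


def cost(states, inputs):
    """Assumed quadratic cost over the whole predicted trajectory."""
    return sum(x * x for x in states) + sum(u * u for u in inputs)


def select_inputs(x0, nn_inputs, safe_inputs, step, x_max, u_max):
    """Keep the NN sequence only if it is feasible and not worse than the safe one."""
    nn_states = rollout(x0, nn_inputs, step)
    safe_states = rollout(x0, safe_inputs, step)
    nn_ok = is_feasible(nn_states, nn_inputs, x_max, u_max)
    if nn_ok and cost(nn_states, nn_inputs) <= cost(safe_states, safe_inputs):
        return nn_inputs
    return safe_inputs  # fall back to the safe candidate
```

The online work is one NN evaluation (done by the caller) plus two forward simulations and a comparison, which matches the abstract's point that the check is cheap compared with solving the optimization problem.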
Deterministic Trajectory Optimization through Probabilistic Optimal Control
This article proposes two new algorithms tailored to discrete-time deterministic finite-horizon nonlinear optimal control problems or so-called trajectory optimization problems. Both algorithms are inspired by a novel theoretical paradigm known as probabilistic optimal control, which reformulates optimal control as an equivalent probabilistic inference problem. This perspective allows us to address the problem using the Expectation-Maximization algorithm. We show that the application of this algorithm results in a fixed point iteration of probabilistic policies that converge to the deterministic optimal policy. Two strategies for policy evaluation are discussed, using state-of-the-art uncertainty quantification methods and resulting in two distinct algorithms. The algorithms are structurally most closely related to the differential dynamic programming algorithm and related methods that use sigma-point methods to avoid direct gradient evaluations. The main advantage of our work is an improved balance between exploration and exploitation over the iterations, leading to improved numerical stability and accelerated convergence. These properties are demonstrated on different nonlinear systems.
Differentially Private Distributed Nonconvex Stochastic Optimization with Quantized Communication
This paper proposes a new distributed nonconvex stochastic optimization algorithm that can achieve privacy protection, communication efficiency and convergence simultaneously. Specifically, each node adds general privacy noises to its local state to avoid information leakage, and then quantizes its noise-perturbed state before transmitting to improve communication efficiency. By using a subsampling method controlled through the sample-size parameter, the proposed algorithm reduces cumulative differential privacy parameters $\epsilon$, $\delta$, and thus enhances the differential privacy level, which is significantly different from the existing works. By using a two-time-scale step-sizes method, the mean square convergence for nonconvex cost functions is given. Furthermore, when the global cost function satisfies the Polyak-Łojasiewicz condition, the convergence rate and the oracle complexity of the proposed algorithm are given. In addition, the proposed algorithm achieves both the mean square convergence and finite cumulative differential privacy parameters $\epsilon$, $\delta$ over infinite iterations as the sample-size goes to infinity. A numerical example of the distributed training on the "MNIST" dataset is given to show the effectiveness of the algorithm.
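The per-node step described above (perturb the local state with privacy noise, then quantize before transmitting) can be sketched as below. The abstract speaks only of "general privacy noises" and quantization, so Laplace noise and an unbiased stochastic uniform quantizer are assumed here for illustration; the function names and parameters are hypothetical.

```python
import math
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def stochastic_quantize(value, step, rng):
    """Round to a multiple of `step`, rounding up with probability equal to the
    fractional part so that the quantizer is unbiased in expectation."""
    lower = math.floor(value / step)
    frac = value / step - lower
    level = lower + (1 if rng.random() < frac else 0)
    return level * step


def message_to_send(state, noise_scale, step, rng):
    """Perturb each state entry with noise, then quantize it for transmission."""
    return [stochastic_quantize(s + laplace_noise(noise_scale, rng), step, rng)
            for s in state]
```

A node would call `message_to_send(local_state, noise_scale, step, random.Random(seed))` once per iteration and transmit only the quantized values to its neighbors.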
Shock waves in nonlinear transmission lines
In the first half of the paper we consider interaction between the small amplitude travelling waves ("sound") and the shock waves in the transmission line containing both nonlinear capacitors and nonlinear inductors. We calculate the "sound" wave coefficient of reflection from (coefficient of transmission through) the shock wave. These coefficients are expressed in terms of the speeds of the "sound" waves relative to the shock and the wave impedances. In the second half of the paper we explicitly include into consideration the dissipation in the system, introducing ohmic resistors shunting the inductors and also in series with the capacitors. This allows us to justify the conditions on the shocks, postulated in the first half of the paper. This also allows us to describe the shocks as physical objects of finite width and study their profiles, as well as the profiles of the waves closely connected with the shocks, the kinks. The profiles of the latter, and in some particular cases the profiles of the former, were obtained in terms of elementary functions.
comment: pdfLaTeX, 8 pages, 4 figures. As published in Physica Status Solidi B DOI: 10.1002/pssb.202400335
Trustworthy V2G scheduling and energy trading: A blockchain-based framework
The rapid growth of electric vehicles (EVs) and the deployment of vehicle-to-grid (V2G) technology pose significant challenges for distributed power grids, particularly in fostering trust and ensuring effective coordination among stakeholders. Establishing a trustworthy V2G operation environment is crucial for enabling large-scale EV user participation and realizing V2G potential in real-world applications. In this paper, an integrated scheduling and trading framework is developed to conduct transparent and efficacious coordination in V2G operations. In blockchain implementation, a cyber-physical blockchain architecture is proposed to enhance transaction efficiency and scalability by leveraging smart charging points (SCPs) for rapid transaction validation through a fast-path practical byzantine fault tolerance (fast-path PBFT) consensus mechanism. From the energy dispatching perspective, a game-theoretical pricing strategy is employed and smart contracts are utilized for autonomous decision-making between EVs and operators, aiming to optimize the trading process and maximize economic benefits. Numerical evaluation of blockchain consensus shows the effect of the fast-path PBFT consensus in improving system scalability with a balanced trade-off in robustness. A case study, utilizing real-world data from the Southern University of Science and Technology (SUSTech), demonstrates significant reductions in EV charging costs and the framework's potential to support auxiliary grid services.
Probabilistic Load Forecasting of Distribution Power Systems based on Empirical Copulas
Accurate and reliable electricity load forecasts are becoming increasingly important as the share of intermittent resources in the system increases. Distribution System Operators (DSOs) are called to accurately forecast their production and consumption to place optimal bids in the day-ahead market. Forecasts must account for the volatility of weather parameters, which impacts both the production and consumption of electricity. If DSO-loads are small or lower-granularity forecasts are needed, parametric statistical methods may fail to provide reliable performance since they rely on a priori statistical distributions of the variables to forecast. In this paper, we introduce a Probabilistic Load Forecast (PLF) method based on Empirical Copulas (ECs). The model is data-driven and needs no a priori assumption on the parametric distribution of the variables or on the dependence structure (copula). It employs a kernel density estimate of the underlying distribution using beta kernels that have bounded support on the unit hypercube. The method naturally supports variables with widely different distributions, such as weather data (including forecasted ones) and historic electricity consumption, and produces a conditional probability distribution for every time step in the forecast, which allows inferring the quantiles of interest. The proposed non-parametric approach differs significantly from previous forecasting methods based on copulas, which typically use copulas to model hierarchical dependence. The bandwidth of the beta kernel density estimators is optimized using Integrated Square Error (ISE). We present results from an open dataset and showcase the strength of the model with respect to Quantile Regression (QR) using standard probabilistic evaluation metrics.
comment: Submitted to Sustainable Energy, Grids and Networks (SEGAN), October 8, 2024
Robotics
Proprioceptive State Estimation for Quadruped Robots using Invariant Kalman Filtering and Scale-Variant Robust Cost Functions
Accurate state estimation is crucial for legged robot locomotion, as it provides the necessary information to allow control and navigation. However, it is also challenging, especially in scenarios with uneven and slippery terrain. This paper presents a new Invariant Extended Kalman filter for legged robot state estimation using only proprioceptive sensors. We formulate the methodology by combining recent advances in state estimation theory with the use of robust cost functions in the measurement update. We tested our methodology on quadruped robots through experiments and public datasets, showing that we can obtain a pose drift up to 40% lower in trajectories covering a distance of over 450m, in comparison with a state-of-the-art Invariant Extended Kalman filter.
comment: Accepted to the IEEE-RAS International Conference on Humanoid Robots 2024
ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control
We consider deep deterministic policy gradient (DDPG) in the context of reinforcement learning with sparse rewards. To enhance exploration, we introduce a search procedure, $\epsilon t$-greedy, which generates exploratory options for exploring less-visited states. We prove that search using $\epsilon t$-greedy has polynomial sample complexity under mild MDP assumptions. To more efficiently use the information provided by rewarded transitions, we develop a new dual experience replay buffer framework, GDRB, and implement longest $n$-step returns. The resulting algorithm, ETGL-DDPG, integrates all three techniques: $\epsilon t$-greedy, GDRB, and Longest $n$-step, into DDPG. We evaluate ETGL-DDPG on standard benchmarks and demonstrate that it outperforms DDPG, as well as other state-of-the-art methods, across all tested sparse-reward continuous environments. Ablation studies further highlight how each strategy individually enhances the performance of DDPG in this setting.
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
Building on the advancements of Large Language Models (LLMs) and Vision Language Models (VLMs), recent research has introduced Vision-Language-Action (VLA) models as an integrated solution for robotic manipulation tasks. These models take camera images and natural language task instructions as input and directly generate control actions for robots to perform specified tasks, greatly improving both decision-making capabilities and interaction with human users. However, the data-driven nature of VLA models, combined with their lack of interpretability, makes the assurance of their effectiveness and robustness a challenging task. This highlights the need for a reliable testing and evaluation platform. For this purpose, in this work, we propose LADEV, a comprehensive and efficient platform specifically designed for evaluating VLA models. We first present a language-driven approach that automatically generates simulation environments from natural language inputs, mitigating the need for manual adjustments and significantly improving testing efficiency. Then, to further assess the influence of language input on the VLA models, we implement a paraphrase mechanism that produces diverse natural language task instructions for testing. Finally, to expedite the evaluation process, we introduce a batch-style method for conducting large-scale testing of VLA models. Using LADEV, we conducted experiments on several state-of-the-art VLA models, demonstrating its effectiveness as a tool for evaluating these models. Our results showed that LADEV not only enhances testing efficiency but also establishes a solid baseline for evaluating VLA models, paving the way for the development of more intelligent and advanced robotic systems.
comment: 8 pages, 4 figures
State Estimation of Marine Vessels Affected by Waves by Unmanned Aerial Vehicles
A novel approach for robust state estimation of marine vessels in rough water is proposed in this paper to enable tight collaboration between Unmanned Aerial Vehicles (UAVs) and a marine vessel, such as cooperative landing or object manipulation, regardless of weather conditions. Our study of marine vessel (in our case Unmanned Surface Vehicle (USV)) dynamics influenced by strong wave motion has resulted in a novel nonlinear mathematical USV model with 6 degrees of freedom (DOFs), which is required for precise USV state estimation and motion prediction. The proposed state estimation approach fuses data from multiple sensors onboard the UAV and the USV to enable redundancy and robustness under varying weather conditions of real-world applications. The proposed approach provides estimated states of the USV with 6 DOFs and predicts its future states to enable tight control of both vehicles on a receding control horizon. The proposed approach was extensively tested in the realistic Gazebo simulator and successfully experimentally validated in many real-world experiments representing different application scenarios, including agile landing on an oscillating and moving USV. A comparative study indicates that the proposed approach significantly surpassed the current state-of-the-art.
MARs: Multi-view Attention Regularizations for Patch-based Feature Recognition of Space Terrain ECCV 2024
The visual detection and tracking of surface terrain is required for spacecraft to safely land on or navigate within close proximity to celestial objects. Current approaches rely on template matching with pre-gathered patch-based features, which are expensive to obtain and a limiting factor in perceptual capability. While recent literature has focused on in-situ detection methods to enhance navigation and operational autonomy, robust description is still needed. In this work, we explore metric learning as the lightweight feature description mechanism and find that current solutions fail to address inter-class similarity and multi-view observational geometry. We attribute this to the view-unaware attention mechanism and introduce Multi-view Attention Regularizations (MARs) to constrain the channel and spatial attention across multiple feature views, regularizing the what and where of attention focus. We thoroughly analyze many modern metric learning losses with and without MARs and demonstrate improved terrain-feature recognition performance by upwards of 85%. We additionally introduce the Luna-1 dataset, consisting of Moon crater landmarks and reference navigation frames from NASA mission data to support future research in this difficult task. Luna-1 and source code are publicly available at https://droneslab.github.io/mars/.
comment: ECCV 2024. Project page available at https://droneslab.github.io/mars/
Real-Time Truly-Coupled Lidar-Inertial Motion Correction and Spatiotemporal Dynamic Object Detection IROS
Over the past decade, lidars have become a cornerstone of robotics state estimation and perception thanks to their ability to provide accurate geometric information about their surroundings in the form of 3D scans. Unfortunately, most present-day lidars do not take snapshots of the environment but sweep the environment over a period of time (typically around 100 ms). Such a rolling-shutter-like mechanism introduces motion distortion into the collected lidar scan, thus hindering downstream perception applications. In this paper, we present a novel method for motion distortion correction of lidar data by tightly coupling lidar with Inertial Measurement Unit (IMU) data. The motivation of this work is map-free dynamic object detection based on lidar. The proposed lidar data undistortion method relies on continuous preintegration of IMU measurements, which allows parameterising the sensors' continuous 6-DoF trajectory using solely eleven discrete state variables (biases, initial velocity, and gravity direction). The undistortion consists of feature-based distance minimisation of point-to-line and point-to-plane residuals in a non-linear least-squares formulation. Given undistorted geometric data over a short temporal window, the proposed pipeline computes the spatiotemporal normal vector of each of the lidar points. The temporal component of the normals is a proxy for the corresponding point's velocity, therefore allowing for learning-free dynamic object classification without the need for registration in a global reference frame. We demonstrate the soundness of the proposed method and its different components using public datasets and compare them with state-of-the-art lidar-inertial state estimation and dynamic object detection algorithms.
comment: Paper presented at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
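The point-to-line and point-to-plane residuals named in the abstract above are standard geometric distances. The sketch below shows only those two distances in plain Python; the helper names are hypothetical, and the paper's feature association, IMU preintegration, and least-squares solver are not reproduced here.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a: Vec3) -> float:
    return math.sqrt(_dot(a, a))


def point_to_line_distance(p: Vec3, a: Vec3, b: Vec3) -> float:
    """Distance from p to the infinite line through a and b (a != b)."""
    direction = _sub(b, a)
    return _norm(_cross(_sub(p, a), direction)) / _norm(direction)


def point_to_plane_distance(p: Vec3, a: Vec3, b: Vec3, c: Vec3) -> float:
    """Unsigned distance from p to the plane through non-collinear a, b, c."""
    normal = _cross(_sub(b, a), _sub(c, a))
    return abs(_dot(_sub(p, a), normal)) / _norm(normal)
```

In an undistortion pipeline of this kind, each residual would be evaluated on a point after it has been transformed by the estimated trajectory, and the optimiser would adjust the trajectory parameters to shrink the sum of squared residuals.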
Towards a Modern and Lightweight Rendering Engine for Dynamic Robotic Simulations
Interactive dynamic simulators are an accelerator for developing novel robotic control algorithms and complex systems involving humans and robots. In user training and synthetic data generation applications, a high-fidelity visualization of the simulation is essential. Visual fidelity is dependent on the quality of the computer graphics algorithms used to render the simulated scene. Furthermore, the rendering algorithms must be implemented on the graphics processing unit (GPU) to achieve real-time performance, requiring the use of a graphics application programming interface (API). This paper presents a performance-focused and lightweight rendering engine supporting the Vulkan graphics API. The engine is designed to modernize the legacy rendering pipeline of Asynchronous Multi-Body Framework (AMBF), a dynamic simulation framework used extensively for interactive robotics simulation development. This new rendering engine implements graphical features such as physically based rendering (PBR), anti-aliasing, and ray-traced shadows, significantly improving the image quality of AMBF. Computational experiments show that the engine can render a simulated scene with over seven million triangles while maintaining GPU computation times within two milliseconds.
comment: 8 pages, 8 figures, submitted to the 2024 IEEE International Conference on Robotic Computing (IRC)
Reinforcement Learning Control for Autonomous Hydraulic Material Handling Machines with Underactuated Tools IROS 2024
The precise and safe control of heavy material handling machines presents numerous challenges due to the hard-to-model hydraulically actuated joints and the need for collision-free trajectory planning with a free-swinging end-effector tool. In this work, we propose an RL-based controller that commands the cabin joint and the arm simultaneously. It is trained in a simulation combining data-driven modeling techniques with first-principles modeling. On the one hand, we employ a neural network model to capture the highly nonlinear dynamics of the upper carriage turn hydraulic motor, incorporating explicit pressure prediction to handle delays better. On the other hand, we model the arm as velocity-controllable and the free-swinging end-effector tool as a damped pendulum using first principles. This combined model enhances our simulation environment, enabling the training of RL controllers that can be directly transferred to the real machine. Designed to reach steady-state Cartesian targets, the RL controller learns to leverage the hydraulic dynamics to improve accuracy, maintain high speeds, and minimize end-effector tool oscillations. Our controller, tested on a mid-size prototype material handler, is more accurate than an inexperienced operator and causes fewer tool oscillations. It demonstrates competitive performance even compared to an experienced professional driver.
comment: Presented at IROS 2024, Abu Dhabi, as oral presentation
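The abstract above models the free-swinging tool as a damped pendulum. A minimal sketch of such a model follows, assuming a fixed pivot, a point mass, and viscous damping proportional to angular velocity; all parameter names and values are hypothetical, and the coupling to the moving arm that a real simulation would need is left out.

```python
import math
from typing import List


def simulate_damped_pendulum(theta0: float, omega0: float, length: float,
                             damping: float, dt: float, steps: int,
                             g: float = 9.81) -> List[float]:
    """Integrate theta'' = -(g / length) * sin(theta) - damping * theta'
    with semi-implicit Euler and return the angle at every step.

    Assumption: the pivot is stationary. theta is in radians, length in
    metres, damping in 1/s, dt in seconds.
    """
    theta, omega = theta0, omega0
    angles = [theta]
    for _ in range(steps):
        alpha = -(g / length) * math.sin(theta) - damping * omega
        omega += alpha * dt   # update velocity first (semi-implicit step)
        theta += omega * dt   # then advance the angle with the new velocity
        angles.append(theta)
    return angles
```

For example, `simulate_damped_pendulum(0.5, 0.0, 1.0, 0.5, 0.001, 20000)` simulates 20 s of motion starting from a 0.5 rad offset; the oscillation amplitude shrinks over time, which is the behaviour a controller would try to accelerate.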
HE-Nav: A High-Performance and Efficient Navigation System for Aerial-Ground Robots in Cluttered Environments
Existing aerial-ground robot (AGR) navigation systems have advanced in lightly occluded scenarios (e.g., buildings) by employing 3D semantic scene completion networks for voxel occupancy prediction and constructing Euclidean Signed Distance Field (ESDF) maps for collision-free path planning. However, these systems exhibit suboptimal performance and efficiency in cluttered environments with severe occlusions (e.g., dense forests or tall walls), due to the low prediction accuracy of perception networks and the high computational overhead of path planners. In this paper, we present HE-Nav, the first high-performance and efficient navigation system tailored for AGRs operating in cluttered environments. The perception module utilizes a lightweight semantic scene completion network (LBSCNet), guided by bird's eye view (BEV) feature fusion and enhanced by a carefully designed SCB-Fusion module and attention mechanism. This enables real-time and efficient obstacle prediction in cluttered areas, generating a complete local map. Building upon this completed map, our novel AG-Planner employs an energy-efficient kinodynamic A* search algorithm to ensure energy-saving planning. Subsequent trajectory optimization processes yield safe, smooth, dynamically feasible and ESDF-free aerial-ground hybrid paths. Extensive experiments demonstrate that HE-Nav achieved 7x energy savings in real-world situations while maintaining planning success rates of 98% in simulation scenarios. Code and video are available on our project page: https://jmwang0117.github.io/HE-Nav/.
comment: Accepted to IEEE RA-L
Control-oriented Clustering of Visual Latent Representation
We initiate a study of the geometry of the visual representation space -- the information channel from the vision encoder to the action decoder -- in an image-based control pipeline learned from behavior cloning. Inspired by the phenomenon of neural collapse (NC) in image classification, we investigate whether a similar law of clustering emerges in the visual representation space. Since image-based control is a regression task without explicitly defined classes, the central piece of the puzzle lies in determining according to what implicit classes the visual features cluster, if such a law exists. Focusing on image-based planar pushing, we posit the most important role of the visual representation in a control task is to convey a goal to the action decoder. We then classify training samples of expert demonstrations into eight "control-oriented" classes based on (a) the relative pose between the object and the target in the input or (b) the relative pose of the object induced by expert actions in the output, where one class corresponds to one relative pose orthant (REPO). Across four different instantiations of architecture, we report the prevalent emergence of control-oriented clustering in the visual representation space according to the eight REPOs. Beyond empirical observation, we show such a law of clustering can be leveraged as an algorithmic tool to improve test-time performance when training a policy with limited expert demonstrations. Particularly, we pretrain the vision encoder using NC as a regularization to encourage control-oriented clustering of the visual features. Surprisingly, such an NC-pretrained vision encoder, when finetuned end-to-end with the action decoder, boosts the test-time performance by 10% to 35% in the low-data regime. Real-world vision-based planar pushing experiments confirmed the surprising advantage of control-oriented visual representation pretraining.
HE-Drive: Human-Like End-to-End Driving with Vision Language Models
In this paper, we propose HE-Drive: the first human-like end-to-end autonomous driving system to generate trajectories that are both temporally consistent and comfortable. Recent studies have shown that imitation learning-based planners and learning-based trajectory scorers can effectively generate and select accurate trajectories that closely mimic expert demonstrations. However, such trajectory planners and scorers face the dilemma of generating temporally inconsistent and uncomfortable trajectories. To solve these problems, HE-Drive first extracts key 3D spatial representations through sparse perception, which then serve as conditional inputs for a motion planner based on Conditional Denoising Diffusion Probabilistic Models (DDPMs) to generate temporally consistent multi-modal trajectories. A trajectory scorer guided by Vision-Language Models (VLMs) subsequently selects the most comfortable trajectory from these candidates to control the vehicle, ensuring human-like end-to-end driving. Experiments show that HE-Drive not only achieves state-of-the-art performance (i.e., reduces the average collision rate by 71% compared with VAD) and efficiency (i.e., 1.9X faster than SparseDrive) on the challenging nuScenes and OpenScene datasets but also provides the most comfortable driving experience on real-world data. For more information, visit the project website: https://jmwang0117.github.io/HE-Drive/.
Can LLMs plan paths with extra hints from solvers?
Large Language Models (LLMs) have shown remarkable capabilities in natural language processing, mathematical problem solving, and tasks related to program synthesis. However, their effectiveness in long-term planning and higher-order reasoning has been noted to be limited and fragile. This paper explores an approach for enhancing LLM performance in solving a classical robotic planning task by integrating solver-generated feedback. We explore four different strategies for providing feedback, including visual feedback; we also utilize fine-tuning and evaluate the performance of three different LLMs across 10 standard and 100 additional randomly generated planning problems. Our results suggest that the solver-generated feedback improves the LLM's ability to solve the moderately difficult problems, but the harder problems still remain out of reach. The study provides detailed analysis of the effects of the different hinting strategies and the different planning tendencies of the evaluated LLMs.
PhotoReg: Photometrically Registering 3D Gaussian Splatting Models
Building accurate representations of the environment is critical for intelligent robots to make decisions during deployment. Advances in photorealistic environment models have enabled robots to develop hyper-realistic reconstructions, which can be used to generate images that are intuitive for human inspection. In particular, the recently introduced 3D Gaussian Splatting (3DGS), which describes the scene with up to millions of primitive ellipsoids, can be rendered in real time. 3DGS has rapidly gained prominence. However, a critical unsolved problem persists: how can we fuse multiple 3DGS models into a single coherent model? Solving this problem will enable robot teams to jointly build 3DGS models of their surroundings. A key insight of this work is to leverage the duality between photorealistic reconstructions, which render realistic 2D images from 3D structure, and 3D foundation models, which predict 3D structure from image pairs. To this end, we develop PhotoReg, a framework to register multiple photorealistic 3DGS models with 3D foundation models. As 3DGS models are generally built from monocular camera images, they have arbitrary scale. To resolve this, PhotoReg actively enforces scale consistency among the different 3DGS models by considering depth estimates within these models. Then, the alignment is iteratively refined with fine-grained photometric losses to produce high-quality fused 3DGS models. We rigorously evaluate PhotoReg on both standard benchmark datasets and our custom-collected datasets, including with two quadruped robots. The code is released at ziweny11.github.io/photoreg.
GARField: Addressing the visual Sim-to-Real gap in garment manipulation with mesh-attached radiance fields
While humans intuitively manipulate garments and other textile items swiftly and accurately, doing so is a significant challenge for robots. A factor crucial to human performance is the ability to imagine, a priori, the intended result of a manipulation and hence develop predictions of the garment pose. This allows us to plan from highly obstructed states, adapt our plans as we collect more information, and react swiftly to unforeseen circumstances. Robots, on the other hand, struggle to establish such intuitions and form tight links between plans and observations. This can be attributed in part to the high cost of obtaining densely labelled data for textile manipulation, both in quality and quantity. The problem of data collection is a long-standing issue in data-based approaches to garment manipulation. Currently, the generation of high-quality, labelled garment manipulation data is mainly attempted through advanced data capture procedures that create simplified state estimations from real-world observations. In this work, however, we propose to generate real-world observations from given object states. To achieve this, we present GARField (Garment Attached Radiance Field), a differentiable rendering architecture allowing data generation from simulated states stored as triangle meshes. Code will be available at https://ddonatien.github.io/garfield-website/
comment: Project site: https://ddonatien.github.io/garfield-website/
Active Fine-Tuning of Generalist Policies
Pre-trained generalist policies are rapidly gaining relevance in robot learning due to their promise of fast adaptation to novel, in-domain tasks. This adaptation often relies on collecting new demonstrations for a specific task of interest and applying imitation learning algorithms, such as behavioral cloning. However, as soon as several tasks need to be learned, we must decide which tasks should be demonstrated and how often. We study this multi-task problem and explore an interactive framework in which the agent adaptively selects the tasks to be demonstrated. We propose AMF (Active Multi-task Fine-tuning), an algorithm to maximize multi-task policy performance under a limited demonstration budget by collecting demonstrations yielding the largest information gain on the expert policy. We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness in efficiently fine-tuning neural policies in complex and high-dimensional environments.
Enhanced Multi-Robot SLAM System with Cross-Validation Matching and Exponential Threshold Keyframe Selection
The evolving field of mobile robotics has indeed increased the demand for simultaneous localization and mapping (SLAM) systems. To augment the localization accuracy and mapping efficacy of SLAM, we refined the core module of the SLAM system. Within the feature matching phase, we introduced cross-validation matching to filter out mismatches. In the keyframe selection strategy, an exponential threshold function is constructed to quantify the keyframe selection process. Compared with a single robot, the multi-robot collaborative SLAM (CSLAM) system substantially improves task execution efficiency and robustness. By employing a centralized structure, we formulate a multi-robot SLAM system and design a coarse-to-fine matching approach for multi-map point cloud registration. Our system, built upon ORB-SLAM3, underwent extensive evaluation utilizing the TUM RGB-D, EuRoC MAV, and TUM_VI datasets. The experimental results demonstrate a significant improvement in the positioning accuracy and mapping quality of our enhanced algorithm compared to those of ORB-SLAM3, with a 12.90% reduction in the absolute trajectory error.
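The abstract above does not spell out how cross-validation matching works. One common way to filter mismatches between two sets of binary descriptors (such as ORB) is a mutual nearest-neighbour check, where a pair is kept only if each descriptor is the other's best match. The sketch below assumes that interpretation; the function names are hypothetical and descriptors are represented as Python integers for brevity.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def nearest(query: int, candidates: List[int]) -> int:
    """Index of the candidate closest to query (first index wins ties)."""
    return min(range(len(candidates)),
               key=lambda j: hamming(query, candidates[j]))


def cross_check_matches(desc_a: List[int],
                        desc_b: List[int]) -> List[Tuple[int, int]]:
    """Keep (i, j) only if j is the best match for i AND i is the best
    match for j. Assumed reading of "cross-validation matching"."""
    if not desc_a or not desc_b:
        return []
    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]
```

For `desc_a = [0b0000, 0b1111, 0b0011]` and `desc_b = [0b1110, 0b0001]`, the third descriptor in `desc_a` picks index 1 in `desc_b`, but that descriptor prefers index 0 in return, so the pair is discarded as a likely mismatch.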
Anticipating Human Behavior for Safe Navigation and Efficient Collaborative Manipulation with Mobile Service Robots
The anticipation of human behavior is a crucial capability for robots to interact with humans safely and efficiently. We employ a smart edge sensor network to provide global observations along with future predictions and goal information to integrate anticipatory behavior for the control of a mobile manipulation robot. We present approaches to anticipate human behavior in the context of safe navigation and a collaborative mobile manipulation task. First, we anticipate human motion by employing projections of human trajectories from smart edge sensor network observations into the planning map of a mobile robot. Second, we anticipate human intentions in a collaborative furniture-carrying task to achieve a given goal. Our experiments indicate that anticipating human behavior allows for safer navigation and more efficient collaboration. Finally, we showcase an integrated system that anticipates human behavior and collaborates with a human to achieve a target room layout, including the placement of tables and chairs.
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
Learning complex robot behavior through interactions with the environment necessitates principled exploration. Effective strategies should prioritize exploring regions of the state-action space that maximize rewards, with optimistic exploration emerging as a promising direction aligned with this idea and enabling sample-efficient reinforcement learning. However, existing methods overlook a crucial aspect: the need for optimism to be informed by a belief connecting the reward and state. To address this, we propose a practical, theoretically grounded approach to optimistic exploration based on Thompson sampling. Our model structure is the first that allows for reasoning about joint uncertainty over transitions and rewards. We apply our method on a set of MuJoCo and VMAS continuous control tasks. Our experiments demonstrate that optimistic exploration significantly accelerates learning in environments with sparse rewards, action penalties, and difficult-to-explore regions. Furthermore, we provide insights into when optimism is beneficial and emphasize the critical role of model uncertainty in guiding exploration.
Goal-Conditioned Terminal Value Estimation for Real-time and Multi-task Model Predictive Control
While model predictive control (MPC) enables nonlinear feedback control by solving an optimal control problem at each timestep, the computational burden tends to be large, making it difficult to optimize a policy within the control period. To address this issue, one possible approach is to utilize terminal value learning to reduce computational costs. However, the learned value cannot be used for other tasks in situations where the task dynamically changes in the original MPC setup. In this study, we develop an MPC framework with goal-conditioned terminal value learning to achieve multitask policy optimization while reducing computational time. Furthermore, by using a hierarchical control structure that allows the upper-level trajectory planner to output appropriate goal-conditioned trajectories, we demonstrate that a robot model is able to generate diverse motions. We evaluate the proposed method on a bipedal inverted pendulum robot model and confirm that combining goal-conditioned terminal value learning with an upper-level trajectory planner enables real-time control; thus, the robot successfully tracks a target trajectory on sloped terrain.
comment: 16 pages, 9 figures
Cloud-Based Scheduling Mechanism for Scalable and Resource-Efficient Centralized Controllers
This paper proposes a novel approach to address the challenges of deploying complex robotic software in large-scale systems, i.e., Centralized Nonlinear Model Predictive Controllers (CNMPCs) for multi-agent systems. The proposed approach is based on a Kubernetes-based scheduling mechanism designed to monitor and optimize the operation of CNMPCs, while addressing the scalability limitation of centralized control schemes. By leveraging a cluster in a real-time cloud environment, the proposed mechanism effectively offloads the computational burden of CNMPCs. Through experiments, we have demonstrated the effectiveness and performance of our system, especially in scenarios where the number of robots is subject to change. Our work contributes to the advancement of cloud-based control strategies and lays the foundation for enhanced performance in cloud-controlled robotic systems.
comment: 7 pages, 6 figures, IECON 2024
TeX-NeRF: Neural Radiance Fields from Pseudo-TeX Vision
Neural radiance fields (NeRF) have gained significant attention for their exceptional visual effects. However, most existing NeRF methods reconstruct 3D scenes from RGB images captured by visible light cameras. In practical scenarios like darkness, low light, or bad weather, visible light cameras become ineffective. Therefore, we propose TeX-NeRF, a 3D reconstruction method using only infrared images, which introduces the object material emissivity as a prior, preprocesses the infrared images using Pseudo-TeX vision, and maps the temperatures (T), emissivities (e), and textures (X) of the scene into the saturation (S), hue (H), and value (V) channels of the HSV color space, respectively. Novel view synthesis using the processed images has yielded excellent results. Additionally, we introduce 3D-TeX Datasets, the first dataset comprising infrared images and their corresponding Pseudo-TeX vision images. Experiments demonstrate that our method not only matches the quality of scene reconstruction achieved with high-quality RGB images but also provides accurate temperature estimations for objects in the scene.
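The channel mapping described in the abstract above (temperature to saturation, emissivity to hue, texture to value) can be sketched per pixel with Python's standard `colorsys` module. The normalisation ranges and function names below are assumptions made for illustration; the paper's actual preprocessing is not reproduced.

```python
import colorsys
from typing import Tuple


def normalize(x: float, lo: float, hi: float) -> float:
    """Clamp x to [lo, hi] and rescale it to [0, 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(max((x - lo) / (hi - lo), 0.0), 1.0)


def pseudo_tex_pixel(temperature: float, emissivity: float, texture: float,
                     t_min: float, t_max: float) -> Tuple[float, float, float]:
    """Map one (T, e, X) sample to an RGB triple via HSV.

    Assumptions: temperature is normalised over [t_min, t_max]; emissivity
    and texture are already in [0, 1]. The channel assignment follows the
    order given in the abstract: T -> S, e -> H, X -> V.
    """
    s = normalize(temperature, t_min, t_max)
    h = normalize(emissivity, 0.0, 1.0)
    v = normalize(texture, 0.0, 1.0)
    return colorsys.hsv_to_rgb(h, s, v)
```

Because hue is circular, emissivities of exactly 0 and 1 would map to the same colour under this naive scaling, so a practical pipeline would likely restrict the hue range.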
Predictive Spliner: Data-Driven Overtaking in Autonomous Racing Using Opponent Trajectory Prediction
Head-to-head racing against opponents is a challenging and emerging topic in the domain of autonomous racing. We propose Predictive Spliner, a data-driven overtaking planner that learns the behavior of opponents through Gaussian Process (GP) regression, which is then leveraged to compute viable overtaking maneuvers in future sections of the racing track. Experimentally validated on a 1:10 scale autonomous racing platform using Light Detection and Ranging (LiDAR) information to perceive the opponent, Predictive Spliner outperforms State-of-the-Art (SotA) algorithms by overtaking opponents at up to 83.1% of its own speed, being on average 8.4% faster than the previous best-performing method. Additionally, it achieves an average success rate of 84.5%, which is 47.6% higher than the previous best-performing method. The method maintains computational efficiency with a Central Processing Unit (CPU) load of 22.79% and a computation time of 8.4 ms, evaluated on a Commercial off-the-Shelf (CotS) Intel i7-1165G7, making it suitable for real-time robotic applications. These results highlight the potential of Predictive Spliner to enhance the performance and safety of autonomous racing vehicles. The code for Predictive Spliner is available at: https://github.com/ForzaETH/predictive-spliner.
comment: Submitted to RA-L
Unsupervised Skill Discovery for Robotic Manipulation through Automatic Task Generation
Learning skills that interact with objects is of major importance for robotic manipulation. These skills can indeed serve as an efficient prior for solving various manipulation tasks. We propose a novel Skill Learning approach that discovers composable behaviors by solving a large and diverse number of autonomously generated tasks. Our method learns skills allowing the robot to consistently and robustly interact with objects in its environment. The discovered behaviors are embedded in primitives which can be composed with Hierarchical Reinforcement Learning to solve unseen manipulation tasks. In particular, we leverage Asymmetric Self-Play to discover behaviors and Multiplicative Compositional Policies to embed them. We compare our method to Skill Learning baselines and find that our skills are more interactive. Furthermore, the learned skills can be used to solve a set of unseen manipulation tasks, in simulation as well as on a real robotic platform.
comment: Accepted at the 2024 IEEE-RAS International Conference on Humanoid Robots
A Planar-Symmetric SO(3) Representation for Learning Grasp Detection
Planar-symmetric hands, such as parallel grippers, are widely adopted in both research and industrial fields. Their symmetry, however, introduces ambiguity and discontinuity in the SO(3) representation, which hinders both the training and inference of neural-network-based grasp detectors. We propose a novel SO(3) representation that can parametrize a pair of planar-symmetric poses with a single parameter set by leveraging the 2D Bingham distribution. We also detail a grasp detector based on our representation, which provides a more consistent rotation output. An intensive evaluation with multiple grippers and objects in both the simulation and the real world quantitatively shows our approach's contribution.
comment: Accepted by CoRL2024
Data-driven Diffusion Models for Enhancing Safety in Autonomous Vehicle Traffic Simulations
Safety-critical traffic scenarios are integral to the development and validation of autonomous driving systems. These scenarios provide crucial insights into vehicle responses under high-risk conditions rarely encountered in real-world settings. Recent advancements in critical scenario generation have demonstrated the superiority of diffusion-based approaches over traditional generative models in terms of effectiveness and realism. However, current diffusion-based methods fail to adequately address the complexity of driver behavior and traffic density information, both of which significantly influence driver decision-making processes. In this work, we present a novel approach to overcome these limitations by introducing adversarial guidance functions for diffusion models that incorporate behavior complexity and traffic density, thereby enhancing the generation of more effective and realistic safety-critical traffic scenarios. The proposed method is evaluated on two metrics, effectiveness and realism, demonstrating better efficacy compared with other state-of-the-art methods.
comment: 6 pages, 1 Figure, 2 Tables
Domains as Objectives: Domain-Uncertainty-Aware Policy Optimization through Explicit Multi-Domain Convex Coverage Set Learning
Uncertainty is a feature of real-world robotics problems, and any control framework must contend with it in order to succeed in real application tasks. Reinforcement Learning is no different, and epistemic uncertainty arising from model uncertainty or misspecification is a challenge well captured by the sim-to-real gap. A simple solution to this issue is domain randomization (DR), which unfortunately can result in conservative agents. As a remedy to this conservativeness, the use of universal policies that take additional information about the randomized domain has risen as an alternative solution, along with recurrent neural network-based controllers. Uncertainty-aware universal policies present a particularly compelling solution able to account for system identification uncertainties during deployment. In this paper, we reveal that the challenge of efficiently optimizing uncertainty-aware policies can be fundamentally reframed as solving the convex coverage set (CCS) problem within a multi-objective reinforcement learning (MORL) context. By introducing a novel Markov decision process (MDP) framework where each domain's performance is treated as an independent objective, we unify the training of uncertainty-aware policies with MORL approaches. This connection enables the application of MORL algorithms for domain randomization (DR), allowing for more efficient policy optimization. To illustrate this, we focus on the linear utility function, which aligns with the expectation in DR formulations, and propose a series of algorithms adapted from the MORL literature to solve the CCS, demonstrating their ability to enhance the performance of uncertainty-aware policies.
comment: 27 pages, 9 figures, 12 tables, under review by IJRR
Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting
We propose a framework for active next best view and touch selection for robotic manipulators using 3D Gaussian Splatting (3DGS). 3DGS is emerging as a useful explicit 3D scene representation for robotics, as it has the ability to represent scenes in a both photorealistic and geometrically accurate manner. However, in real-world, online robotic scenes where the number of views is limited given efficiency requirements, random view selection for 3DGS becomes impractical as views are often overlapping and redundant. We address this issue by proposing an end-to-end online training and active view selection pipeline, which enhances the performance of 3DGS in few-view robotics settings. We first elevate the performance of few-shot 3DGS with a novel semantic depth alignment method using Segment Anything Model 2 (SAM2) that we supplement with Pearson depth and surface normal loss to improve color and depth reconstruction of real-world scenes. We then extend FisherRF, a next-best-view selection method for 3DGS, to select views and touch poses based on depth uncertainty. We perform online view selection on a real robot system during live 3DGS training. We motivate our improvements to few-shot GS scenes, and extend depth-based FisherRF to them, where we demonstrate both qualitative and quantitative improvements on challenging robot scenes. For more information, please see our project page at https://armlabstanford.github.io/next-best-sense.
A Universal Formulation for Path-Parametric Planning and Control
This work presents a unified framework for path-parametric planning and control. This formulation is universal as it standardizes the entire spectrum of path-parametric techniques -- from traditional path following to more recent contouring or progress-maximizing Model Predictive Control and Reinforcement Learning -- under a single framework. The ingredients underlying this universality are twofold: First, we present a compact and efficient technique capable of computing singularity-free, smooth and differentiable moving frames. Second, we derive a spatial path parameterization of the Cartesian coordinates applicable to any arbitrary curve without prior assumptions on its parametric speed or moving frame, and that perfectly interplays with the aforementioned path parameterization method. The combination of these two ingredients leads to a planning and control framework that brings together existing path-parametric techniques in the literature. Aiming to unify all these approaches, we open source PACOR, a software library that implements the presented content, thereby providing a self-contained toolkit for the formulation of path-parametric planning and control methods.
comment: Preprint. Code: https://github.com/jonarriza96/PACOR
FogROS2-PLR: Probabilistic Latency-Reliability For Cloud Robotics
Cloud robotics enables robots to offload computationally intensive tasks to cloud servers for performance, cost, and ease of management. However, the network and cloud computing infrastructure are not designed for reliable timing guarantees, due to fluctuating Quality-of-Service (QoS). In this work, we formulate an impossibility triangle theorem for: Latency reliability, Singleton server, and Commodity hardware. The LSC theorem suggests that providing replicated servers with uncorrelated failures can exponentially reduce the probability of missing a deadline. We present FogROS2-Probabilistic Latency Reliability (PLR) that uses multiple independent network interfaces to send requests to replicated cloud servers and uses the first response back. We design routing mechanisms to discover, connect, and route through non-default network interfaces on robots. FogROS2-PLR optimizes the selection of interfaces to servers to minimize the probability of missing a deadline. We conduct a cloud-connected driving experiment with two 5G service providers, demonstrating that FogROS2-PLR effectively provides smooth service quality even if one of the service providers experiences low coverage and base station handover. We use 99th-percentile (P99) latency to evaluate anomalous long-tail latency behavior. In one experiment, FogROS2-PLR improves P99 latency by up to 3.7x compared to using one service provider. We deploy FogROS2-PLR on a physical Stretch 3 robot performing an indoor human-tracking task. Even in a fully covered Wi-Fi and 5G environment, FogROS2-PLR improves the responsiveness of the robot, reducing mean latency by 36% and P99 latency by 33%.
comment: Submitted to 2025 IEEE International Conference on Robotics & Automation
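The claim that replicas with uncorrelated failures reduce the deadline-miss probability exponentially can be illustrated with a short calculation. The sketch below is not taken from FogROS2-PLR; the function name and the per-path miss probabilities are hypothetical, and it assumes that paths miss the deadline independently, so the request misses only if every path misses.

```python
def miss_probability(per_path_miss):
    """Probability that every replicated request misses the deadline.

    Assumes (hypothetically) that the paths fail independently, so the
    joint miss probability is the product of the per-path probabilities.
    """
    prob = 1.0
    for p in per_path_miss:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each miss probability must lie in [0, 1]")
        prob *= p
    return prob


if __name__ == "__main__":
    # Illustrative value only: each path misses 5% of deadlines.
    for n_paths in (1, 2, 3):
        print(n_paths, miss_probability([0.05] * n_paths))
```

With n identical independent paths that each miss with probability p, the joint miss probability is p to the power n, which is the exponential reduction the abstract refers to. Correlated failures (for example, two interfaces sharing one base station) would weaken this bound.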
MultiNash-PF: A Particle Filtering Approach for Computing Multiple Local Generalized Nash Equilibria in Trajectory Games
Modern-world robotics involves complex environments where multiple autonomous agents must interact with each other and other humans. This necessitates advanced interactive multi-agent motion planning techniques. Generalized Nash equilibrium (GNE), a solution concept in constrained game theory, provides a mathematical model to predict the outcome of interactive motion planning, where each agent needs to account for other agents in the environment. However, in practice, multiple local GNEs may exist. Finding a single GNE itself is complex as it requires solving coupled constrained optimal control problems. Furthermore, finding all such local GNEs requires exploring the solution space of GNEs, which is a challenging task. This work proposes the MultiNash-PF framework to efficiently compute multiple local GNEs in constrained trajectory games. Potential games are a class of games for which a local GNE of a trajectory game can be found by solving a single constrained optimal control problem. We propose MultiNash-PF that integrates the potential game approach with implicit particle filtering, a sample-efficient method for non-convex trajectory optimization. We first formulate the underlying game as a constrained potential game and then utilize the implicit particle filtering to identify the coarse estimates of multiple local minimizers of the game's potential function. MultiNash-PF then refines these estimates with optimization solvers, obtaining different local GNEs. We show through numerical simulations that MultiNash-PF reduces computation time by up to 50% compared to a baseline approach.
Understanding and Imitating Human-Robot Motion with Restricted Visual Fields
When working around humans, it is important to model their perception limitations in order to predict their behavior more accurately. In this work, we consider agents with a limited field of view, viewing range, and ability to miss objects within viewing range (e.g., transparency). By considering the observation model independently from the motion policy, we can better predict the agent's behavior by considering these limitations and approximating them. We perform a user study where human operators navigate a cluttered scene while scanning the region for obstacles with a limited field of view and range. Using imitation learning, we show that a robot can adopt a human's strategy for observing an environment with limitations on observation and navigate with minimal collision with dynamic and static obstacles. We also show that this learned model helps it successfully navigate a physical hardware vehicle in real time.
Toward General Object-level Mapping from Sparse Views with 3D Diffusion Priors
Object-level mapping builds a 3D map of objects in a scene with detailed shapes and poses from multi-view sensor observations. Conventional methods struggle to build complete shapes and estimate accurate poses due to partial occlusions and sensor noise. They require dense observations to cover all objects, which is challenging to achieve in robotics trajectories. Recent work introduces generative shape priors for object-level mapping from sparse views, but is limited to single-category objects. In this work, we propose a General Object-level Mapping system, GOM, which leverages a 3D diffusion model as shape prior with multi-category support and outputs Neural Radiance Fields (NeRFs) for both texture and geometry for all objects in a scene. GOM includes an effective formulation to guide a pre-trained diffusion model with extra nonlinear constraints from sensor measurements without finetuning. We also develop a probabilistic optimization formulation to fuse multi-view sensor observations and diffusion priors for joint 3D object pose and shape estimation. Our GOM system demonstrates superior multi-category mapping performance from sparse views, and achieves more accurate mapping results compared to state-of-the-art methods on the real-world benchmarks. We will release our code: https://github.com/TRAILab/GeneralObjectMapping.
comment: Accepted by CoRL 2024
Propeller damage detection, classification and estimation in multirotor vehicles
This manuscript details an architecture and training methodology for a data-driven framework aimed at detecting, identifying, and quantifying damage in the propeller blades of multirotor Unmanned Aerial Vehicles. Real flight data was collected by substituting one propeller with a damaged counterpart, covering three distinct damage types of varying severity. This data was then used to train a composite model, comprising both classifiers and neural networks, capable of accurately identifying the type of failure, estimating damage severity, and pinpointing the affected rotor. The data employed for this analysis was exclusively sourced from inertial measurements and control command inputs, ensuring adaptability across diverse multirotor vehicle platforms.
comment: 24 pages, 18 figures, 9 tables
2FAST-2LAMAA: A Lidar-Inertial Localisation and Mapping Framework for Non-Static Environments
This document presents a framework for lidar-inertial localisation and mapping named 2Fast-2Lamaa. The method revolves around two main steps which are the inertial-aided undistortion of the lidar data and the scan-to-map registration using a distance-field representation of the environment. The initialisation-free undistortion uses inertial data to constrain the continuous trajectory of the sensor during the lidar scan. The eleven DoFs that fully characterise the trajectory are estimated by minimising lidar point-to-line and point-to-plane distances in a non-linear least-square formulation. The registration uses a map that provides a distance field for the environment based on Gaussian Process regression. The pose of an undistorted lidar scan is optimised to minimise the distance field queries of its points with respect to the map. After registration, the new geometric information is efficiently integrated into the map. The soundness of 2Fast-2Lamaa is demonstrated over several datasets (qualitative evaluation only). The real-time implementation is made publicly available at https://github.com/UTS-RI/2fast2lamaa.
SharpSLAM: 3D Object-Oriented Visual SLAM with Deblurring for Agile Drones
This paper focuses on an algorithm for improving the quality of 3D reconstruction and segmentation in DSP-SLAM by enhancing the RGB image quality. The SharpSLAM algorithm we developed aims to decrease the influence of high dynamic motion on visual object-oriented SLAM through image deblurring, improving all aspects of object-oriented SLAM, including localization, mapping, and object reconstruction. The experimental results revealed a noticeable improvement in object detection quality, with the F-score increasing from 82.9% to 86.2% due to the higher number of features and corresponding map points. The RMSE of the signed distance function also decreased from 17.2 to 15.4 cm. Furthermore, our solution enhanced object positioning, with an increase in the IoU from 74.5% to 75.7%. The SharpSLAM algorithm has the potential to substantially improve the quality of 3D reconstruction and segmentation in DSP-SLAM and to impact a wide range of fields, including robotics, autonomous vehicles, and augmented reality.
comment: Manuscript accepted to IEEE Telepresence 2024
CAnDOIT: Causal Discovery with Observational and Interventional Data from Time-Series
The study of cause-and-effect is of the utmost importance in many branches of science, but also for many practical applications of intelligent systems. In particular, identifying causal relationships in situations that include hidden factors is a major challenge for methods that rely solely on observational data for building causal models. This paper proposes CAnDOIT, a causal discovery method to reconstruct causal models using both observational and interventional time-series data. The use of interventional data in the causal analysis is crucial for real-world applications, such as robotics, where the scenario is highly complex and observational data alone are often insufficient to uncover the correct causal structure. Validation of the method is performed initially on randomly generated synthetic models and subsequently on a well-known benchmark for causal structure learning in a robotic manipulation environment. The experiments demonstrate that the approach can effectively handle data from interventions and exploit them to enhance the accuracy of the causal analysis. A Python implementation of CAnDOIT has also been developed and is publicly available on GitHub: https://github.com/lcastri/causalflow.
comment: Published in Advanced Intelligent Systems
LGMCTS: Language-Guided Monte-Carlo Tree Search for Executable Semantic Object Rearrangement
We introduce a novel approach to the executable semantic object rearrangement problem. In this challenge, a robot seeks to create an actionable plan that rearranges objects within a scene according to a pattern dictated by a natural language description. Unlike existing methods such as StructFormer and StructDiffusion, which tackle the issue in two steps by first generating poses and then leveraging a task planner for action plan formulation, our method concurrently addresses pose generation and action planning. We achieve this integration using a Language-Guided Monte-Carlo Tree Search (LGMCTS). Quantitative evaluations are provided on two simulation datasets, and complemented by qualitative tests with a real robot.
comment: Our code and supplementary materials are accessible at https://github.com/changhaonan/LG-MCTS
Towards Embedding Dynamic Personas in Interactive Robots: Masquerading Animated Social Kinematics (MASK)
This paper presents the design and development of an innovative interactive robotic system to enhance audience engagement using character-like personas. Built upon the foundations of persona-driven dialog agents, this work extends the agent's application to the physical realm, employing robots to provide a more captivating and interactive experience. The proposed system, named the Masquerading Animated Social Kinematic (MASK), leverages an anthropomorphic robot which interacts with guests using non-verbal interactions, including facial expressions and gestures. A behavior generation system based upon a finite-state machine structure effectively conditions robotic behavior to convey distinct personas. The MASK framework integrates a perception engine, a behavior selection engine, and a comprehensive action library to enable real-time, dynamic interactions with minimal human intervention in behavior design. Throughout the user subject studies, we examined whether the users could recognize the intended character in both personality- and film-character-based persona conditions. We conclude by discussing the role of personas in interactive agents and the factors to consider for creating an engaging user experience.
comment: Accepted at Robotics and Automation Letters
SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation
Automating garment manipulation poses a significant challenge for assistive robotics due to the diverse and deformable nature of garments. Traditional approaches typically require separate models for each garment type, which limits scalability and adaptability. In contrast, this paper presents a unified approach using vision-language models (VLMs) to improve keypoint prediction across various garment categories. By interpreting both visual and semantic information, our model enables robots to manage different garment states with a single model. We created a large-scale synthetic dataset using advanced simulation techniques, allowing scalable training without extensive real-world data. Experimental results indicate that the VLM-based method significantly enhances keypoint detection accuracy and task success rates, providing a more flexible and general solution for robotic garment manipulation. In addition, this research underscores the potential of VLMs to unify various garment manipulation tasks within a single framework, paving the way for broader applications in home automation and assistive robotics in the future.
Entropy-Based Uncertainty Modeling for Trajectory Prediction in Autonomous Driving
In autonomous driving, accurate motion prediction is essential for safe and efficient motion planning. To ensure safety, planners must rely on reliable uncertainty information about the predicted future behavior of surrounding agents, yet this aspect has received limited attention. This paper addresses the so-far neglected problem of uncertainty modeling in trajectory prediction. We adopt a holistic approach that focuses on uncertainty quantification, decomposition, and the influence of model composition. Our method is based on a theoretically grounded information-theoretic approach to measure uncertainty, allowing us to decompose total uncertainty into its aleatoric and epistemic components. We conduct extensive experiments on the nuScenes dataset to assess how different model architectures and configurations affect uncertainty quantification and model robustness.
comment: 10 pages, 5 figures, submitted to International Conference on Learning Representations (2025)
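The decomposition of total uncertainty into aleatoric and epistemic parts can be made concrete with a widely used entropy-based identity. The sketch below is an assumption, not the paper's estimator: it supposes an ensemble whose members each output a categorical distribution over a fixed set of trajectory modes, and all names are hypothetical. Total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the average entropy of the members, and epistemic uncertainty is their difference (the mutual information).

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Split total predictive entropy into aleatoric and epistemic parts.

    member_probs: list of categorical distributions, one per ensemble
    member, all over the same (hypothetical) set of trajectory modes.
    Returns (total, aleatoric, epistemic).
    """
    n_members = len(member_probs)
    n_modes = len(member_probs[0])
    # Average the members' distributions to get the ensemble prediction.
    mean_probs = [
        sum(member[k] for member in member_probs) / n_members
        for k in range(n_modes)
    ]
    total = entropy(mean_probs)
    aleatoric = sum(entropy(member) for member in member_probs) / n_members
    return total, aleatoric, total - aleatoric


if __name__ == "__main__":
    # Two members that confidently disagree: uncertainty is mostly epistemic.
    print(decompose_uncertainty([[0.9, 0.1], [0.1, 0.9]]))
```

When all members agree, the epistemic term vanishes and whatever uncertainty remains is aleatoric. When members are individually confident but disagree, the epistemic term dominates.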
VILENS: Visual, Inertial, Lidar, and Leg Odometry for All-Terrain Legged Robots
We present visual inertial lidar legged navigation system (VILENS), an odometry system for legged robots based on factor graphs. The key novelty is the tight fusion of four different sensor modalities to achieve reliable operation when the individual sensors would otherwise produce degenerate estimation. To minimize leg odometry drift, we extend the robot's state with a linear velocity bias term, which is estimated online. This bias is observable because of the tight fusion of this preintegrated velocity factor with vision, lidar, and inertial measurement unit (IMU) factors. Extensive experimental validation on different ANYmal quadruped robots is presented, for a total duration of 2 h and 1.8 km traveled. The experiments involved dynamic locomotion over loose rocks, slopes, and mud, which caused challenges such as slippage and terrain deformation. Perceptual challenges included dark and dusty underground caverns, and open and feature-deprived areas. We show an average improvement of 62% translational and 51% rotational errors compared to a state-of-the-art loosely coupled approach. To demonstrate its robustness, VILENS was also integrated with a perceptive controller and a local path planner.
comment: Video: https://youtu.be/NG4pkjJKhus
A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches
Task and Motion Planning (TAMP) integrates high-level task planning and low-level motion planning to equip robots with the autonomy to effectively reason over long-horizon, dynamic tasks. Optimization-based TAMP focuses on hybrid optimization approaches that define goal conditions via objective functions and are capable of handling open-ended goals, robotic dynamics, and physical interaction between the robot and the environment. Therefore, optimization-based TAMP is particularly suited to solve highly complex, contact-rich locomotion and manipulation problems. This survey provides a comprehensive review of optimization-based TAMP, covering (i) planning domain representations, including action description languages and temporal logic, (ii) individual solution strategies for components of TAMP, including AI planning and trajectory optimization (TO), and (iii) the dynamic interplay between logic-based task planning and model-based TO. A particular focus of this survey is to highlight the algorithm structures to efficiently solve TAMP, especially hierarchical and distributed approaches. Additionally, the survey emphasizes the synergy between the classical methods and contemporary learning-based innovations such as large language models. Furthermore, future research directions for TAMP are discussed, highlighting both algorithmic and application-specific challenges.
comment: 26 pages, 13 figures, published at IEEE/ASME Transactions on Mechatronics
QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing
Multi-task reinforcement learning (MTRL) aims to learn several tasks simultaneously for better sample efficiency than learning them separately. Traditional methods achieve this by sharing parameters or relabeled data between tasks. In this work, we introduce a new framework for sharing behavioral policies across tasks, which can be used in addition to existing MTRL methods. The key idea is to improve each task's off-policy data collection by employing behaviors from other task policies. Selectively sharing helpful behaviors acquired in one task to collect training data for another task can lead to higher-quality trajectories, leading to more sample-efficient MTRL. Thus, we introduce a simple and principled framework called Q-switch mixture of policies (QMP) that selectively shares behavior between different task policies by using the task's Q-function to evaluate and select useful shareable behaviors. We theoretically analyze how QMP improves the sample efficiency of the underlying RL algorithm. Our experiments show that QMP's behavioral policy sharing provides complementary gains over many popular MTRL algorithms and outperforms alternative ways to share behaviors in various manipulation, locomotion, and navigation environments. Videos are available at https://qmp-mtrl.github.io.
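The selection step described above, in which a task's Q-function evaluates behaviors proposed by several task policies, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the deterministic toy policies, and the toy Q-function are all hypothetical, and the actual method may sample, weight, or schedule candidates differently.

```python
from typing import Callable, Sequence

State = Sequence[float]
Action = float


def q_switch_action(
    state: State,
    policies: Sequence[Callable[[State], Action]],
    q_function: Callable[[State, Action], float],
) -> Action:
    """Pick the candidate action that the current task's Q-function rates highest.

    Each policy (one per task) proposes one action for the given state.
    """
    candidates = [policy(state) for policy in policies]
    return max(candidates, key=lambda action: q_function(state, action))


if __name__ == "__main__":
    # Toy setup: three task policies propose fixed actions, and the current
    # task's Q-function prefers actions close to the first state component.
    toy_policies = [lambda s: -1.0, lambda s: 0.5, lambda s: 2.0]
    toy_q = lambda s, a: -(a - s[0]) ** 2
    print(q_switch_action([0.4], toy_policies, toy_q))
```

Because the switch only changes which behavior collects data, it can sit alongside other multi-task mechanisms such as parameter sharing, which matches the abstract's claim that the framework is complementary to existing MTRL methods.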
Bayesian Optimization for Sample-Efficient Policy Improvement in Robotic Manipulation IROS
Sample efficient learning of manipulation skills poses a major challenge in robotics. While recent approaches demonstrate impressive advances in the type of task that can be addressed and the sensing modalities that can be incorporated, they still require large amounts of training data. Especially with regard to learning actions on robots in the real world, this poses a major problem due to the high costs associated with both demonstrations and real-world robot interactions. To address this challenge, we introduce BOpt-GMM, a hybrid approach that combines imitation learning with own experience collection. We first learn a skill model as a dynamical system encoded in a Gaussian Mixture Model from a few demonstrations. We then improve this model with Bayesian optimization building on a small number of autonomous skill executions in a sparse reward setting. We demonstrate the sample efficiency of our approach on multiple complex manipulation skills in both simulations and real-world experiments. Furthermore, we make the code and pre-trained models publicly available at http://bopt-gmm.cs.uni-freiburg.de.
comment: 8 pages, 5 figures, 2 tables, Accepted at the 2024 IEEE International Conference on Intelligent Robots and Systems (IROS)
Auto-Multilift: Distributed Learning and Control for Cooperative Load Transportation With Quadrotors
Designing motion control and planning algorithms for multilift systems remains challenging due to the complexities of dynamics, collision avoidance, actuator limits, and scalability. Existing methods that use optimization and distributed techniques effectively address these constraints and scalability issues. However, they often require substantial manual tuning, leading to suboptimal performance. This paper proposes Auto-Multilift, a novel framework that automates the tuning of model predictive controllers (MPCs) for multilift systems. We model the MPC cost functions with deep neural networks (DNNs), enabling fast online adaptation to various scenarios. We develop a distributed policy gradient algorithm to train these DNNs efficiently in a closed-loop manner. Central to our algorithm is distributed sensitivity propagation, which is built on fully exploiting the unique dynamic couplings within the multilift system. It parallelizes gradient computation across quadrotors and focuses on actual system state sensitivities relative to key MPC parameters. Extensive simulations demonstrate favorable scalability to a large number of quadrotors. Our method outperforms a state-of-the-art open-loop MPC tuning approach by effectively learning adaptive MPCs from trajectory tracking errors. It also excels in learning an adaptive reference for reconfiguring the system when traversing multiple narrow slots.
Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training NeurIPS 2024
Learning a generalist embodied agent capable of completing multiple tasks poses challenges, primarily stemming from the scarcity of action-labeled robotic datasets. In contrast, a vast amount of human videos exist, capturing intricate tasks and interactions with the physical world. Promising prospects arise for utilizing actionless human videos for pre-training and transferring the knowledge to facilitate robot policy learning through limited robot demonstrations. However, it remains a challenge due to the domain gap between humans and robots. Moreover, it is difficult to extract useful information representing the dynamic world from human videos, because of their noisy and multimodal data structure. In this paper, we introduce a novel framework to tackle these challenges, which leverages a unified discrete diffusion to combine generative pre-training on human videos and policy fine-tuning on a small number of action-labeled robot videos. We start by compressing both human and robot videos into unified video tokens. In the pre-training stage, we employ a discrete diffusion model with a mask-and-replace diffusion strategy to predict future video tokens in the latent space. In the fine-tuning stage, we harness the imagined future videos to guide low-level action learning with a limited set of robot data. Experiments demonstrate that our method generates high-fidelity future videos for planning and enhances the fine-tuned policies compared to previous state-of-the-art approaches with superior performance. Our project website is available at https://video-diff.github.io/.
comment: Accepted by NeurIPS 2024. 24 pages
TD-NeRF: Novel Truncated Depth Prior for Joint Camera Pose and Neural Radiance Field Optimization
The reliance on accurate camera poses is a significant barrier to the widespread deployment of Neural Radiance Fields (NeRF) models for 3D reconstruction and SLAM tasks. The existing method introduces monocular depth priors to jointly optimize the camera poses and NeRF, which fails to fully exploit the depth priors and neglects the impact of their inherent noise. In this paper, we propose Truncated Depth NeRF (TD-NeRF), a novel approach that enables training NeRF from unknown camera poses - by jointly optimizing learnable parameters of the radiance field and camera poses. Our approach explicitly utilizes monocular depth priors through three key advancements: 1) we propose a novel depth-based ray sampling strategy based on the truncated normal distribution, which improves the convergence speed and accuracy of pose estimation; 2) to circumvent local minima and refine depth geometry, we introduce a coarse-to-fine training strategy that progressively improves the depth precision; 3) we propose a more robust inter-frame point constraint that enhances robustness against depth noise during training. The experimental results on three datasets demonstrate that TD-NeRF achieves superior performance in the joint optimization of camera pose and NeRF, surpassing prior works, and generates more accurate depth geometry. The implementation of our method has been released at https://github.com/nubot-nudt/TD-NeRF.
Safe Multi-Agent Reinforcement Learning for Behavior-Based Cooperative Navigation
In this paper, we address the problem of behavior-based cooperative navigation of mobile robots using safe multi-agent reinforcement learning (MARL). Our work is the first to focus on cooperative navigation without individual reference targets for the robots, using a single target for the formation's centroid. This eliminates the complexities involved in having several path planners to control a team of robots. To ensure safety, our MARL framework uses model predictive control (MPC) to prevent actions that could lead to collisions during training and execution. We demonstrate the effectiveness of our method in simulation and on real robots, achieving safe behavior-based cooperative navigation without using individual reference targets, with zero collisions, and faster target reaching compared to baselines. Finally, we study the impact of MPC safety filters on the learning process, revealing that we achieve faster convergence during training and we show that our approach can be safely deployed on real robots, even during early stages of the training.
Centroidal State Estimation based on the Koopman Embedding for Dynamic Legged Locomotion IROS 2024
In this paper, we introduce a novel approach to centroidal state estimation, which plays a crucial role in predictive model-based control strategies for dynamic legged locomotion. Our approach uses the Koopman operator theory to transform the robot's complex nonlinear dynamics into a linear system, by employing dynamic mode decomposition and deep learning for model construction. We evaluate both models on their linearization accuracy and capability to capture both fast and slow dynamic system responses. We then select the most suitable model for estimation purposes, and integrate it within a moving horizon estimator. This estimator is formulated as a convex quadratic program to facilitate robust, real-time centroidal state estimation. Through extensive simulation experiments on a quadruped robot executing various dynamic gaits, our data-driven framework outperforms the conventional Extended Kalman Filtering technique based on nonlinear dynamics. Our estimator addresses challenges posed by force/torque measurement noise in highly dynamic motions and accurately recovers the centroidal states, demonstrating the adaptability and effectiveness of the Koopman-based linear representation for complex locomotive behaviors. Importantly, our model based on dynamic mode decomposition, trained with two locomotion patterns (trot and jump), successfully estimates the centroidal states for a different motion (bound) without retraining.
comment: Accepted in IROS 2024
A Framework for Guided Motion Planning
Randomized sampling-based algorithms are widely used in robot motion planning due to the problem's intractability, and are experimentally effective on a wide range of problem instances. Most variants bias their sampling using various heuristics related to the known underlying structure of the search space. In this work, we formalize the intuitive notion of guided search by defining the concept of a guiding space. This new language encapsulates many seemingly distinct prior methods under the same framework, and allows us to reason about guidance, a previously obscured core contribution of different algorithms. We suggest an information theoretic method to evaluate guidance, which experimentally matches intuition when tested on known algorithms in a variety of environments. The language and evaluation of guidance suggest improvements to existing methods, and allow for simple hybrid algorithms that combine guidance from multiple sources.
SplatSim: Zero-Shot Sim2Real Transfer of RGB Manipulation Policies Using Gaussian Splatting
Sim2Real transfer, particularly for manipulation policies relying on RGB images, remains a critical challenge in robotics due to the significant domain shift between synthetic and real-world visual data. In this paper, we propose SplatSim, a novel framework that leverages Gaussian Splatting as the primary rendering primitive to reduce the Sim2Real gap for RGB-based manipulation policies. By replacing traditional mesh representations with Gaussian Splats in simulators, SplatSim produces highly photorealistic synthetic data while maintaining the scalability and cost-efficiency of simulation. We demonstrate the effectiveness of our framework by training manipulation policies within SplatSim and deploying them in the real world in a zero-shot manner, achieving an average success rate of 86.25%, compared to 97.5% for policies trained on real-world data. Videos can be found on our project page: https://splatsim.github.io
ViewActive: Active viewpoint optimization from a single image
When observing objects, humans benefit from their spatial visualization and mental rotation ability to envision potential optimal viewpoints based on the current observation. This capability is crucial for enabling robots to achieve efficient and robust scene perception during operation, as optimal viewpoints provide essential and informative features for accurately representing scenes in 2D images, thereby enhancing downstream tasks. To endow robots with this human-like active viewpoint optimization capability, we propose ViewActive, a modernized machine learning approach drawing inspiration from the aspect graph, which provides viewpoint optimization guidance based solely on the current 2D image input. Specifically, we introduce the 3D Viewpoint Quality Field (VQF), a compact and consistent representation of viewpoint quality distribution similar to an aspect graph, composed of three general-purpose viewpoint quality metrics: self-occlusion ratio, occupancy-aware surface normal entropy, and visual entropy. We utilize pre-trained image encoders to extract robust visual and semantic features, which are then decoded into the 3D VQF, allowing our model to generalize effectively across diverse objects, including unseen categories. The lightweight ViewActive network (72 FPS on a single GPU) significantly enhances the performance of state-of-the-art object recognition pipelines and can be integrated into real-time motion planning for robotic applications. Our code and dataset are available here: https://github.com/jiayi-wu-umd/ViewActive.
Real-World Cooking Robot System from Recipes Based on Food State Recognition Using Foundation Models and PDDL
Although there is a growing demand for cooking behaviours as one of the expected tasks for robots, a series of cooking behaviours based on new recipe descriptions by robots in the real world has not yet been realised. In this study, we propose a robot system that integrates real-world executable cooking behaviour planning, using a Large Language Model (LLM) and classical planning over PDDL descriptions, with food ingredient state recognition learned from a small amount of data using a Vision-Language Model (VLM). We succeeded in experiments in which PR2, a dual-armed wheeled robot, performed cooking from arranged new recipes in a real-world environment, and confirmed the effectiveness of the proposed system.
comment: Accepted at Advanced Robotics, website - https://kanazawanaoaki.github.io/cook-from-recipe-pddl/
Adaptive Step Duration for Precise Foot Placement: Achieving Robust Bipedal Locomotion on Terrains with Restricted Footholds ICRA 2025
Traditional one-step preview planning algorithms for bipedal locomotion struggle to generate viable gaits when walking across terrains with restricted footholds, such as stepping stones. To overcome such limitations, this paper introduces a novel multi-step preview foot placement planning algorithm based on the step-to-step discrete evolution of the Divergent Component of Motion (DCM) of walking robots. Our proposed approach adaptively changes the step duration and the swing foot trajectory for optimal foot placement under constraints, thereby enhancing the long-term stability of the robot and significantly improving its ability to navigate environments with tight constraints on viable footholds. We demonstrate its effectiveness through various simulation scenarios with complex stepping-stone configurations and external perturbations. These tests underscore its improved performance for navigating foothold-restricted terrains, even with external disturbances.
comment: 7 pages, 7 figures, submitted to ICRA 2025, for associated simulation video, see https://youtu.be/DjH69m1kbnM
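For context, the step-to-step DCM evolution mentioned in the abstract can be written down under the standard linear inverted pendulum assumptions (constant centre-of-mass height $z_0$, stance foot position $p_k$ held fixed during step $k$ of duration $T_k$). The symbols below are illustrative and are not taken from the paper:

```latex
\xi_{k+1} = p_k + \left(\xi_k - p_k\right) e^{\omega T_k},
\qquad \omega = \sqrt{g / z_0}
```

Because the offset $\xi_k - p_k$ grows exponentially with $T_k$, changing the step duration moves the DCM at the end of the step, which is one way to see why duration can serve as a decision variable alongside foot placement.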
A Complete Algorithm for a Moving Target Traveling Salesman Problem with Obstacles
The moving target traveling salesman problem with obstacles (MT-TSP-O) is a generalization of the traveling salesman problem (TSP) where, as its name suggests, the targets are moving. A solution to the MT-TSP-O is a trajectory that visits each moving target within one of its time windows while avoiding stationary obstacles. We assume each target moves at a constant velocity during each of its time windows. The agent has a speed limit, and this speed limit is no smaller than any target's speed. This paper presents the first complete algorithm for finding feasible solutions to the MT-TSP-O. Our algorithm builds a tree where the nodes are agent trajectories intercepting a unique sequence of targets within a unique sequence of time windows. We generate each of a parent node's children by extending the parent's trajectory to intercept one additional target, each child corresponding to a different choice of target and time window. This extension consists of planning a trajectory from the parent trajectory's final point in space-time to a moving target. To solve this point-to-moving-target subproblem, we define a novel generalization of a visibility graph called a moving target visibility graph (MTVG). Our overall algorithm is called MTVG-TSP. To validate MTVG-TSP, we test it on 570 instances with up to 30 targets. We implement a baseline method that samples trajectories of targets into points, based on prior work on special cases of the MT-TSP-O. MTVG-TSP finds feasible solutions in all cases where the baseline does, and when the sum of the targets' time window lengths enters a critical range, MTVG-TSP finds a feasible solution with up to 38 times less computation time.
comment: Accepted to WAFR 2024
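To make the point-to-moving-target subproblem concrete, here is a minimal sketch of the obstacle-free special case only: the earliest time at which an agent with a speed limit can meet a target moving at constant velocity in the plane. It is not the MTVG construction from the paper; obstacles and time windows are ignored, and the function name `earliest_intercept_time` and its parameters are hypothetical.

```python
import math


def earliest_intercept_time(agent_pos, target_pos, target_vel, agent_speed):
    """Earliest t >= 0 at which the agent can reach the moving target.

    Hypothetical helper: obstacle-free, no time windows. The target is at
    target_pos at time 0 and moves with constant velocity target_vel.
    Returns None if no interception time exists.
    """
    dx = target_pos[0] - agent_pos[0]
    dy = target_pos[1] - agent_pos[1]
    ux, uy = target_vel

    # Interception condition |d + u t| = v t, squared into a t^2 + b t + c = 0.
    a = ux * ux + uy * uy - agent_speed * agent_speed
    b = 2.0 * (dx * ux + dy * uy)
    c = dx * dx + dy * dy

    if c == 0.0:
        return 0.0  # agent already at the target
    if abs(a) < 1e-12:
        # Equal speeds: the equation is linear, solvable only if the target approaches.
        return -c / b if b < 0.0 else None

    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    candidates = [(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)]
    valid = [t for t in candidates if t >= 0.0]
    return min(valid) if valid else None
```

When the agent is strictly faster than the target, the quadratic has exactly one nonnegative root, so an interception time always exists in this simplified setting; obstacles and time windows are what make the full problem hard.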
Initialization of Monocular Visual Navigation for Autonomous Agents Using Modified Structure from Small Motion
We propose a standalone monocular visual Simultaneous Localization and Mapping (vSLAM) initialization pipeline for autonomous space robots. Our method, a state-of-the-art factor graph optimization pipeline, extends Structure from Small Motion (SfSM) to robustly initialize a monocular agent in spacecraft inspection trajectories, addressing visual estimation challenges such as weak-perspective projection and center-pointing motion, which exacerbates the bas-relief ambiguity, dominant planar geometry, which causes motion estimation degeneracies in classical Structure from Motion, and dynamic illumination conditions, which reduce the survivability of visual information. We validate our approach on realistic, simulated satellite inspection image sequences with a tumbling spacecraft and demonstrate the method's effectiveness over existing monocular initialization procedures.
comment: 6 pages, 1 page for references, 6 figures, 1 table, IEEEtran format. This work has been submitted to ACC for possible publication as an invited session paper. Copyright may be transferred without notice, after which this version may no longer be accessible
Decision-theoretic MPC: Motion Planning with Weighted Maneuver Preferences Under Uncertainty
Continuous optimization based motion planners require specifying a maneuver class before calculating the optimal trajectory for that class. In traffic, the intentions of other participants are often unclear, presenting multiple maneuver options for the autonomous vehicle. This uncertainty can make it difficult for the vehicle to decide on the best option. This work introduces a continuous optimization based motion planner that combines multiple maneuvers by weighting the trajectory of each maneuver according to the vehicle's preferences. In this way, the planner eliminates the need for committing to a single maneuver. To maintain safety despite this increased complexity, the planner considers uncertainties ranging from perception to prediction, while ensuring the feasibility of a chance-constrained emergency maneuver. Evaluations in both driving experiments and simulation studies show enhanced interaction capabilities and comfort levels compared to conventional planners, which consider only a single maneuver.
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
Reach-Avoid-Stay (RAS) optimal control enables systems such as robots and air taxis to reach their targets, avoid obstacles, and stay near the target. However, current methods for RAS often struggle with handling complex, dynamic environments and scaling to high-dimensional systems. While reinforcement learning (RL)-based reachability analysis addresses these challenges, it has yet to tackle the RAS problem. In this paper, we propose a two-step deep deterministic policy gradient (DDPG) method to extend the RL-based reachability method to solve RAS problems. First, we train a function that characterizes the maximal robust control invariant set within the target set, where the system can safely stay, along with its corresponding policy. Second, we train a function that defines the set of states capable of safely reaching the robust control invariant set, along with its corresponding policy. We prove that this method results in the maximal robust RAS set in the absence of training errors and demonstrate that it enables RAS in complex environments, scales to high-dimensional systems, and achieves higher success rates for the RAS task compared to previous methods, validated through one simulation and two high-dimensional experiments.
Fast Explicit-Input Assistance for Teleoperation in Clutter
The performance of prediction-based assistance for robot teleoperation degrades in unseen or goal-rich environments due to incorrect or quickly-changing intent inferences. Poor predictions can confuse operators or cause them to change their control input to implicitly signal their goal. We present a new assistance interface for robotic manipulation where an operator can explicitly communicate a manipulation goal by pointing the end-effector. The pointing target specifies a region for local pose generation and optimization, providing interactive control over grasp and placement pose candidates. We compare the explicit pointing interface to an implicit inference-based assistance scheme in a within-subjects user study (N=20) where participants teleoperate a simulated robot to complete a multi-step singulation and stacking task in cluttered environments. We find that operators prefer the explicit interface, experience fewer pick failures and report lower cognitive workload. Our code is available at: https://github.com/NVlabs/fast-explicit-teleop
Multiagent Systems
Cloud-Based Scheduling Mechanism for Scalable and Resource-Efficient Centralized Controllers
This paper proposes a novel approach to address the challenges of deploying complex robotic software in large-scale systems, i.e., Centralized Nonlinear Model Predictive Controllers (CNMPCs) for multi-agent systems. The proposed approach is based on a Kubernetes-based scheduling mechanism designed to monitor and optimize the operation of CNMPCs, while addressing the scalability limitation of centralized control schemes. By leveraging a cluster in a real-time cloud environment, the proposed mechanism effectively offloads the computational burden of CNMPCs. Through experiments, we have demonstrated the effectiveness and performance of our system, especially in scenarios where the number of robots is subject to change. Our work contributes to the advancement of cloud-based control strategies and lays the foundation for enhanced performance in cloud-controlled robotic systems.
comment: 7 pages, 6 figures, IECON 2024
Adversarial Multi-Agent Evaluation of Large Language Models through Iterative Debates
This paper explores optimal architectures for evaluating the outputs of large language models (LLMs) using LLMs themselves. We propose a novel framework that interprets LLMs as advocates within an ensemble of interacting agents, allowing them to defend their answers and reach conclusions through a judge and jury system. This approach offers a more dynamic and comprehensive evaluation process compared to traditional human-based assessments or automated metrics. We discuss the motivation behind this framework, its key components, and comparative advantages. We also present a probabilistic model to evaluate the error reduction achieved by iterative advocate systems. Finally, we outline experiments to validate the effectiveness of multi-advocate architectures and discuss future research directions.
Online Dynamic Pricing for Electric Vehicle Charging Stations with Reservations
The transition to electric vehicles (EVs), coupled with the rise of renewable energy sources, will significantly impact the electric grid. Unlike conventional fuel sources, electricity for EVs is constrained by grid capacity, price fluctuations, and long EV charging times, requiring new pricing solutions to manage demand and supply. This paper proposes a model for online dynamic pricing of reserved EV charging services, including reservation, parking, and charging as a bundled service priced as a whole. Our approach focuses on the individual charging station operator, employing a stochastic demand model and online dynamic pricing based on expected demand. The proposed model uses a Markov Decision Process (MDP) formulation to optimize sequential pricing decisions for charging session requests. A key contribution is the novel definition and quantification of discretization error introduced by the discretization of the Poisson process for use in the MDP. The model's viability is demonstrated with a heuristic solution method based on Monte-Carlo tree search, offering a viable path for real-world application.
comment: 45 pages, 11 figures, prepared for submission to IEEE Transactions on Intelligent Transportation Systems (T-ITS)
Principal-Agent Reinforcement Learning: Orchestrating AI Agents with Contracts
The increasing deployment of AI is shaping the future landscape of the internet, which is set to become an integrated ecosystem of AI agents. Orchestrating the interaction among AI agents necessitates decentralized, self-sustaining mechanisms that harmonize the tension between individual interests and social welfare. In this paper we tackle this challenge by synergizing reinforcement learning with principal-agent theory from economics. Taken separately, the former allows unrealistic freedom of intervention, while the latter struggles to scale in sequential settings. Combining them achieves the best of both worlds. We propose a framework where a principal guides an agent in a Markov Decision Process (MDP) using a series of contracts, which specify payments by the principal based on observable outcomes of the agent's actions. We present and analyze a meta-algorithm that iteratively optimizes the policies of the principal and agent, showing its equivalence to a contraction operator on the principal's Q-function, and its convergence to subgame-perfect equilibrium. We then scale our algorithm with deep Q-learning and analyze its convergence in the presence of approximation error, both theoretically and through experiments with randomly generated binary game-trees. Extending our framework to multiple agents, we apply our methodology to the combinatorial Coin Game. Addressing this multi-agent sequential social dilemma is a promising first step toward scaling our approach to more complex, real-world instances.
Learning to Steer Markovian Agents under Model Uncertainty
Designing incentives for an adapting population is a ubiquitous problem in a wide array of economic applications and beyond. In this work, we study how to design additional rewards to steer multi-agent systems towards desired policies without prior knowledge of the agents' underlying learning dynamics. Motivated by the limitations of existing work, we consider a new and general category of learning dynamics called Markovian agents. We introduce a model-based non-episodic Reinforcement Learning (RL) formulation for our steering problem. Importantly, we focus on learning a history-dependent steering strategy to handle the inherent model uncertainty about the agents' learning dynamics. We introduce a novel objective function to encode the desiderata of achieving a good steering outcome with reasonable cost. Theoretically, we identify conditions for the existence of steering strategies to guide agents to the desired policies. Complementing our theoretical contributions, we provide empirical algorithms to approximately solve our objective, which effectively tackle the challenge in learning history-dependent strategies. We demonstrate the efficacy of our algorithms through empirical evaluations.
comment: 34 Pages
An active learning method for solving competitive multi-agent decision-making and control problems
To identify a stationary action profile for a population of competitive agents, each executing private strategies, we introduce a novel active-learning scheme where a centralized external observer (or entity) can probe the agents' reactions and recursively update simple local parametric estimates of the action-reaction mappings. Under very general working assumptions (not even assuming that a stationary profile exists), sufficient conditions are established to assess the asymptotic properties of the proposed active learning methodology so that, if the parameters characterizing the action-reaction mappings converge, a stationary action profile is achieved. Such conditions hence act also as certificates for the existence of such a profile. Extensive numerical simulations involving typical competitive multi-agent control and decision-making problems illustrate the practical effectiveness of the proposed learning-based approach.
comment: Python package available at https://github.com/bemporad/gnep-learn
Multi-agent reinforcement learning using echo-state network and its application to pedestrian dynamics
In recent years, simulations of pedestrians using multi-agent reinforcement learning (MARL) have been studied. This study considered roads in a grid-world environment, and implemented pedestrians as MARL agents using an echo-state network and the least squares policy iteration method. Under this environment, the ability of these agents to learn to move forward by avoiding other agents was investigated. Specifically, we considered two types of tasks: the choice between a narrow direct route and a broad detour, and the bidirectional pedestrian flow in a corridor. The simulation results indicated that the learning was successful when the density of the agents was not too high.
comment: 25 pages, 19 figures
Nonparametric Strategy Test
We present a nonparametric statistical test for determining whether an agent is following a given mixed strategy in a repeated strategic-form game given samples of the agent's play. This involves two components: determining whether the agent's frequencies of pure strategies are sufficiently close to the target frequencies, and determining whether the pure strategies selected are independent between different game iterations. Our integrated test involves applying a chi-squared goodness-of-fit test for the first component and a generalized Wald-Wolfowitz runs test for the second component. The results from both tests are combined using a Bonferroni correction to produce a complete test for a given significance level $\alpha$. We applied the test to publicly available data of human rock-paper-scissors play. The data consists of 50 iterations of play for 500 human players. We test with a null hypothesis that the players are following a uniform random strategy independently at each game iteration. Using a significance level of $\alpha = 0.05$, we conclude that 305 (61%) of the subjects are following the target strategy.
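The two-component test described above can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: it is restricted to exactly three pure strategies (so the chi-squared statistic has 2 degrees of freedom and its tail probability is $e^{-x/2}$), it uses a normal approximation to a multi-category runs test conditioned on the observed counts, and all function names are hypothetical.

```python
import math
from collections import Counter


def chi2_gof_pvalue(counts, probs):
    """Goodness-of-fit p-value; assumes 3 categories with positive target probs."""
    if len(counts) != 3 or len(probs) != 3:
        raise ValueError("this sketch only handles three pure strategies")
    n = sum(counts)
    stat = sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))
    return math.exp(-stat / 2.0)  # chi-squared survival function for df = 2


def runs_test_pvalue(seq):
    """Two-sided p-value for the number of runs (normal approximation)."""
    n = len(seq)
    if n < 2:
        return 1.0
    counts = list(Counter(seq).values())
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    s2 = sum(c ** 2 for c in counts)
    s3 = sum(c ** 3 for c in counts)
    # Mean and variance of the run count given the category counts.
    mean = (n * (n + 1) - s2) / n
    var = (s2 * (s2 + n * (n + 1)) - 2 * n * s3 - n ** 3) / (n * n * (n - 1))
    if var <= 0:
        return 1.0  # degenerate case (e.g. a single category): uninformative
    z = (runs - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2.0))


def follows_strategy(seq, strategy, alpha=0.05):
    """True if neither component rejects at the Bonferroni level alpha / 2."""
    counts = [seq.count(action) for action in strategy]
    p_freq = chi2_gof_pvalue(counts, list(strategy.values()))
    p_indep = runs_test_pvalue(seq)
    return p_freq >= alpha / 2 and p_indep >= alpha / 2
```

For example, a strictly cycling sequence such as `"RPS" * 20` matches the uniform frequencies exactly but has far more runs than independent play would produce, so the runs component rejects it even though the frequency component does not.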
Systems and Control (CS)
Upgrading SPHERE with the second stage AO system SAXO+: frequency-based data-driven controller for adaptive optics
This study introduces a novel frequency-based data-driven controller for adaptive optics, using power spectral density for optimization while ensuring stability criteria. It addresses disturbance rejection, command amplitude constraints and system transfer functions through convex optimization to obtain an optimal control in an infinite impulse response filter form. Evaluated within the SAXO+ project, it demonstrates efficacy under diverse atmospheric conditions and operational scenarios. The proposed controller is tested in both standard and disentangled adaptive optics schemes, showcasing its adaptability and performance. Experimental validation is conducted using the COMPASS simulation tool, affirming the controller's promise for enhancing adaptive optics systems in real-world applications.
AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search
Quantum computers have the potential to outperform classical computers in important tasks such as optimization and number factoring. They are characterized by limited connectivity, which necessitates the routing of their computational bits, known as qubits, to specific locations during program execution to carry out quantum operations. Traditionally, the NP-hard optimization problem of minimizing the routing overhead has been addressed through sub-optimal rule-based routing techniques with inherent human biases embedded within the cost function design. This paper introduces a solution that integrates Monte Carlo Tree Search (MCTS) with Reinforcement Learning (RL). Our RL-based router, called AlphaRouter, outperforms the current state-of-the-art routing methods and generates quantum programs with up to $20\%$ less routing overhead, thus significantly enhancing the overall efficiency and feasibility of quantum computing.
comment: 11 pages, 11 figures, International Conference on Quantum Computing and Engineering - QCE24
Reinforcement Learning Control for Autonomous Hydraulic Material Handling Machines with Underactuated Tools IROS 2024
The precise and safe control of heavy material handling machines presents numerous challenges due to the hard-to-model hydraulically actuated joints and the need for collision-free trajectory planning with a free-swinging end-effector tool. In this work, we propose an RL-based controller that commands the cabin joint and the arm simultaneously. It is trained in a simulation combining data-driven modeling techniques with first-principles modeling. On the one hand, we employ a neural network model to capture the highly nonlinear dynamics of the upper carriage turn hydraulic motor, incorporating explicit pressure prediction to handle delays better. On the other hand, we model the arm as velocity-controllable and the free-swinging end-effector tool as a damped pendulum using first principles. This combined model enhances our simulation environment, enabling the training of RL controllers that can be directly transferred to the real machine. Designed to reach steady-state Cartesian targets, the RL controller learns to leverage the hydraulic dynamics to improve accuracy, maintain high speeds, and minimize end-effector tool oscillations. Our controller, tested on a mid-size prototype material handler, is more accurate than an inexperienced operator and causes fewer tool oscillations. It demonstrates competitive performance even compared to an experienced professional driver.
comment: Presented at IROS 2024, Abu Dhabi, as oral presentation
Function Gradient Approximation with Random Shallow ReLU Networks with Control Applications
Neural networks are widely used to approximate unknown functions in control. A common neural network architecture uses a single hidden layer (i.e. a shallow network), in which the input parameters are fixed in advance and only the output parameters are trained. The typical formal analysis asserts that if output parameters exist to approximate the unknown function with sufficient accuracy, then desired control performance can be achieved. A long-standing theoretical gap was that no conditions existed to guarantee that, for the fixed input parameters, required accuracy could be obtained by training the output parameters. Our recent work has partially closed this gap by demonstrating that if input parameters are chosen randomly, then for any sufficiently smooth function, with high probability there are output parameters resulting in $O((1/m)^{1/2})$ approximation errors, where $m$ is the number of neurons. However, some applications, notably continuous-time value function approximation, require that the network approximates both the unknown function and its gradient with sufficient accuracy. In this paper, we show that randomly generated input parameters and trained output parameters result in gradient errors of $O((\log(m)/m)^{1/2})$, and additionally, improve the constants from our prior work. We show how to apply the result to policy evaluation problems.
comment: Under Review for American Control Conference, 2025
Safe Learning-Based Optimization of Model Predictive Control: Application to Battery Fast-Charging
Model predictive control (MPC) is a powerful tool for controlling complex nonlinear systems under constraints, but often struggles with model uncertainties and the design of suitable cost functions. To address these challenges, we discuss an approach that integrates MPC with safe Bayesian optimization to optimize long-term closed-loop performance despite significant model-plant mismatches. By parameterizing the MPC stage cost function using a radial basis function network, we employ Bayesian optimization as a multi-episode learning strategy to tune the controller without relying on precise system models. This method mitigates conservativeness introduced by overly cautious soft constraints in the MPC cost function and provides probabilistic safety guarantees during learning, ensuring that safety-critical constraints are met with high probability. As a practical application, we apply our approach to fast charging of lithium-ion batteries, a challenging task due to the complicated battery dynamics and strict safety requirements, subject to the requirement to be implementable in real time. Simulation results demonstrate that, in the context of model-plant mismatch, our method reduces charging times compared to traditional MPC methods while maintaining safety. This work extends previous research by emphasizing closed-loop constraint satisfaction and offers a promising solution for enhancing performance in systems where model uncertainties and safety are critical concerns.
comment: 7 pages, 4 figures, submitted to ACC 2025
Goal-Conditioned Terminal Value Estimation for Real-time and Multi-task Model Predictive Control
While MPC enables nonlinear feedback control by solving an optimal control problem at each timestep, the computational burden tends to be significantly large, making it difficult to optimize a policy within the control period. To address this issue, one possible approach is to utilize terminal value learning to reduce computational costs. However, the learned value cannot be used for other tasks in situations where the task dynamically changes in the original MPC setup. In this study, we develop an MPC framework with goal-conditioned terminal value learning to achieve multitask policy optimization while reducing computational time. Furthermore, by using a hierarchical control structure that allows the upper-level trajectory planner to output appropriate goal-conditioned trajectories, we demonstrate that a robot model is able to generate diverse motions. We evaluate the proposed method on a bipedal inverted pendulum robot model and confirm that combining goal-conditioned terminal value learning with an upper-level trajectory planner enables real-time control; thus, the robot successfully tracks a target trajectory on sloped terrain.
comment: 16 pages, 9 figures
Cloud-Based Scheduling Mechanism for Scalable and Resource-Efficient Centralized Controllers
This paper proposes a novel approach to address the challenges of deploying complex robotic software in large-scale systems, i.e., Centralized Nonlinear Model Predictive Controllers (CNMPCs) for multi-agent systems. The proposed approach is based on a Kubernetes-based scheduling mechanism designed to monitor and optimize the operation of CNMPCs, while addressing the scalability limitation of centralized control schemes. By leveraging a cluster in a real-time cloud environment, the proposed mechanism effectively offloads the computational burden of CNMPCs. Through experiments, we have demonstrated the effectiveness and performance of our system, especially in scenarios where the number of robots is subject to change. Our work contributes to the advancement of cloud-based control strategies and lays the foundation for enhanced performance in cloud-controlled robotic systems.
comment: 7 pages, 6 figures, IECON 2024
Active Inference for Closed-loop transmit beamsteering in Fetal Doppler Ultrasound
Doppler ultrasound is widely used to monitor fetal heart rate during labor and pregnancy. Unfortunately, it is highly sensitive to fetal and maternal movements, which can cause the displacement of the fetal heart with respect to the ultrasound beam, in turn reducing the Doppler signal-to-noise ratio and leading to erratic, noisy, or missing heart rate readings. To tackle this issue, we augment the conventional Doppler ultrasound system with a rational agent that autonomously steers the ultrasound beam to track the position of the fetal heart. The proposed cognitive ultrasound system leverages a sequential Monte Carlo method to infer the fetal heart position from the power Doppler signal, and employs a greedy information-seeking criterion to select the steering angle that minimizes the positional uncertainty for future timesteps. The fetal heart rate is then calculated using the Doppler signal at the estimated fetal heart position. Our results show that the system can accurately track the fetal heart position across challenging signal-to-noise ratio scenarios, mainly thanks to its dynamic transmit beam steering capability. Additionally, we find that optimizing the transmit beamsteering to minimize positional uncertainty also optimizes downstream heart rate estimation performance. In conclusion, this work showcases the power of closed-loop cognitive ultrasound in boosting the capabilities of traditional systems.
Predictive Spliner: Data-Driven Overtaking in Autonomous Racing Using Opponent Trajectory Prediction
Head-to-head racing against opponents is a challenging and emerging topic in the domain of autonomous racing. We propose Predictive Spliner, a data-driven overtaking planner that learns the behavior of opponents through Gaussian Process (GP) regression, which is then leveraged to compute viable overtaking maneuvers in future sections of the racing track. Experimentally validated on a 1:10 scale autonomous racing platform using Light Detection and Ranging (LiDAR) information to perceive the opponent, Predictive Spliner outperforms State-of-the-Art (SotA) algorithms by overtaking opponents at up to 83.1% of its own speed, being on average 8.4% faster than the previous best-performing method. Additionally, it achieves an average success rate of 84.5%, which is 47.6% higher than the previous best-performing method. The method maintains computational efficiency with a Central Processing Unit (CPU) load of 22.79% and a computation time of 8.4 ms, evaluated on a Commercial off-the-Shelf (CotS) Intel i7-1165G7, making it suitable for real-time robotic applications. These results highlight the potential of Predictive Spliner to enhance the performance and safety of autonomous racing vehicles. The code for Predictive Spliner is available at: https://github.com/ForzaETH/predictive-spliner.
comment: Submitted to RA-L
State Observer for the Fourth-order Model of a Salient Pole Synchronous Generator with Stator Losses: Known and Partially Unknown Input Cases
In this paper we study the question of how to reconstruct the state of a power system using Phasor Measurement Units (PMUs). In our previous research we proved that this question has an affirmative answer imposing some rather strict structural assumptions: namely, neglecting the generator rotor saliency and assuming that the stator resistance of the synchronous generator is zero. It was shown in simulations that the performance of the proposed observer was sensitive to these assumptions, observing a transient quality degradation for realistic simulations not imposing these assumptions. Moreover, it was assumed in our previous work that the mechanical power and the field voltage are available for measurement, a scenario that is not always realistic. In this paper we accomplish two ambitious objectives. First, we propose a new observer that does not impose the simplifying assumptions on the generator model. Secondly, we consider the more realistic scenario where only mechanical power is available for measurement. That is, we solve a problem of state reconstruction of a nonlinear system with partially known input measurements -- that is well-known to be a very challenging task. The design of the first observer relies on two recent developments proposed by the authors, a parameter estimation based approach to the problem of state estimation and the use of the Dynamic Regressor Extension and Mixing (DREM) technique to estimate these parameters. The use of DREM allows us to overcome the problem of lack of persistent excitation that stymies the application of standard parameter estimation designs. On the other hand, the observer for the partial input measurement scenario relies on the clever exploitation of the system's model. Simulation results illustrate the good performance of the proposed observers.
An Optimized H5 Hysteresis Current Control with Clamped Diodes in Transformer-less Grid-PV Inverter
With the rise of renewable energy penetration in the grid, photovoltaic (PV) panels are connected to the grid via inverters to supply solar energy. Transformer-less grid-tied PV inverters are gaining popularity because of their improved efficiency, reduced size, and lower costs. However, they can induce a path for leakage currents between the PV and the grid part due to the absence of galvanic isolation between them. This leads to serious electromagnetic interference, loss in efficiency and safety concerns. The leakage current is primarily influenced by the nature of the common mode voltage (CMV), which is determined by the switching techniques of the inverter. In this paper, a novel inverter topology of Hysteresis Controlled H5 with Two Clamping Diodes (HCH5-D2) has been derived. The HCH5-D2 topology helps to decouple the AC part (Grid) and DC part (PV) during the freewheeling period to make the CMV constant and, in turn, reduce the leakage current. Also, the additional diodes help to reduce the voltage spikes generated during the freewheeling period and maintain the CMV at a constant value. Finally, a MATLAB simulation of a 2.2 kW grid-connected single-phase HCH5-D2 PV inverter system is presented, with better results than a traditional H4 inverter.
Physics-Informed GNN for non-linear constrained optimization: PINCO a solver for the AC-optimal power flow
The energy transition is driving the integration of large shares of intermittent power sources in the electric power grid. Therefore, addressing the AC optimal power flow (AC-OPF) effectively becomes increasingly essential. The AC-OPF, which is a fundamental optimization problem in power systems, must be solved more frequently to ensure the safe and cost-effective operation of power systems. Due to its non-linear nature, AC-OPF is often solved in its linearized form, despite inherent inaccuracies. Non-linear solvers, such as the interior point method, are typically employed to solve the full OPF problem. However, these iterative methods may not converge for large systems and do not guarantee global optimality. This work explores a physics-informed graph neural network, PINCO, to solve the AC-OPF. We demonstrate that this method provides accurate solutions in a fraction of the computational time when compared to the established non-linear programming solvers. Remarkably, PINCO generalizes effectively across a diverse set of loading conditions in the power system. We show that our method can solve the AC-OPF without violating inequality constraints. Furthermore, it can function both as a solver and as a hybrid universal function approximator. Moreover, the approach can be easily adapted to different power systems with minimal adjustments to the hyperparameters, including systems with multiple generators at each bus. Overall, this work demonstrates an advancement in the field of power system optimization to tackle the challenges of the energy transition. The code and data utilized in this paper are available at https://anonymous.4open.science/r/opf_pinn_iclr-B83E/.
Smart energy management: process structure-based hybrid neural networks for optimal scheduling and economic predictive control in integrated systems
Integrated energy systems (IESs) are complex systems consisting of diverse operating units spanning multiple domains. To address their operational challenges, we propose a physics-informed hybrid time-series neural network (NN) surrogate to predict the dynamic performance of IESs across multiple time scales. This neural network-based modeling approach develops time-series multi-layer perceptrons (MLPs) for the operating units and integrates them with prior process knowledge about system structure and fundamental dynamics. This integration forms three hybrid NNs (long-term, slow, and fast MLPs) that predict the entire system dynamics across multiple time scales. Leveraging these MLPs, we design an NN-based scheduler and an NN-based economic model predictive control (NEMPC) framework to meet global operational requirements: rapid electrical power responsiveness to operators' requests, adequate cooling supply to customers, and increased system profitability, while addressing the dynamic time-scale multiplicity present in IESs. The proposed day-ahead scheduler is formulated using the ReLU network-based MLP, which effectively represents IES performance under a broad range of conditions from a long-term perspective. The scheduler is then exactly recast into a mixed-integer linear programming problem for efficient evaluation. The real-time NEMPC, based on slow and fast MLPs, comprises two sequential distributed control agents: a slow NEMPC for the cooling-dominant subsystem with slower transient responses and a fast NEMPC for the power-dominant subsystem with faster responses. Extensive simulations demonstrate that the developed scheduler and NEMPC schemes outperform their respective benchmark scheduler and controller by about 25% and 40%. Together, they enhance overall system performance by over 70% compared to benchmark approaches.
Transient-Safe and Attack-Resilient Secondary Control in AC Microgrids Under Polynomially Unbounded FDI Attacks
This letter proposes novel, fully distributed, transient-safe resilient secondary control strategies for AC microgrids, addressing unbounded false data injection (FDI) attacks on control input channels. Unlike existing methods that focus primarily on steady-state convergence, our approach guarantees transient safety, ensuring that system states remain within predefined safety bounds even during attack initiation, a critical aspect overlooked in prior research. Given the reduction of network inertia caused by the increasing penetration of inverter-based renewables, large overshooting and intense fluctuations are more likely to occur during transients caused by disturbances and cyber-attacks. To mitigate these risks, the proposed control method enhances defense capabilities against polynomially unbounded FDI attacks, maintaining safe system trajectories for both frequency and voltage throughout the transient response. Through rigorous Lyapunov-based stability analysis, we formally certify the strategies to achieve uniformly ultimately bounded (UUB) convergence in frequency and voltage regulation, and active power sharing across multi-inverter-based AC microgrids. Numerical simulation studies verify the effectiveness of the proposed control protocols, demonstrating improved system reliability, safety and resilience under adverse conditions.
A Universal Formulation for Path-Parametric Planning and Control
This work presents a unified framework for path-parametric planning and control. This formulation is universal as it standardizes the entire spectrum of path-parametric techniques -- from traditional path following to more recent contouring or progress-maximizing Model Predictive Control and Reinforcement Learning -- under a single framework. The ingredients underlying this universality are twofold: First, we present a compact and efficient technique capable of computing singularity-free, smooth and differentiable moving frames. Second, we derive a spatial path parameterization of the Cartesian coordinates applicable to any arbitrary curve without prior assumptions on its parametric speed or moving frame, and that perfectly interplays with the aforementioned path parameterization method. The combination of these two ingredients leads to a planning and control framework that brings together existing path-parametric techniques in the literature. Aiming to unify all these approaches, we open source PACOR, a software library that implements the presented content, thereby providing a self-contained toolkit for the formulation of path-parametric planning and control methods.
comment: Preprint. Code: https://github.com/jonarriza96/PACOR
Path Planning and Robust Path Tracking Control of an Automated Parallel Parking Maneuver
Self driving vehicles should be able to perform parallel parking or a similar maneuver successfully. With this motivation, the S shaped maneuverability test of the Ohio driver license examination is chosen here for automatic execution by a self driving vehicle with drive by wire capability and longitudinal and lateral controls. The Ohio maneuverability test requires the driver to start within an area enclosed by four pylons and the driver is asked to go to the left of the fifth pylon directly in front of the vehicle in a smooth and continuous manner while ending in a parallel direction to the initial one. The driver is then asked to go backwards to the starting location of the vehicle without stopping the vehicle or hitting the pylons. As a self driving vehicle should do a much better job repeatably than a driver, a high order polynomial path model is built along with speed profiling to start and stop smoothly at the ends of the path without large longitudinal and lateral accelerations. In contrast to the long horizon, higher speed path planning and path tracking control applications in the literature, this paper treats low speed and very short horizon path planning and path tracking control with stopping and direction reversal. The path is constructed using a segmented polynomial fit optimization routine that guarantees path curvature smoothness. A linear path tracking model is utilized as the basis of the designed control system consisting of a disturbance observer based curvature rejection filter and a speed scheduled, parameter space robust PID controller. Simulation studies indicate that it has better performance compared to other common control systems such as standalone PID controller and combined PID and feedforward control.
comment: 12 pages, 19 figures
Structural Constraints for Physics-augmented Learning
When the physics is wrong, physics-informed machine learning becomes physics-misinformed machine learning. A powerful black-box model should not be able to conceal misconceived physics. We propose two criteria that can be used to assert the integrity of a hybrid (physics plus black-box) model: 0) the black-box model should be unable to replicate the physical model, and 1) any best-fit hybrid model has the same physical parameter as a best-fit standalone physics model. We demonstrate them for a sample nonlinear mechanical system approximated by its small-signal linearization.
Nonlinear High-Pass Filters
Linear high-pass phenomena matter in signal processing, circuits, and control. In nonlinear systems, however, there is no working definition of high-pass behavior. Any definition would have to agree with the existing theory on linear systems and offer concrete benefits for nonlinear systems above and beyond existing nonlinear theory. To satisfy these two requirements, we propose to define: a nonlinear input-output system is high-pass if its output is stable with respect to the derivative of the input. We first show that this definition generalizes high-pass resistor-capacitor circuit analysis to accommodate nonlinear resistors. We then show that this definition generalizes the steady-state disturbance rejection property of integral feedback controllers for linear systems. The theoretical payoff is that low-frequency disturbance rejection is captured by a quantitative, non-asymptotic output cost bound. Finally, we raise theoretical questions about compositionality and noncommutativity of nonlinear operators.
comment: preprint submitted to ACC 2025
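The idea that a high-pass output is bounded in terms of the input derivative can be illustrated on the familiar linear resistor-capacitor case. The sketch below is only an illustration under stated assumptions, not the paper's construction: it assumes the standard linear RC high-pass model rc * y' + y = rc * u', integrates it with forward Euler, and uses hypothetical names (simulate_rc_high_pass, du, rc). For this model the output magnitude never exceeds rc times the largest input slope, and a constant input produces zero output.

```python
import math


def simulate_rc_high_pass(du, rc, t_end, dt=1e-3):
    """Forward-Euler simulation of rc * y' + y = rc * u'(t) with y(0) = 0.

    du is a function returning the input derivative u'(t).
    Returns the output samples y(0), y(dt), ..., y(t_end).
    """
    y = 0.0
    ys = [y]
    for k in range(int(round(t_end / dt))):
        y += dt * (-y / rc + du(k * dt))
        ys.append(y)
    return ys


# Ramp input u(t) = 2 t, so u'(t) = 2: the output settles near rc * 2.
ramp_out = simulate_rc_high_pass(lambda t: 2.0, rc=0.5, t_end=5.0)

# Sinusoidal input u(t) = sin(3 t), so |u'(t)| <= 3.
sine_out = simulate_rc_high_pass(
    lambda t: 3.0 * math.cos(3.0 * t), rc=0.5, t_end=5.0
)

# Constant input, u'(t) = 0: a high-pass system passes nothing.
const_out = simulate_rc_high_pass(lambda t: 0.0, rc=0.5, t_end=5.0)
```

Here the bound depends only on the slope of the input, not on its size, which is the property the proposed definition extends to nonlinear systems.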
Propeller damage detection, classification and estimation in multirotor vehicles
This manuscript details an architecture and training methodology for a data-driven framework aimed at detecting, identifying, and quantifying damage in the propeller blades of multirotor Unmanned Aerial Vehicles. By substituting one propeller with a damaged counterpart (encompassing three distinct damage types of varying severity), real flight data was collected. This data was then used to train a composite model, comprising both classifiers and neural networks, capable of accurately identifying the type of failure, estimating damage severity, and pinpointing the affected rotor. The data employed for this analysis was exclusively sourced from inertial measurements and control command inputs, ensuring adaptability across diverse multirotor vehicle platforms.
comment: 24 pages, 18 figures, 9 tables
Modeling Buffer Occupancy in bittide Systems
The bittide mechanism enables logically synchronous computation across distributed systems by leveraging the continuous frame transmission inherent to wired networks such as Ethernet. Instead of relying on a global clock, bittide uses a decentralized control system to adjust local clock frequencies, ensuring all nodes operate with a consistent notion of time by utilizing elastic buffers at each node to absorb frequency variations. This paper presents an analysis of the steady-state occupancy of these elastic buffers, a critical factor influencing system latency. Using a fluid model of the bittide system, we prove that buffer occupancy converges and derive an explicit formula for the steady-state value in terms of system parameters, including network topology, physical latencies, and controller gains. This analysis provides valuable insights for optimizing buffer sizes and minimizing latency in bittide-based distributed systems.
Synthesizing Interpretable Control Policies through Large Language Model Guided Search
The combination of Large Language Models (LLMs), systematic evaluation, and evolutionary algorithms has enabled breakthroughs in combinatorial optimization and scientific discovery. We propose to extend this powerful combination to the control of dynamical systems, generating interpretable control policies capable of complex behaviors. With our novel method, we represent control policies as programs in standard languages like Python. We evaluate candidate controllers in simulation and evolve them using a pre-trained LLM. Unlike conventional learning-based control techniques, which rely on black box neural networks to encode control policies, our approach enhances transparency and interpretability. We still take advantage of the power of large AI models, but leverage it at the policy design phase, ensuring that all system components remain interpretable and easily verifiable at runtime. Additionally, the use of standard programming languages makes it straightforward for humans to finetune or adapt the controllers based on their expertise and intuition. We illustrate our method through its application to the synthesis of an interpretable control policy for the pendulum swing-up and the ball in cup tasks. We make the code available at https://github.com/muellerlab/synthesizing_interpretable_control_policies.git
comment: 8 pages, 7 figures, conference paper
Evaluating internal and external dissonance of belief dynamics in social systems
Belief dynamics are fundamental to human behavior and social coordination. Individuals rely on accurate beliefs to make decisions, and shared beliefs form the basis of successful cooperation. Traditional studies often examined beliefs in isolation, but recent perspectives suggest beliefs operate as interconnected systems, both within individuals and across social networks. To better understand belief dynamics, we propose an extension of Galesic et al.'s model, which allows individuals to weigh internal and social dissonance based on belief certainty. Our model suggests that belief convergence occurs in two patterns: internal alignment, where beliefs become ideologically consistent but socially disagreeable, or social alignment, where beliefs become socially consistent but internally varied. These results highlight a competition between internal and social belief networks, with one network often dominating. Our findings suggest that belief dynamics tend to settle at extremes, indicating a need for future models to incorporate negative feedback to reflect more nuanced societal belief changes.
comment: 2 pages, 3 figures, conference
Learning-Based Shielding for Safe Autonomy under Unknown Dynamics
Shielding is a common method used to guarantee the safety of a system under a black-box controller, such as a neural network controller from deep reinforcement learning (DRL), with simpler, verified controllers. Existing shielding methods rely on formal verification through Markov Decision Processes (MDPs), assuming either known or finite-state models, which limits their applicability to DRL settings with unknown, continuous-state systems. This paper addresses these limitations by proposing a data-driven shielding methodology that guarantees safety for unknown systems under black-box controllers. The approach leverages Deep Kernel Learning to model the system's one-step evolution with uncertainty quantification and constructs a finite-state abstraction as an Interval MDP (IMDP). By focusing on safety properties expressed in safe linear temporal logic (safe LTL), we develop an algorithm that computes the maximally permissive set of safe policies on the IMDP, ensuring avoidance of unsafe states. The algorithm's soundness and computational complexity are demonstrated through theoretical proofs and experiments on nonlinear systems, including a high-dimensional autonomous spacecraft scenario.
comment: 8 pages, 3 figures
Who should pay for frequency-containment ancillary services? Making responsible units bear the cost to shape investment in generation and loads
While the operating cost of electricity grids based on thermal generation was largely driven by the cost of fuel, as renewable penetration increases, ancillary services represent an increasingly large proportion of the running costs. Electric frequency is an important magnitude in highly renewable grids, as it becomes more volatile and therefore the cost related to maintaining it within safe bounds has significantly increased. So far, costs for frequency-containment ancillary services have been socialised in most countries, but it has become relevant to rethink this regulatory arrangement. In this paper, we discuss the issue of cost allocation for these services, highlighting the need to evolve towards a causation-based regulatory framework. We argue that parties responsible for creating the need for ancillary services should bear these costs. However, this would imply an important change in electricity market policy; it is therefore necessary to understand the impact on current and future investments in generation, as well as on electricity tariffs. Here we provide a mostly qualitative analysis of this issue, defining guidelines for practical implementation and further study.
comment: Published in journal Energy Policy
Integrated Optimal Fast Charging and Active Thermal Management of Lithium-Ion Batteries in Extreme Ambient Temperatures
This paper presents an integrated control strategy for optimal fast charging and active thermal management of Lithium-ion batteries in extreme ambient temperatures, striking a balance between charging speed and battery health. A control-oriented thermal-NDC (nonlinear double-capacitor) battery model is proposed to describe the electrical and thermal dynamics, incorporating the effects of both an active thermal source and ambient temperature. A state-feedback model predictive control algorithm is then developed for optimal fast charging and active thermal management. Numerical experiments validate the algorithm under extreme temperatures, showing that the proposed algorithm can energy-efficiently adjust the battery temperature, thereby balancing charging speed and battery health. Additionally, an output-feedback model predictive control algorithm with an extended Kalman filter is proposed for battery charging when states are partially measurable. Numerical experiments validate the effectiveness under extreme temperatures.
Barycentric rational approximation for learning the index of a dynamical system from limited data
We consider the task of data-driven identification of dynamical systems, specifically for systems whose behavior at large frequencies is non-standard, as encoded by a non-trivial relative degree of the transfer function or, alternatively, a non-trivial index of a corresponding realization as a descriptor system. We develop novel surrogate modeling strategies that allow state-of-the-art rational approximation algorithms (e.g., AAA and vector fitting) to better handle data coming from such systems with non-trivial relative degree. Our contribution is twofold. On one hand, we describe a strategy to build rational surrogate models with prescribed relative degree, with the objective of mirroring the high-frequency behavior of the high-fidelity problem, when known. The surrogate model's desired degree is achieved through constraints on its barycentric coefficients, rather than through ad-hoc modifications of the rational form. On the other hand, we present a degree-identification routine that allows one to estimate the unknown relative degree of a system from low-frequency data. By identifying the degree of the system that generated the data, we can build a surrogate model that, in addition to matching the data well (at low frequencies), has enhanced extrapolation capabilities (at high frequencies). We showcase the effectiveness and robustness of the newly proposed method through a suite of numerical tests.
comment: 20 pages, 5 figures
An active learning method for solving competitive multi-agent decision-making and control problems
To identify a stationary action profile for a population of competitive agents, each executing private strategies, we introduce a novel active-learning scheme where a centralized external observer (or entity) can probe the agents' reactions and recursively update simple local parametric estimates of the action-reaction mappings. Under very general working assumptions (not even assuming that a stationary profile exists), sufficient conditions are established to assess the asymptotic properties of the proposed active learning methodology so that, if the parameters characterizing the action-reaction mappings converge, a stationary action profile is achieved. Such conditions hence act also as certificates for the existence of such a profile. Extensive numerical simulations involving typical competitive multi-agent control and decision-making problems illustrate the practical effectiveness of the proposed learning-based approach.
comment: Python package available at https://github.com/bemporad/gnep-learn
CBF-LLM: Safe Control for LLM Alignment
This paper proposes a control-based framework for aligning large language models (LLMs) by leveraging a control barrier function (CBF) to ensure user-desirable text generation. The presented framework applies the safety filter, designed based on the CBF, to the output generation of the baseline LLM, i.e., the sequence of the token, with the aim of intervening in the generated text. The overall text-generation system is implemented with Llama 3 and a RoBERTa model, and the source code is available at https://github.com/Mya-Mya/CBF-LLM. The experiment demonstrates its control ability and effectiveness in reducing the number of interventions needed for user-specified alignment tasks.
Auto-Multilift: Distributed Learning and Control for Cooperative Load Transportation With Quadrotors
Designing motion control and planning algorithms for multilift systems remains challenging due to the complexities of dynamics, collision avoidance, actuator limits, and scalability. Existing methods that use optimization and distributed techniques effectively address these constraints and scalability issues. However, they often require substantial manual tuning, leading to suboptimal performance. This paper proposes Auto-Multilift, a novel framework that automates the tuning of model predictive controllers (MPCs) for multilift systems. We model the MPC cost functions with deep neural networks (DNNs), enabling fast online adaptation to various scenarios. We develop a distributed policy gradient algorithm to train these DNNs efficiently in a closed-loop manner. Central to our algorithm is distributed sensitivity propagation, which is built on fully exploiting the unique dynamic couplings within the multilift system. It parallelizes gradient computation across quadrotors and focuses on actual system state sensitivities relative to key MPC parameters. Extensive simulations demonstrate favorable scalability to a large number of quadrotors. Our method outperforms a state-of-the-art open-loop MPC tuning approach by effectively learning adaptive MPCs from trajectory tracking errors. It also excels in learning an adaptive reference for reconfiguring the system when traversing multiple narrow slots.
Efficient Shield Synthesis via State-Space Transformation
We consider the problem of synthesizing safety strategies for control systems, also known as shields. Since the state space is infinite, shields are typically computed over a finite-state abstraction, with the most common abstraction being a rectangular grid. However, for many systems, such a grid does not align well with the safety property or the system dynamics. That is why a coarse grid is rarely sufficient, but a fine grid is typically computationally infeasible to obtain. In this paper, we show that appropriate state-space transformations can still allow the use of a coarse grid at almost no computational overhead. We demonstrate in three case studies that our transformation-based synthesis outperforms a standard synthesis by several orders of magnitude. In the first two case studies, we use domain knowledge to select a suitable transformation. In the third case study, we instead report on results in engineering a transformation without domain knowledge.
A Moreau Envelope Approach for LQR Meta-Policy Estimation
We study the problem of policy estimation for the Linear Quadratic Regulator (LQR) in discrete-time linear time-invariant uncertain dynamical systems. We propose a Moreau Envelope-based surrogate LQR cost, built from a finite set of realizations of the uncertain system, to define a meta-policy efficiently adjustable to new realizations. Moreover, we design an algorithm to find an approximate first-order stationary point of the meta-LQR cost function. Numerical results show that the proposed approach outperforms naive averaging of controllers on new realizations of the linear system. We also provide empirical evidence that our method has better sample complexity than Model-Agnostic Meta-Learning (MAML) approaches.
comment: Accepted for presentation at Conference on Decision and Control 2024 (CDC'24)
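For readers unfamiliar with the Moreau envelope, a one-dimensional example may help. The envelope of a function f with parameter lam is the minimum over y of f(y) + (x - y)^2 / (2 * lam). The sketch below is a textbook illustration only and does not implement the paper's surrogate LQR cost: it takes f(y) = |y|, whose minimizer is the soft-thresholding operator and whose envelope is the Huber function. All function names are hypothetical.

```python
def prox_abs(x, lam):
    """Proximal operator of f(y) = |y| with parameter lam (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moreau_envelope_abs(x, lam):
    """Moreau envelope of |.|: min over y of |y| + (x - y)^2 / (2 * lam)."""
    y = prox_abs(x, lam)  # the minimizer of the inner problem
    return abs(y) + (x - y) ** 2 / (2.0 * lam)


def huber(x, lam):
    """Huber function, the known closed form of the envelope above."""
    if abs(x) <= lam:
        return x * x / (2.0 * lam)
    return abs(x) - lam / 2.0
```

The envelope is smooth at the origin even though |y| is not, which is the usual reason for using it as a surrogate for a cost that is awkward to optimize directly.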
Adaptive Step Duration for Precise Foot Placement: Achieving Robust Bipedal Locomotion on Terrains with Restricted Footholds ICRA 2025
Traditional one-step preview planning algorithms for bipedal locomotion struggle to generate viable gaits when walking across terrains with restricted footholds, such as stepping stones. To overcome such limitations, this paper introduces a novel multi-step preview foot placement planning algorithm based on the step-to-step discrete evolution of the Divergent Component of Motion (DCM) of walking robots. Our proposed approach adaptively changes the step duration and the swing foot trajectory for optimal foot placement under constraints, thereby enhancing the long-term stability of the robot and significantly improving its ability to navigate environments with tight constraints on viable footholds. We demonstrate its effectiveness through various simulation scenarios with complex stepping-stone configurations and external perturbations. These tests underscore its improved performance for navigating foothold-restricted terrains, even with external disturbances.
comment: 7 pages, 7 figures, submitted to ICRA 2025, for associated simulation video, see https://youtu.be/DjH69m1kbnM
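To see why step duration is a useful decision variable, consider the standard one-dimensional linear inverted pendulum model, in which the DCM xi obeys xi' = omega * (xi - p) for a fixed stance foot position p and omega = sqrt(g / z_com). The sketch below is a minimal illustration under that assumption, not the multi-step planner described above; the function names and the numerical values are hypothetical.

```python
import math


def dcm_after_step(xi0, p, duration, z_com, g=9.81):
    """DCM at the end of a step for the 1-D linear inverted pendulum.

    Solves xi' = omega * (xi - p) from xi(0) = xi0 with the stance foot at p.
    """
    omega = math.sqrt(g / z_com)
    return p + (xi0 - p) * math.exp(omega * duration)


def step_duration_for_target(xi0, p, xi_target, z_com, g=9.81):
    """Step duration that drives the DCM from xi0 to xi_target.

    The DCM only diverges away from p, so the target must lie farther
    from p than xi0 on the same side; otherwise no positive duration exists.
    """
    ratio = (xi_target - p) / (xi0 - p)
    if ratio <= 1.0:
        raise ValueError("target is not reachable with a positive duration")
    omega = math.sqrt(g / z_com)
    return math.log(ratio) / omega


# Hypothetical numbers: CoM height 0.8 m, DCM 5 cm ahead of the stance foot.
duration = step_duration_for_target(xi0=0.05, p=0.0, xi_target=0.20, z_com=0.8)
xi_end = dcm_after_step(xi0=0.05, p=0.0, duration=duration, z_com=0.8)
```

When the next foothold is fixed by the terrain, adjusting the duration is one of the few remaining ways to control where the DCM ends up relative to that foothold.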
Koopman Analysis of the Singularly-Perturbed van der Pol Oscillator
The Koopman operator framework holds promise for spectral analysis of nonlinear dynamical systems based on linear operators. Eigenvalues and eigenfunctions of the Koopman operator, so-called Koopman eigenvalues and Koopman eigenfunctions, respectively, mirror global properties of the system's flow. In this paper we perform the Koopman analysis of the singularly-perturbed van der Pol system. First, we show the spectral signature depending on singular perturbation: how two Koopman principal eigenvalues are ordered and what distinct shapes emerge in their associated Koopman eigenfunctions. Second, we discuss the singular limit of the Koopman operator, which is derived through the concatenation of Koopman operators for the fast and slow subsystems. From the spectral properties of the Koopman operator for the singularly-perturbed system and the singular limit, we suggest that the Koopman eigenfunctions inherit geometric properties of the singularly-perturbed system. These results are applicable to general planar singularly-perturbed systems with stable limit cycles.
comment: 21 pages, 10 figures
Risk of Cascading Collisions in Network of Vehicles with Delayed Communication
This paper establishes and explores a framework to analyze the risk of cascading failures in a platoon of autonomous vehicles, accounting for communication time-delays and input uncertainty. Our proposed framework yields closed-form expressions for cascading collisions, which we quantify using the coherent Average Value-at-Risk (AV@R) to assess the cascading effect of vehicle collisions within the platoon. We investigate how factors such as network connectivity, system dynamics, communication delays, and uncertainty contribute to the emergence of cascading failures. Our findings are extended to standard communication graphs with symmetries, allowing us to evaluate the risk of cascading collisions from a platoon design perspective. Furthermore, by discovering the boundedness of the inter-vehicle distances, we reveal the best achievable risk of cascading collision with general graph topologies, which is further specified for special communication graphs, such as the complete graph. Our theoretical results pave the way for the development of a safety-aware framework aimed at mitigating the risk of cascading collisions in vehicle platoons.
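For readers unfamiliar with the risk measure, the following is a minimal sample-based sketch of an Average Value-at-Risk estimate: the mean of the worst (1 - alpha) fraction of observed losses. The paper derives closed-form expressions instead, and its confidence-level convention may differ; the function name and the example data here are hypothetical.

```python
import math


def average_value_at_risk(losses, alpha):
    """Empirical Average Value-at-Risk at level `alpha` in [0, 1).

    Returns the mean of the largest ceil((1 - alpha) * n) losses, i.e. the
    average over the worst-case tail. This is a generic estimator, not the
    closed-form expression from the paper.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not losses:
        raise ValueError("losses must be non-empty")
    ordered = sorted(losses, reverse=True)
    # round() guards against floating-point noise such as 2.0000000000000004.
    tail_size = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:tail_size]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    sample_losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical data
    print(average_value_at_risk(sample_losses, 0.8))
```

With `alpha = 0`, the estimate reduces to the ordinary mean, and as `alpha` approaches 1 it approaches the single worst observed loss.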
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
Reach-Avoid-Stay (RAS) optimal control enables systems such as robots and air taxis to reach their targets, avoid obstacles, and stay near the target. However, current methods for RAS often struggle with handling complex, dynamic environments and scaling to high-dimensional systems. While reinforcement learning (RL)-based reachability analysis addresses these challenges, it has yet to tackle the RAS problem. In this paper, we propose a two-step deep deterministic policy gradient (DDPG) method that extends the RL-based reachability method to solve RAS problems. First, we train a function that characterizes the maximal robust control invariant set within the target set, where the system can safely stay, along with its corresponding policy. Second, we train a function that defines the set of states capable of safely reaching the robust control invariant set, along with its corresponding policy. We prove that this method results in the maximal robust RAS set in the absence of training errors and demonstrate that it enables RAS in complex environments, scales to high-dimensional systems, and achieves higher success rates for the RAS task compared to previous methods, validated through one simulation and two high-dimensional experiments.
Systems and Control (EESS)
Upgrading SPHERE with the second stage AO system SAXO+: frequency-based data-driven controller for adaptive optics
This study introduces a novel frequency-based data-driven controller for adaptive optics, using power spectral density for optimization while ensuring stability criteria. It addresses disturbance rejection, command amplitude constraints and system transfer functions through convex optimization to obtain an optimal control in an infinite impulse response filter form. Evaluated within the SAXO+ project, it demonstrates efficacy under diverse atmospheric conditions and operational scenarios. The proposed controller is tested in both standard and disentangled adaptive optics schemes, showcasing its adaptability and performance. Experimental validation is conducted using the COMPASS simulation tool, affirming the controller's promise for enhancing adaptive optics systems in real-world applications.
AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search
Quantum computers have the potential to outperform classical computers in important tasks such as optimization and number factoring. They are characterized by limited connectivity, which necessitates the routing of their computational bits, known as qubits, to specific locations during program execution to carry out quantum operations. Traditionally, the NP-hard optimization problem of minimizing the routing overhead has been addressed through sub-optimal rule-based routing techniques with inherent human biases embedded within the cost function design. This paper introduces a solution that integrates Monte Carlo Tree Search (MCTS) with Reinforcement Learning (RL). Our RL-based router, called AlphaRouter, outperforms the current state-of-the-art routing methods and generates quantum programs with up to $20\%$ less routing overhead, thus significantly enhancing the overall efficiency and feasibility of quantum computing.
comment: 11 pages, 11 figures, International Conference on Quantum Computing and Engineering - QCE24
Reinforcement Learning Control for Autonomous Hydraulic Material Handling Machines with Underactuated Tools IROS 2024
The precise and safe control of heavy material handling machines presents numerous challenges due to the hard-to-model hydraulically actuated joints and the need for collision-free trajectory planning with a free-swinging end-effector tool. In this work, we propose an RL-based controller that commands the cabin joint and the arm simultaneously. It is trained in a simulation combining data-driven modeling techniques with first-principles modeling. On the one hand, we employ a neural network model to capture the highly nonlinear dynamics of the upper carriage turn hydraulic motor, incorporating explicit pressure prediction to handle delays better. On the other hand, we model the arm as velocity-controllable and the free-swinging end-effector tool as a damped pendulum using first principles. This combined model enhances our simulation environment, enabling the training of RL controllers that can be directly transferred to the real machine. Designed to reach steady-state Cartesian targets, the RL controller learns to leverage the hydraulic dynamics to improve accuracy, maintain high speeds, and minimize end-effector tool oscillations. Our controller, tested on a mid-size prototype material handler, is more accurate than an inexperienced operator and causes fewer tool oscillations. It demonstrates competitive performance even compared to an experienced professional driver.
comment: Presented at IROS 2024, Abu Dhabi, as oral presentation
Function Gradient Approximation with Random Shallow ReLU Networks with Control Applications
Neural networks are widely used to approximate unknown functions in control. A common neural network architecture uses a single hidden layer (i.e. a shallow network), in which the input parameters are fixed in advance and only the output parameters are trained. The typical formal analysis asserts that if output parameters exist to approximate the unknown function with sufficient accuracy, then desired control performance can be achieved. A long-standing theoretical gap was that no conditions existed to guarantee that, for the fixed input parameters, required accuracy could be obtained by training the output parameters. Our recent work has partially closed this gap by demonstrating that if input parameters are chosen randomly, then for any sufficiently smooth function, with high probability there are output parameters resulting in $O((1/m)^{1/2})$ approximation errors, where $m$ is the number of neurons. However, some applications, notably continuous-time value function approximation, require that the network approximates both the unknown function and its gradient with sufficient accuracy. In this paper, we show that randomly generated input parameters and trained output parameters result in gradient errors of $O((\log(m)/m)^{1/2})$, and additionally, improve the constants from our prior work. We show how to apply the result to policy evaluation problems.
comment: Under Review for American Control Conference, 2025
Safe Learning-Based Optimization of Model Predictive Control: Application to Battery Fast-Charging
Model predictive control (MPC) is a powerful tool for controlling complex nonlinear systems under constraints, but often struggles with model uncertainties and the design of suitable cost functions. To address these challenges, we discuss an approach that integrates MPC with safe Bayesian optimization to optimize long-term closed-loop performance despite significant model-plant mismatches. By parameterizing the MPC stage cost function using a radial basis function network, we employ Bayesian optimization as a multi-episode learning strategy to tune the controller without relying on precise system models. This method mitigates conservativeness introduced by overly cautious soft constraints in the MPC cost function and provides probabilistic safety guarantees during learning, ensuring that safety-critical constraints are met with high probability. As a practical application, we apply our approach to fast charging of lithium-ion batteries, a challenging task due to the complicated battery dynamics and strict safety requirements, subject to the requirement to be implementable in real time. Simulation results demonstrate that, in the context of model-plant mismatch, our method reduces charging times compared to traditional MPC methods while maintaining safety. This work extends previous research by emphasizing closed-loop constraint satisfaction and offers a promising solution for enhancing performance in systems where model uncertainties and safety are critical concerns.
comment: 7 pages, 4 figures, submitted to ACC 2025
Goal-Conditioned Terminal Value Estimation for Real-time and Multi-task Model Predictive Control
While MPC enables nonlinear feedback control by solving an optimal control problem at each timestep, the computational burden tends to be significantly large, making it difficult to optimize a policy within the control period. To address this issue, one possible approach is to utilize terminal value learning to reduce computational costs. However, the learned value cannot be used for other tasks in situations where the task dynamically changes in the original MPC setup. In this study, we develop an MPC framework with goal-conditioned terminal value learning to achieve multitask policy optimization while reducing computational time. Furthermore, by using a hierarchical control structure that allows the upper-level trajectory planner to output appropriate goal-conditioned trajectories, we demonstrate that a robot model is able to generate diverse motions. We evaluate the proposed method on a bipedal inverted pendulum robot model and confirm that combining goal-conditioned terminal value learning with an upper-level trajectory planner enables real-time control; thus, the robot successfully tracks a target trajectory on sloped terrain.
comment: 16 pages, 9 figures
Cloud-Based Scheduling Mechanism for Scalable and Resource-Efficient Centralized Controllers
This paper proposes a novel approach to address the challenges of deploying complex robotic software in large-scale systems, i.e., Centralized Nonlinear Model Predictive Controllers (CNMPCs) for multi-agent systems. The proposed approach is based on a Kubernetes-based scheduling mechanism designed to monitor and optimize the operation of CNMPCs, while addressing the scalability limitation of centralized control schemes. By leveraging a cluster in a real-time cloud environment, the proposed mechanism effectively offloads the computational burden of CNMPCs. Through experiments, we have demonstrated the effectiveness and performance of our system, especially in scenarios where the number of robots is subject to change. Our work contributes to the advancement of cloud-based control strategies and lays the foundation for enhanced performance in cloud-controlled robotic systems.
comment: 7 pages, 6 figures, IECON 2024
Active Inference for Closed-loop transmit beamsteering in Fetal Doppler Ultrasound
Doppler ultrasound is widely used to monitor fetal heart rate during labor and pregnancy. Unfortunately, it is highly sensitive to fetal and maternal movements, which can cause the displacement of the fetal heart with respect to the ultrasound beam, in turn reducing the Doppler signal-to-noise ratio and leading to erratic, noisy, or missing heart rate readings. To tackle this issue, we augment the conventional Doppler ultrasound system with a rational agent that autonomously steers the ultrasound beam to track the position of the fetal heart. The proposed cognitive ultrasound system leverages a sequential Monte Carlo method to infer the fetal heart position from the power Doppler signal, and employs a greedy information-seeking criterion to select the steering angle that minimizes the positional uncertainty for future timesteps. The fetal heart rate is then calculated using the Doppler signal at the estimated fetal heart position. Our results show that the system can accurately track the fetal heart position across challenging signal-to-noise ratio scenarios, mainly thanks to its dynamic transmit beam steering capability. Additionally, we find that optimizing the transmit beamsteering to minimize positional uncertainty also optimizes downstream heart rate estimation performance. In conclusion, this work showcases the power of closed-loop cognitive ultrasound in boosting the capabilities of traditional systems.
Predictive Spliner: Data-Driven Overtaking in Autonomous Racing Using Opponent Trajectory Prediction
Head-to-head racing against opponents is a challenging and emerging topic in the domain of autonomous racing. We propose Predictive Spliner, a data-driven overtaking planner that learns the behavior of opponents through Gaussian Process (GP) regression, which is then leveraged to compute viable overtaking maneuvers in future sections of the racing track. Experimentally validated on a 1:10 scale autonomous racing platform using Light Detection and Ranging (LiDAR) information to perceive the opponent, Predictive Spliner outperforms State-of-the-Art (SotA) algorithms by overtaking opponents at up to 83.1% of its own speed, being on average 8.4% faster than the previous best-performing method. Additionally, it achieves an average success rate of 84.5%, which is 47.6% higher than the previous best-performing method. The method maintains computational efficiency with a Central Processing Unit (CPU) load of 22.79% and a computation time of 8.4 ms, evaluated on a Commercial off-the-Shelf (CotS) Intel i7-1165G7, making it suitable for real-time robotic applications. These results highlight the potential of Predictive Spliner to enhance the performance and safety of autonomous racing vehicles. The code for Predictive Spliner is available at: https://github.com/ForzaETH/predictive-spliner.
comment: Submitted to RA-L
State Observer for the Fourth-order Model of a Salient Pole Synchronous Generator with Stator Losses: Known and Partially Unknown Input Cases
In this paper we study the question of how to reconstruct the state of a power system using Phasor Measurement Units (PMUs). In our previous research we proved that this question has an affirmative answer under some rather strict structural assumptions: namely, neglecting the generator rotor saliency and assuming that the stator resistance of the synchronous generator is zero. It was shown in simulations that the performance of the proposed observer was sensitive to these assumptions, observing a transient quality degradation for realistic simulations not imposing these assumptions. Moreover, it was assumed in our previous work that the mechanical power and the field voltage are available for measurement, a scenario that is not always realistic. In this paper we accomplish two ambitious objectives. First, we propose a new observer that does not impose the simplifying assumptions on the generator model. Secondly, we consider the more realistic scenario where only mechanical power is available for measurement. That is, we solve a problem of state reconstruction of a nonlinear system with partially known input measurements, which is well known to be a very challenging task. The design of the first observer relies on two recent developments proposed by the authors: a parameter-estimation-based approach to the problem of state estimation and the use of the Dynamic Regressor Extension and Mixing (DREM) technique to estimate these parameters. The use of DREM allows us to overcome the problem of lack of persistent excitation that stymies the application of standard parameter estimation designs. On the other hand, the observer for the partial input measurement scenario relies on the clever exploitation of the system's model. Simulation results illustrate the good performance of the proposed observers.
An Optimized H5 Hysteresis Current Control with Clamped Diodes in Transformer-less Grid-PV Inverter
With the rise of renewable energy penetration in the grid, photovoltaic (PV) panels are connected to the grid via inverters to supply solar energy. Transformer-less grid-tied PV inverters are gaining popularity because of their improved efficiency, reduced size, and lower costs. However, they can induce a path for leakage currents between the PV and the grid part due to the absence of galvanic isolation between them. This leads to serious electromagnetic interference, loss in efficiency and safety concerns. The leakage current is primarily influenced by the nature of the common mode voltage (CMV), which is determined by the switching techniques of the inverter. In this paper, a novel inverter topology of Hysteresis Controlled H5 with Two Clamping Diodes (HCH5-D2) has been derived. The HCH5-D2 topology helps to decouple the AC part (Grid) and DC part (PV) during the freewheeling to make the CMV constant and in turn, reduces the leakage current. Also, the additional diodes help to reduce the voltage spikes generated during the freewheeling period and maintain the CMV at a constant value. Finally, a 2.2kW grid-connected single-phase HCH5-D2 PV inverter system's MATLAB simulation has been presented with better results when compared with a traditional H4 inverter.
Physics-Informed GNN for non-linear constrained optimization: PINCO a solver for the AC-optimal power flow
The energy transition is driving the integration of large shares of intermittent power sources in the electric power grid. Therefore, addressing the AC optimal power flow (AC-OPF) effectively becomes increasingly essential. The AC-OPF, which is a fundamental optimization problem in power systems, must be solved more frequently to ensure the safe and cost-effective operation of power systems. Due to its non-linear nature, AC-OPF is often solved in its linearized form, despite inherent inaccuracies. Non-linear solvers, such as the interior point method, are typically employed to solve the full OPF problem. However, these iterative methods may not converge for large systems and do not guarantee global optimality. This work explores a physics-informed graph neural network, PINCO, to solve the AC-OPF. We demonstrate that this method provides accurate solutions in a fraction of the computational time when compared to the established non-linear programming solvers. Remarkably, PINCO generalizes effectively across a diverse set of loading conditions in the power system. We show that our method can solve the AC-OPF without violating inequality constraints. Furthermore, it can function both as a solver and as a hybrid universal function approximator. Moreover, the approach can be easily adapted to different power systems with minimal adjustments to the hyperparameters, including systems with multiple generators at each bus. Overall, this work demonstrates an advancement in the field of power system optimization to tackle the challenges of the energy transition. The code and data utilized in this paper are available at https://anonymous.4open.science/r/opf_pinn_iclr-B83E/.
Smart energy management: process structure-based hybrid neural networks for optimal scheduling and economic predictive control in integrated systems
Integrated energy systems (IESs) are complex systems consisting of diverse operating units spanning multiple domains. To address their operational challenges, we propose a physics-informed hybrid time-series neural network (NN) surrogate to predict the dynamic performance of IESs across multiple time scales. This neural network-based modeling approach develops time-series multi-layer perceptrons (MLPs) for the operating units and integrates them with prior process knowledge about system structure and fundamental dynamics. This integration forms three hybrid NNs (long-term, slow, and fast MLPs) that predict the entire system dynamics across multiple time scales. Leveraging these MLPs, we design an NN-based scheduler and an NN-based economic model predictive control (NEMPC) framework to meet global operational requirements: rapid electrical power responsiveness to operators' requests, adequate cooling supply to customers, and increased system profitability, while addressing the dynamic time-scale multiplicity present in IESs. The proposed day-ahead scheduler is formulated using the ReLU network-based MLP, which effectively represents IES performance under a broad range of conditions from a long-term perspective. The scheduler is then exactly recast into a mixed-integer linear programming problem for efficient evaluation. The real-time NEMPC, based on slow and fast MLPs, comprises two sequential distributed control agents: a slow NEMPC for the cooling-dominant subsystem with slower transient responses and a fast NEMPC for the power-dominant subsystem with faster responses. Extensive simulations demonstrate that the developed scheduler and NEMPC schemes outperform their respective benchmark scheduler and controller by about 25% and 40%. Together, they enhance overall system performance by over 70% compared to benchmark approaches.
Transient-Safe and Attack-Resilient Secondary Control in AC Microgrids Under Polynomially Unbounded FDI Attacks
This letter proposes novel, fully distributed, transient-safe resilient secondary control strategies for AC microgrids, addressing unbounded false data injection (FDI) attacks on control input channels. Unlike existing methods that focus primarily on steady-state convergence, our approach guarantees transient safety, ensuring that system states remain within predefined safety bounds even during attack initiation, a critical aspect overlooked in prior research. Given the reduction of network inertia caused by the increasing penetration of inverter-based renewables, large overshooting and intense fluctuations are more likely to occur during transients caused by disturbances and cyber-attacks. To mitigate these risks, the proposed control method enhances defense capabilities against polynomially unbounded FDI attacks, maintaining safe system trajectories for both frequency and voltage throughout the transient response. Through rigorous Lyapunov-based stability analysis, we formally certify that the strategies achieve uniformly ultimately bounded (UUB) convergence in frequency and voltage regulation, and active power sharing across multi-inverter-based AC microgrids. Numerical simulation studies verify the effectiveness of the proposed control protocols, demonstrating improved system reliability, safety and resilience under adverse conditions.
A Universal Formulation for Path-Parametric Planning and Control
This work presents a unified framework for path-parametric planning and control. This formulation is universal as it standardizes the entire spectrum of path-parametric techniques -- from traditional path following to more recent contouring or progress-maximizing Model Predictive Control and Reinforcement Learning -- under a single framework. The ingredients underlying this universality are twofold: First, we present a compact and efficient technique capable of computing singularity-free, smooth and differentiable moving frames. Second, we derive a spatial path parameterization of the Cartesian coordinates applicable to any arbitrary curve without prior assumptions on its parametric speed or moving frame, and that perfectly interplays with the aforementioned path parameterization method. The combination of these two ingredients leads to a planning and control framework that brings together existing path-parametric techniques in the literature. Aiming to unify all these approaches, we open source PACOR, a software library that implements the presented content, thereby providing a self-contained toolkit for the formulation of path-parametric planning and control methods.
comment: Preprint. Code: https://github.com/jonarriza96/PACOR
Path Planning and Robust Path Tracking Control of an Automated Parallel Parking Maneuver
Self driving vehicles should be able to perform parallel parking or a similar maneuver successfully. With this motivation, the S shaped maneuverability test of the Ohio driver license examination is chosen here for automatic execution by a self driving vehicle with drive by wire capability and longitudinal and lateral controls. The Ohio maneuverability test requires the driver to start within an area enclosed by four pylons and the driver is asked to go to the left of the fifth pylon directly in front of the vehicle in a smooth and continuous manner while ending in a parallel direction to the initial one. The driver is then asked to go backwards to the starting location of the vehicle without stopping the vehicle or hitting the pylons. As a self driving vehicle should do a much better job repeatably than a driver, a high order polynomial path model is built along with speed profiling to start and stop smoothly at the ends of the path without large longitudinal and lateral accelerations. In contrast to the long horizon, higher speed path planning and path tracking control applications in the literature, this paper treats low speed and very short horizon path planning and path tracking control with stopping and direction reversal. The path is constructed using a segmented polynomial fit optimization routine that guarantees path curvature smoothness. A linear path tracking model is utilized as the basis of the designed control system consisting of a disturbance observer based curvature rejection filter and a speed scheduled, parameter space robust PID controller. Simulation studies indicate that it has better performance compared to other common control systems such as standalone PID controller and combined PID and feedforward control.
comment: 12 pages, 19 figures
Structural Constraints for Physics-augmented Learning
When the physics is wrong, physics-informed machine learning becomes physics-misinformed machine learning. A powerful black-box model should not be able to conceal misconceived physics. We propose two criteria that can be used to assert the integrity of a hybrid (physics plus black-box) model: 0) the black-box model should be unable to replicate the physical model, and 1) any best-fit hybrid model has the same physical parameter as a best-fit standalone physics model. We demonstrate them for a sample nonlinear mechanical system approximated by its small-signal linearization.
Nonlinear High-Pass Filters
Linear high-pass phenomena matter in signal processing, circuits, and control. In nonlinear systems, however, there is no working definition of high-pass behavior. Any definition would have to agree with the existing theory on linear systems and offer concrete benefits for nonlinear systems above and beyond existing nonlinear theory. To satisfy these two requirements, we propose the following definition: a nonlinear input-output system is high-pass if its output is stable with respect to the derivative of the input. We first show that this definition generalizes high-pass resistor-capacitor circuit analysis to accommodate nonlinear resistors. We then show that this definition generalizes the steady-state disturbance rejection property of integral feedback controllers for linear systems. The theoretical payoff is that low-frequency disturbance rejection is captured by a quantitative, non-asymptotic output cost bound. Finally, we raise theoretical questions about compositionality and noncommutativity of nonlinear operators.
comment: preprint submitted to ACC 2025
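To make the "output driven by the derivative of the input" idea concrete, here is a minimal sketch of the familiar linear series-capacitor, shunt-resistor high-pass circuit, whose output voltage y across the resistor satisfies dy/dt = du/dt - y/(RC). It covers only the linear baseline, not the nonlinear-resistor case treated in the paper; the forward-Euler discretization, function name, and parameter values are illustrative assumptions.

```python
def rc_high_pass(inputs, dt, rc, y0=0.0):
    """Simulate a linear RC high-pass filter with forward Euler.

    Continuous-time model: dy/dt = du/dt - y / rc, where u is the input
    voltage and y is the voltage across the resistor. The input enters only
    through its increments (a discrete derivative), so a constant input
    produces an output that decays to zero.
    """
    outputs = [y0]
    y = y0
    for u_prev, u_next in zip(inputs, inputs[1:]):
        y = y + (u_next - u_prev) - dt * y / rc
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    dt, rc = 0.001, 0.1  # hypothetical time step (s) and time constant (s)
    constant = [1.0] * 1001
    ramp = [k * dt for k in range(1001)]
    print(rc_high_pass(constant, dt, rc, y0=1.0)[-1])  # decays toward 0
    print(rc_high_pass(ramp, dt, rc)[-1])              # settles near rc * slope
```

For a unit-slope ramp the output settles near `rc`, so the output stays bounded whenever the input derivative is bounded, even though the input itself grows without bound.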
Propeller damage detection, classification and estimation in multirotor vehicles
This manuscript details an architecture and training methodology for a data-driven framework aimed at detecting, identifying, and quantifying damage in the propeller blades of multirotor Unmanned Aerial Vehicles. By substituting one propeller with a damaged counterpart (encompassing three distinct damage types of varying severity), real flight data was collected. This data was then used to train a composite model, comprising both classifiers and neural networks, capable of accurately identifying the type of failure, estimating damage severity, and pinpointing the affected rotor. The data employed for this analysis was exclusively sourced from inertial measurements and control command inputs, ensuring adaptability across diverse multirotor vehicle platforms.
comment: 24 pages, 18 figures, 9 tables
Modeling Buffer Occupancy in bittide Systems
The bittide mechanism enables logically synchronous computation across distributed systems by leveraging the continuous frame transmission inherent to wired networks such as Ethernet. Instead of relying on a global clock, bittide uses a decentralized control system to adjust local clock frequencies, ensuring all nodes operate with a consistent notion of time by utilizing elastic buffers at each node to absorb frequency variations. This paper presents an analysis of the steady-state occupancy of these elastic buffers, a critical factor influencing system latency. Using a fluid model of the bittide system, we prove that buffer occupancy converges and derive an explicit formula for the steady-state value in terms of system parameters, including network topology, physical latencies, and controller gains. This analysis provides valuable insights for optimizing buffer sizes and minimizing latency in bittide-based distributed systems.
Synthesizing Interpretable Control Policies through Large Language Model Guided Search
The combination of Large Language Models (LLMs), systematic evaluation, and evolutionary algorithms has enabled breakthroughs in combinatorial optimization and scientific discovery. We propose to extend this powerful combination to the control of dynamical systems, generating interpretable control policies capable of complex behaviors. With our novel method, we represent control policies as programs in standard languages like Python. We evaluate candidate controllers in simulation and evolve them using a pre-trained LLM. Unlike conventional learning-based control techniques, which rely on black box neural networks to encode control policies, our approach enhances transparency and interpretability. We still take advantage of the power of large AI models, but leverage it at the policy design phase, ensuring that all system components remain interpretable and easily verifiable at runtime. Additionally, the use of standard programming languages makes it straightforward for humans to finetune or adapt the controllers based on their expertise and intuition. We illustrate our method through its application to the synthesis of an interpretable control policy for the pendulum swing-up and the ball in cup tasks. We make the code available at https://github.com/muellerlab/synthesizing_interpretable_control_policies.git
comment: 8 pages, 7 figures, conference paper
Evaluating internal and external dissonance of belief dynamics in social systems
Belief dynamics are fundamental to human behavior and social coordination. Individuals rely on accurate beliefs to make decisions, and shared beliefs form the basis of successful cooperation. Traditional studies often examined beliefs in isolation, but recent perspectives suggest beliefs operate as interconnected systems, both within individuals and across social networks. To better understand belief dynamics, we propose an extension of Galesic et al.'s model, which allows individuals to weigh internal and social dissonance based on belief certainty. Our model suggests that belief convergence occurs in two patterns: internal alignment, where beliefs become ideologically consistent but socially disagreeable, or social alignment, where beliefs become socially consistent but internally varied. These results highlight a competition between internal and social belief networks, with one network often dominating. Our findings suggest that belief dynamics tend to settle at extremes, indicating a need for future models to incorporate negative feedback to reflect more nuanced societal belief changes.
comment: 2 pages, 3 figures, conference
Learning-Based Shielding for Safe Autonomy under Unknown Dynamics
Shielding is a common method used to guarantee the safety of a system under a black-box controller, such as a neural network controller from deep reinforcement learning (DRL), with simpler, verified controllers. Existing shielding methods rely on formal verification through Markov Decision Processes (MDPs), assuming either known or finite-state models, which limits their applicability to DRL settings with unknown, continuous-state systems. This paper addresses these limitations by proposing a data-driven shielding methodology that guarantees safety for unknown systems under black-box controllers. The approach leverages Deep Kernel Learning to model the systems' one-step evolution with uncertainty quantification and constructs a finite-state abstraction as an Interval MDP (IMDP). By focusing on safety properties expressed in safe linear temporal logic (safe LTL), we develop an algorithm that computes the maximally permissive set of safe policies on the IMDP, ensuring avoidance of unsafe states. The algorithm's soundness and computational complexity are demonstrated through theoretical proofs and experiments on nonlinear systems, including a high-dimensional autonomous spacecraft scenario.
comment: 8 pages, 3 figures
Who should pay for frequency-containment ancillary services? Making responsible units bear the cost to shape investment in generation and loads
While the operating cost of electricity grids based on thermal generation was largely driven by the cost of fuel, as renewable penetration increases, ancillary services represent an increasingly large proportion of the running costs. Electric frequency is an important magnitude in highly renewable grids, as it becomes more volatile and therefore the cost related to maintaining it within safe bounds has significantly increased. So far, costs for frequency-containment ancillary services have been socialised in most countries, but it has become relevant to rethink this regulatory arrangement. In this paper, we discuss the issue of cost allocation for these services, highlighting the need to evolve towards a causation-based regulatory framework. We argue that parties responsible for creating the need for ancillary services should bear these costs. However, this would imply an important change in electricity market policy, therefore it is necessary to understand the impact on current and future investments on generation, as well as on electricity tariffs. Here we provide a mostly qualitative analysis of this issue, defining guidelines for practical implementation and further study.
comment: Published in journal Energy Policy
Integrated Optimal Fast Charging and Active Thermal Management of Lithium-Ion Batteries in Extreme Ambient Temperatures
This paper presents an integrated control strategy for optimal fast charging and active thermal management of Lithium-ion batteries in extreme ambient temperatures, striking a balance between charging speed and battery health. A control-oriented thermal-NDC (nonlinear double-capacitor) battery model is proposed to describe the electrical and thermal dynamics, incorporating the effects of both an active thermal source and ambient temperature. A state-feedback model predictive control algorithm is then developed for optimal fast charging and active thermal management. Numerical experiments validate the algorithm under extreme temperatures, showing that the proposed algorithm can energy-efficiently adjust the battery temperature, thereby balancing charging speed and battery health. Additionally, an output-feedback model predictive control algorithm with an extended Kalman filter is proposed for battery charging when states are partially measurable. Numerical experiments validate the effectiveness under extreme temperatures.
Barycentric rational approximation for learning the index of a dynamical system from limited data
We consider the task of data-driven identification of dynamical systems, specifically for systems whose behavior at large frequencies is non-standard, as encoded by a non-trivial relative degree of the transfer function or, alternatively, a non-trivial index of a corresponding realization as a descriptor system. We develop novel surrogate modeling strategies that allow state-of-the-art rational approximation algorithms (e.g., AAA and vector fitting) to better handle data coming from such systems with non-trivial relative degree. Our contribution is twofold. On one hand, we describe a strategy to build rational surrogate models with prescribed relative degree, with the objective of mirroring the high-frequency behavior of the high-fidelity problem, when known. The surrogate model's desired degree is achieved through constraints on its barycentric coefficients, rather than through ad-hoc modifications of the rational form. On the other hand, we present a degree-identification routine that allows one to estimate the unknown relative degree of a system from low-frequency data. By identifying the degree of the system that generated the data, we can build a surrogate model that, in addition to matching the data well (at low frequencies), has enhanced extrapolation capabilities (at high frequencies). We showcase the effectiveness and robustness of the newly proposed method through a suite of numerical tests.
comment: 20 pages, 5 figures
An active learning method for solving competitive multi-agent decision-making and control problems
To identify a stationary action profile for a population of competitive agents, each executing private strategies, we introduce a novel active-learning scheme where a centralized external observer (or entity) can probe the agents' reactions and recursively update simple local parametric estimates of the action-reaction mappings. Under very general working assumptions (not even assuming that a stationary profile exists), sufficient conditions are established to assess the asymptotic properties of the proposed active learning methodology so that, if the parameters characterizing the action-reaction mappings converge, a stationary action profile is achieved. Such conditions hence act also as certificates for the existence of such a profile. Extensive numerical simulations involving typical competitive multi-agent control and decision-making problems illustrate the practical effectiveness of the proposed learning-based approach.
comment: Python package available at https://github.com/bemporad/gnep-learn
CBF-LLM: Safe Control for LLM Alignment
This paper proposes a control-based framework for aligning large language models (LLMs) by leveraging a control barrier function (CBF) to ensure user-desirable text generation. The presented framework applies a safety filter, designed based on the CBF, to the output of the baseline LLM, i.e., the token sequence, with the aim of intervening in the generated text. The overall text-generation system is implemented with Llama 3 and a RoBERTa model, and the source code is available at https://github.com/Mya-Mya/CBF-LLM. The experiment demonstrates its control ability and effectiveness in reducing the number of interventions needed for user-specified alignment tasks.
Auto-Multilift: Distributed Learning and Control for Cooperative Load Transportation With Quadrotors
Designing motion control and planning algorithms for multilift systems remains challenging due to the complexities of dynamics, collision avoidance, actuator limits, and scalability. Existing methods that use optimization and distributed techniques effectively address these constraints and scalability issues. However, they often require substantial manual tuning, leading to suboptimal performance. This paper proposes Auto-Multilift, a novel framework that automates the tuning of model predictive controllers (MPCs) for multilift systems. We model the MPC cost functions with deep neural networks (DNNs), enabling fast online adaptation to various scenarios. We develop a distributed policy gradient algorithm to train these DNNs efficiently in a closed-loop manner. Central to our algorithm is distributed sensitivity propagation, which is built on fully exploiting the unique dynamic couplings within the multilift system. It parallelizes gradient computation across quadrotors and focuses on actual system state sensitivities relative to key MPC parameters. Extensive simulations demonstrate favorable scalability to a large number of quadrotors. Our method outperforms a state-of-the-art open-loop MPC tuning approach by effectively learning adaptive MPCs from trajectory tracking errors. It also excels in learning an adaptive reference for reconfiguring the system when traversing multiple narrow slots.
Efficient Shield Synthesis via State-Space Transformation
We consider the problem of synthesizing safety strategies for control systems, also known as shields. Since the state space is infinite, shields are typically computed over a finite-state abstraction, with the most common abstraction being a rectangular grid. However, for many systems, such a grid does not align well with the safety property or the system dynamics. As a result, a coarse grid is rarely sufficient, while a fine grid is typically computationally infeasible to obtain. In this paper, we show that appropriate state-space transformations can still allow the use of a coarse grid at almost no computational overhead. We demonstrate in three case studies that our transformation-based synthesis outperforms a standard synthesis by several orders of magnitude. In the first two case studies, we use domain knowledge to select a suitable transformation. In the third case study, we instead report on results in engineering a transformation without domain knowledge.
A Moreau Envelope Approach for LQR Meta-Policy Estimation
We study the problem of policy estimation for the Linear Quadratic Regulator (LQR) in discrete-time linear time-invariant uncertain dynamical systems. We propose a Moreau Envelope-based surrogate LQR cost, built from a finite set of realizations of the uncertain system, to define a meta-policy efficiently adjustable to new realizations. Moreover, we design an algorithm to find an approximate first-order stationary point of the meta-LQR cost function. Numerical results show that the proposed approach outperforms naive averaging of controllers on new realizations of the linear system. We also provide empirical evidence that our method has better sample complexity than Model-Agnostic Meta-Learning (MAML) approaches.
comment: Accepted for presentation at Conference on Decision and Control 2024 (CDC'24)
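For readers unfamiliar with the term, the standard definition of the Moreau envelope is given below, followed by one plausible way it could be averaged over system realizations. The second expression is an illustrative assumption, not the surrogate cost stated by the authors: the symbols $J_i$ (LQR cost of the $i$-th realization), $K$ (meta-policy gain), $M$ (number of realizations), and $\lambda$ (regularization parameter) are introduced here only for illustration.

```latex
% Standard Moreau envelope of a function f with parameter \lambda > 0
M_{\lambda} f(x) \;=\; \min_{y} \Big\{ f(y) + \tfrac{1}{2\lambda}\,\lVert y - x \rVert^{2} \Big\}

% Hypothetical meta-cost: average of per-realization envelopes (assumed form)
F_{\lambda}(K) \;=\; \frac{1}{M} \sum_{i=1}^{M} \min_{K_i}
  \Big\{ J_i(K_i) + \tfrac{1}{2\lambda}\,\lVert K_i - K \rVert_F^{2} \Big\}
```

Under this reading, a small $\lambda$ keeps each adapted gain $K_i$ close to the shared gain $K$, while a large $\lambda$ lets each realization move further toward its own optimum.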
Adaptive Step Duration for Precise Foot Placement: Achieving Robust Bipedal Locomotion on Terrains with Restricted Footholds ICRA 2025
Traditional one-step preview planning algorithms for bipedal locomotion struggle to generate viable gaits when walking across terrains with restricted footholds, such as stepping stones. To overcome such limitations, this paper introduces a novel multi-step preview foot placement planning algorithm based on the step-to-step discrete evolution of the Divergent Component of Motion (DCM) of walking robots. Our proposed approach adaptively changes the step duration and the swing foot trajectory for optimal foot placement under constraints, thereby enhancing the long-term stability of the robot and significantly improving its ability to navigate environments with tight constraints on viable footholds. We demonstrate its effectiveness through various simulation scenarios with complex stepping-stone configurations and external perturbations. These tests underscore its improved performance for navigating foothold-restricted terrains, even with external disturbances.
comment: 7 pages, 7 figures, submitted to ICRA 2025, for associated simulation video, see https://youtu.be/DjH69m1kbnM
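The step-to-step DCM evolution mentioned in the abstract above can be illustrated with the textbook linear inverted pendulum relation, in which the DCM diverges exponentially from the center of pressure over a step. The sketch below is a minimal one-dimensional illustration under that assumption, not the authors' planner; the function name and parameters are hypothetical.

```python
import math


def dcm_at_step_end(xi0, cop, step_duration, com_height, g=9.81):
    """Propagate a 1-D DCM over one step under the linear inverted pendulum model.

    xi0: DCM at the start of the step [m]
    cop: center of pressure, assumed constant during the step [m]
    step_duration: duration of the step [s]
    com_height: constant center-of-mass height [m]
    """
    omega = math.sqrt(g / com_height)  # natural frequency of the pendulum
    # Closed-form solution of d(xi)/dt = omega * (xi - cop)
    return cop + (xi0 - cop) * math.exp(omega * step_duration)


# A longer step lets the DCM drift further from the stance foot.
short_step = dcm_at_step_end(0.05, 0.0, 0.2, 0.8)
long_step = dcm_at_step_end(0.05, 0.0, 0.4, 0.8)
```

Because the final DCM depends exponentially on the step duration, treating the duration as a decision variable gives a planner an extra degree of freedom when the set of admissible footholds is small.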
Koopman Analysis of the Singularly-Perturbed van der Pol Oscillator
The Koopman operator framework holds promise for spectral analysis of nonlinear dynamical systems based on linear operators. Eigenvalues and eigenfunctions of the Koopman operator, so-called Koopman eigenvalues and Koopman eigenfunctions, respectively, mirror global properties of the system's flow. In this paper we perform the Koopman analysis of the singularly-perturbed van der Pol system. First, we show the spectral signature depending on singular perturbation: how two Koopman principal eigenvalues are ordered and what distinct shapes emerge in their associated Koopman eigenfunctions. Second, we discuss the singular limit of the Koopman operator, which is derived through the concatenation of Koopman operators for the fast and slow subsystems. From the spectral properties of the Koopman operator for the singularly-perturbed system and the singular limit, we suggest that the Koopman eigenfunctions inherit geometric properties of the singularly-perturbed system. These results are applicable to general planar singularly-perturbed systems with stable limit cycles.
comment: 21 pages, 10 figures
Risk of Cascading Collisions in Network of Vehicles with Delayed Communication
This paper establishes and explores a framework to analyze the risk of cascading failures in a platoon of autonomous vehicles, accounting for communication time-delays and input uncertainty. Our proposed framework yields closed-form expressions for cascading collisions, which we quantify using the coherent Average Value-at-Risk (AV@R) to assess the cascading effect of vehicle collisions within the platoon. We investigate how factors such as network connectivity, system dynamics, communication delays, and uncertainty contribute to the emergence of cascading failures. Our findings are extended to standard communication graphs with symmetries, allowing us to evaluate the risk of cascading collisions from a platoon design perspective. Furthermore, by discovering the boundedness of the inter-vehicle distances, we reveal the best achievable risk of cascading collision with general graph topologies, which is further specified for special communication graphs, such as the complete graph. Our theoretical results pave the way for the development of a safety-aware framework aimed at mitigating the risk of cascading collisions in vehicle platoons.
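The abstract above reports closed-form risk expressions. As a rough intuition for the Average Value-at-Risk measure itself, the sketch below estimates it empirically as the mean of the worst tail of a set of sampled losses. This sample-based estimator, its tail convention, and the function name are assumptions made for illustration and are not taken from the paper.

```python
import math


def empirical_avar(losses, alpha):
    """Estimate Average Value-at-Risk as the mean of the worst (1 - alpha) tail.

    losses: iterable of sampled losses (larger means worse)
    alpha: confidence level in [0, 1); alpha = 0 returns the plain mean
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(losses, reverse=True)
    if not ordered:
        raise ValueError("losses must be non-empty")
    # Round before ceil to guard against floating-point noise in (1 - alpha) * n.
    tail_size = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    return sum(ordered[:tail_size]) / tail_size


samples = list(range(1, 11))  # toy losses 1..10
tail_risk = empirical_avar(samples, 0.8)  # mean of the two largest losses
```

Unlike a plain quantile, this tail average reflects how severe the worst outcomes are, not just how often they occur.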
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
Reach-Avoid-Stay (RAS) optimal control enables systems such as robots and air taxis to reach their targets, avoid obstacles, and stay near the target. However, current methods for RAS often struggle with handling complex, dynamic environments and scaling to high-dimensional systems. While reinforcement learning (RL)-based reachability analysis addresses these challenges, it has yet to tackle the RAS problem. In this paper, we propose a two-step deep deterministic policy gradient (DDPG) method to extend RL-based reachability method to solve RAS problems. First, we train a function that characterizes the maximal robust control invariant set within the target set, where the system can safely stay, along with its corresponding policy. Second, we train a function that defines the set of states capable of safely reaching the robust control invariant set, along with its corresponding policy. We prove that this method results in the maximal robust RAS set in the absence of training errors and demonstrate that it enables RAS in complex environments, scales to high-dimensional systems, and achieves higher success rates for the RAS task compared to previous methods, validated through one simulation and two high-dimensional experiments.
Robotics
Mode-GS: Monocular Depth Guided Anchored 3D Gaussian Splatting for Robust Ground-View Scene Rendering
We present a novel-view rendering algorithm, Mode-GS, for ground-robot trajectory datasets. Our approach is based on using anchored Gaussian splats, which are designed to overcome the limitations of existing 3D Gaussian splatting algorithms. Prior neural rendering methods suffer from severe splat drift due to scene complexity and insufficient multi-view observation, and can fail to fix splats on the true geometry in ground-robot datasets. Our method integrates pixel-aligned anchors from monocular depths and generates Gaussian splats around these anchors using residual-form Gaussian decoders. To address the inherent scale ambiguity of monocular depth, we parameterize anchors with per-view depth-scales and employ scale-consistent depth loss for online scale calibration. Our method results in improved rendering performance, based on PSNR, SSIM, and LPIPS metrics, in ground scenes with free trajectory patterns, and achieves state-of-the-art rendering performance on the R3LIVE odometry dataset and the Tanks and Temples dataset.
Unpacking Failure Modes of Generative Policies: Runtime Monitoring of Consistency and Progress
Robot behavior policies trained via imitation learning are prone to failure under conditions that deviate from their training data. Thus, algorithms that monitor learned policies at test time and provide early warnings of failure are necessary to facilitate scalable deployment. We propose Sentinel, a runtime monitoring framework that splits the detection of failures into two complementary categories: 1) Erratic failures, which we detect using statistical measures of temporal action consistency, and 2) task progression failures, where we use Vision Language Models (VLMs) to detect when the policy confidently and consistently takes actions that do not solve the task. Our approach has two key strengths. First, because learned policies exhibit diverse failure modes, combining complementary detectors leads to significantly higher accuracy at failure detection. Second, using a statistical temporal action consistency measure ensures that we quickly detect when multimodal, generative policies exhibit erratic behavior at negligible computational cost. In contrast, we only use VLMs to detect failure modes that are less time-sensitive. We demonstrate our approach in the context of diffusion policies trained on robotic mobile manipulation domains in both simulation and the real world. By unifying temporal consistency detection and VLM runtime monitoring, Sentinel detects 18% more failures than using either of the two detectors alone and significantly outperforms baselines, thus highlighting the importance of assigning specialized detectors to complementary categories of failure. Qualitative results are made available at https://sites.google.com/stanford.edu/sentinel.
comment: Project page: https://sites.google.com/stanford.edu/sentinel . 35 pages, 9 figures. Accepted to the Conference on Robot Learning (CoRL) 2024
Admissibility Over Winning: A New Approach to Reactive Synthesis in Robotics
Reactive synthesis is a framework for modeling and automatically synthesizing strategies in robotics, typically through computing a winning strategy in a 2-player game between the robot and the environment. Winning strategies, however, do not always exist, even in some simple cases. In such situations, it is still desirable for the robot to attempt its task rather than "giving up". In this work, we explore the notion of admissibility to define strategies beyond winning, tailored specifically for robotic systems. We introduce an ordering of admissible strategies and define admissibly rational strategies, which aim to be winning and cooperative when possible, and non-violating and hopeful when necessary. We present an efficient synthesis algorithm and demonstrate that admissibly rational strategies produce desirable behaviors through case studies.
comment: Preprint. Under Review
Distributed Detection of Adversarial Attacks for Resilient Cooperation of Multi-Robot Systems with Intermittent Communication
This paper concerns the consensus and formation of a network of mobile autonomous agents in adversarial settings where a group of malicious (compromised) agents are subject to deception attacks. In addition, the communication network is arbitrarily time-varying and subject to intermittent connections, possibly imposed by denial-of-service (DoS) attacks. We provide explicit bounds for network connectivity in an integral sense, enabling the characterization of the system's resilience to specific classes of adversarial attacks. We also show that under the condition of connectivity in an integral sense uniformly in time, the system is finite-gain $\mathcal{L}_{p}$ stable and uniformly exponentially fast consensus and formation are achievable, provided malicious agents are detected and isolated from the network. We present a distributed and reconfigurable framework with theoretical guarantees for detecting malicious agents, allowing for the resilient cooperation of the remaining cooperative agents. Simulation studies are provided to illustrate the theoretical findings.
comment: to be published in IEEE
Multi-LED Classification as Pretext For Robot Heading Estimation ICRA
We propose a self-supervised approach for visual robot detection and heading estimation by learning to estimate the states (OFF or ON) of four independent robot-mounted LEDs. Experimental results show a median image-space position error of 14 px and relative heading MAE of 17 degrees, versus a supervised upper bound scoring 10 px and 8 degrees, respectively.
comment: Accepted and presented at ICRA@40
LiteVLoc: Map-Lite Visual Localization for Image Goal Navigation
This paper presents LiteVLoc, a hierarchical visual localization framework that uses a lightweight topo-metric map to represent the environment. The method consists of three sequential modules that estimate camera poses in a coarse-to-fine manner. Unlike mainstream approaches relying on detailed 3D representations, LiteVLoc reduces storage overhead by leveraging learning-based feature matching and geometric solvers for metric pose estimation. A novel dataset for the map-free relocalization task is also introduced. Extensive experiments, including localization and navigation in both simulated and real-world scenarios, validate the system's performance and demonstrate its precision and efficiency for large-scale deployment. Code and data will be made publicly available.
comment: 8 pages, 4 figures
A physics-based sensor simulation environment for lunar ground operations
This contribution reports on a software framework that uses physically-based rendering to simulate camera operation in lunar conditions. The focus is on generating synthetic images qualitatively similar to those produced by an actual camera operating on a vehicle traversing and/or actively interacting with lunar terrain, e.g., for construction operations. The highlights of this simulator are its ability to capture (i) light transport in lunar conditions and (ii) artifacts related to the vehicle-terrain interaction, which might include dust formation and transport. The simulation infrastructure is built within an in-house developed physics engine called Chrono, which simulates the dynamics of the deformable terrain-vehicle interaction, as well as fallout of this interaction. The Chrono::Sensor camera model draws on ray tracing and Hapke Photometric Functions. We analyze the performance of the simulator using two virtual experiments featuring digital twins of NASA's VIPER rover navigating a lunar environment, and of NASA's RASSOR excavator engaged in a digging operation. The sensor simulation solution presented can be used for the design and testing of perception algorithms, or as a component of in-silico experiments that pertain to large lunar operations, e.g., traversability, construction tasks.
comment: 19 pages, 20 figures, 3 tables. This work has been submitted to the 2025 IEEE Aerospace Conference for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
DABI: Evaluation of Data Augmentation Methods Using Downsampling in Bilateral Control-Based Imitation Learning with Images
Autonomous robot manipulation is a complex and continuously evolving robotics field. This paper focuses on data augmentation methods in imitation learning. Imitation learning consists of three stages: data collection from experts, learning model, and execution. However, collecting expert data requires manual effort and is time-consuming. Additionally, as sensors have different data acquisition intervals, preprocessing such as downsampling to match the lowest frequency is necessary. Downsampling enables data augmentation and also contributes to the stabilization of robot operations. In light of this background, this paper proposes the Data Augmentation Method for Bilateral Control-Based Imitation Learning with Images, called "DABI". DABI collects robot joint angles, velocities, and torques at 1000 Hz, and uses images from gripper and environmental cameras captured at 100 Hz as the basis for data augmentation. This enables a tenfold increase in data. In this paper, we collected just 5 expert demonstration datasets. We trained the bilateral control Bi-ACT model with the unaltered dataset and two augmentation methods for comparative experiments and conducted real-world experiments. The results confirmed a significant improvement in success rates, thereby proving the effectiveness of DABI. For additional material, please check https://mertcookimg.github.io/dabi
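The tenfold increase mentioned in the abstract above follows from the ratio between the 1000 Hz robot signals and the 100 Hz images: each starting offset in the high-rate stream yields a distinct low-rate sequence. The sketch below shows one plausible reading of that idea. The abstract does not specify the exact scheme, so the function name and the offset-based splitting are assumptions.

```python
import numpy as np


def downsample_with_offsets(signal, factor):
    """Split a high-rate sequence into `factor` phase-shifted low-rate sequences.

    signal: 1-D or 2-D array sampled at the high rate (time along axis 0)
    factor: ratio between the high and low sampling rates (e.g. 1000 Hz / 100 Hz = 10)
    """
    signal = np.asarray(signal)
    usable = (len(signal) // factor) * factor  # drop the incomplete trailing block
    # One sub-sequence per starting offset: 0, 1, ..., factor - 1
    return [signal[offset:usable:factor] for offset in range(factor)]


# One second of toy joint-angle samples at 1000 Hz, downsampled to 100 Hz.
joint_angles = np.arange(1000)
augmented = downsample_with_offsets(joint_angles, 10)
```

Each of the ten resulting sequences has the same 100 Hz rate as the camera stream, so one demonstration yields ten slightly shifted training sequences.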
KISS-Matcher: Fast and Robust Point Cloud Registration Revisited
While global point cloud registration systems have advanced significantly in all aspects, many studies have focused on specific components, such as feature extraction, graph-theoretic pruning, or pose solvers. In this paper, we take a holistic view on the registration problem and develop an open-source and versatile C++ library for point cloud registration, called KISS-Matcher. KISS-Matcher combines a novel feature detector, Faster-PFH, that improves over the classical fast point feature histogram (FPFH). Moreover, it adopts a $k$-core-based graph-theoretic pruning to reduce the time complexity of rejecting outlier correspondences. Finally, it combines these modules in a complete, user-friendly, and ready-to-use pipeline. As verified by extensive experiments, KISS-Matcher has superior scalability and broad applicability, achieving a substantial speed-up compared to state-of-the-art outlier-robust registration pipelines while preserving accuracy. Our code will be available at https://github.com/MIT-SPARK/KISS-Matcher.
comment: 9 pages, 9 figures
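The $k$-core pruning mentioned in the abstract above can be pictured on a compatibility graph whose nodes are candidate correspondences and whose edges link pairs that are mutually consistent. The sketch below computes a $k$-core by repeatedly peeling low-degree nodes. It is a generic textbook routine in Python, not the library's C++ implementation, and the graph construction is assumed for illustration.

```python
def k_core(edges, k):
    """Return the node set of the k-core of an undirected graph.

    edges: iterable of (u, v) pairs
    k: minimum degree every surviving node must have within the core
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    removed = set()
    stack = [node for node, nbrs in adjacency.items() if len(nbrs) < k]
    while stack:
        node = stack.pop()
        if node in removed:
            continue
        removed.add(node)
        # Removing a node lowers the degree of its remaining neighbours.
        for nbr in adjacency[node]:
            if nbr not in removed:
                adjacency[nbr].discard(node)
                if len(adjacency[nbr]) < k:
                    stack.append(nbr)
    return set(adjacency) - removed


# Triangle of mutually consistent correspondences plus one weakly connected outlier.
compatibility_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
core = k_core(compatibility_edges, 2)
```

Peeling touches each edge a bounded number of times, so its cost grows roughly linearly with graph size, which is what makes this kind of pruning attractive as a pre-filter ahead of a more expensive solver.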
The AEIF Data Collection: A Dataset for Infrastructure-Supported Perception Research with Focus on Public Transportation
In this paper, we present our vision and ongoing work for a novel dataset designed to advance research into the interoperability of intelligent vehicles and infrastructure, specifically aimed at enhancing cooperative perception and interaction in the realm of public transportation. Unlike conventional datasets centered on ego-vehicle data, this approach encompasses both a stationary sensor tower and a moving vehicle, each equipped with cameras, LiDARs, and GNSS, while the vehicle additionally includes an inertial navigation system. Our setup features comprehensive calibration and time synchronization, ensuring seamless and accurate sensor data fusion crucial for studying complex, dynamic scenes. Emphasizing public transportation, the dataset aims to include scenes like bus station maneuvers and driving on dedicated bus lanes, reflecting the specifics of small public buses. We introduce the open-source ".4mse" file format for the new dataset, accompanied by a research kit. This kit provides tools such as ego-motion compensation or LiDAR-to-camera projection, enabling advanced research on intelligent vehicle-infrastructure integration. Our approach does not include annotations; however, we plan to implement automatically generated labels sourced from state-of-the-art public repositories. Several aspects are still up for discussion, and timely feedback from the community would be greatly appreciated. A sneak preview of one data frame will be available in a Google Colab Notebook. Moreover, we will use the related GitHub Repository to collect remarks and suggestions.
System-Level Safety Monitoring and Recovery for Perception Failures in Autonomous Vehicles
The safety-critical nature of autonomous vehicle (AV) operation necessitates development of task-relevant algorithms that can reason about safety at the system level and not just at the component level. To reason about the impact of a perception failure on the entire system performance, such task-relevant algorithms must contend with various challenges: complexity of AV stacks, high uncertainty in the operating environments, and the need for real-time performance. To overcome these challenges, in this work, we introduce a Q-network called SPARQ (abbreviation for Safety evaluation for Perception And Recovery Q-network) that evaluates the safety of a plan generated by a planning algorithm, accounting for perception failures that the planning process may have overlooked. This Q-network can be queried during system runtime to assess whether a proposed plan is safe for execution or poses potential safety risks. If a violation is detected, the network can then recommend a corrective plan while accounting for the perceptual failure. We validate our algorithm using the NuPlan-Vegas dataset, demonstrating its ability to handle cases where a perception failure compromises a proposed plan while the corrective plan remains safe. We observe an overall accuracy and recall of 90% while sustaining a frequency of 42Hz on the unseen testing dataset. We compare our performance to a popular reachability-based baseline and analyze some interesting properties of our approach in improving the safety properties of an AV pipeline.
Partially Observable Task and Motion Planning with Uncertainty and Risk Awareness
Integrated task and motion planning (TAMP) has proven to be a valuable approach to generalizable long-horizon robotic manipulation and navigation problems. However, the typical TAMP problem formulation assumes full observability and deterministic action effects. These assumptions limit the ability of the planner to gather information and make decisions that are risk-aware. We propose a strategy for TAMP with Uncertainty and Risk Awareness (TAMPURA) that is capable of efficiently solving long-horizon planning problems with initial-state and action outcome uncertainty, including problems that require information gathering and avoiding undesirable and irreversible outcomes. Our planner reasons under uncertainty at both the abstract task level and continuous controller level. Given a set of closed-loop goal-conditioned controllers operating in the primitive action space and a description of their preconditions and potential capabilities, we learn a high-level abstraction that can be solved efficiently and then refined to continuous actions for execution. We demonstrate our approach on several robotics problems where uncertainty is a crucial factor and show that reasoning under uncertainty in these problems outperforms previously proposed determinized planning, direct search, and reinforcement learning strategies. Lastly, we demonstrate our planner on two real-world robotics problems using recent advancements in probabilistic perception.
Transferable Tactile Transformers for Representation Learning Across Diverse Sensors and Tasks
This paper presents T3: Transferable Tactile Transformers, a framework for tactile representation learning that scales across multi-sensors and multi-tasks. T3 is designed to overcome the contemporary issue that camera-based tactile sensing is extremely heterogeneous, i.e. sensors are built into different form factors, and existing datasets were collected for disparate tasks. T3 captures the shared latent information across different sensor-task pairings by constructing a shared trunk transformer with sensor-specific encoders and task-specific decoders. The pre-training of T3 utilizes a novel Foundation Tactile (FoTa) dataset, which is aggregated from several open-sourced datasets and it contains over 3 million data points gathered from 13 sensors and 11 tasks. FoTa is the largest and most diverse dataset in tactile sensing to date and it is made publicly available in a unified format. Across various sensors and tasks, experiments show that T3 pre-trained with FoTa achieved zero-shot transferability in certain sensor-task pairings, can be further fine-tuned with small amounts of domain-specific data, and its performance scales with bigger network sizes. T3 is also effective as a tactile encoder for long horizon contact-rich manipulation. Results from sub-millimeter multi-pin electronics insertion tasks show that T3 achieved a task success rate 25% higher than that of policies trained with tactile encoders trained from scratch, or 53% higher than without tactile sensing. Data, code, and model checkpoints are open-sourced at https://t3.alanz.info
comment: Accepted to 2024 Conference on Robot Learning (CoRL)
Learning to Estimate the Pose of a Peer Robot in a Camera Image by Predicting the States of its LEDs IROS
We consider the problem of training a fully convolutional network to estimate the relative 6D pose of a robot given a camera image, when the robot is equipped with independently controllable LEDs placed in different parts of its body. The training data is composed of few (or zero) images labeled with a ground truth relative pose and many images labeled only with the true state (on or off) of each of the peer LEDs. The former data is expensive to acquire, requiring external infrastructure for tracking the two robots; the latter is cheap as it can be acquired by two unsupervised robots moving randomly and toggling their LEDs while sharing the true LED states via radio. Training with the latter dataset on estimating the LEDs' state of the peer robot (pretext task) promotes learning the relative localization task (end task). Experiments on real-world data acquired by two autonomous wheeled robots show that a model trained only on the pretext task successfully learns to localize a peer robot on the image plane; fine-tuning such a model on the end task with few labeled images yields statistically significant improvements in 6D relative pose estimation with respect to baselines that do not use pretext-task pre-training and to alternative approaches. Estimating the state of multiple independent LEDs promotes learning to estimate relative heading. The approach works even when a large fraction of training images do not include the peer robot and generalizes well to unseen environments.
comment: Accepted at International Conference on Intelligent Robots and Systems (IROS) 2024
Visual collective behaviors on spherical robots
Implementations of collective motion traditionally disregard the limited sensing capabilities of an individual, instead assuming an omniscient perception of the environment. This study implements a visual flocking model in a "robot-in-the-loop" approach to reproduce these behaviors with a flock composed of 10 independent spherical robots. The model achieves robotic collective motion using only the panoramic visual information of each robot, such as retinal position, optical size and optic flow of the neighboring robots. We introduce a virtual anchor to confine the collective robotic movements so as to avoid wall interactions. For the first time, a simple visual robot-in-the-loop approach succeeds in reproducing several collective motion phases, in particular swarming and milling. Another milestone achieved by this model is bridging the gap between simulation and physical experiments by demonstrating nearly identical behaviors in both environments with the same visual model. To conclude, we show that our minimal visual collective motion model is sufficient to recreate most collective behaviors on a robot-in-the-loop system that is scalable, behaves as numerical simulations predict and is easily comparable to traditional models.
comment: 26 pages, 16 figures, journal bioinspired and biomimetics
SCANet: Correcting LEGO Assembly Errors with Self-Correct Assembly Network
Autonomous assembly in robotics and 3D vision presents significant challenges, particularly in ensuring assembly correctness. Presently, predominant methods such as MEPNet focus on assembling components based on manually provided images. However, these approaches often fall short in achieving satisfactory results for tasks requiring long-term planning. Concurrently, we observe that integrating a self-correction module can partially alleviate such issues. Motivated by this concern, we introduce the single-step assembly error correction task, which involves identifying and rectifying misassembled components. To support research in this area, we present the LEGO Error Correction Assembly Dataset (LEGO-ECA), comprising manual images for assembly steps and instances of assembly failures. Additionally, we propose the Self-Correct Assembly Network (SCANet), a novel method to address this task. SCANet treats assembled components as queries, determining their correctness in manual images and providing corrections when necessary. Finally, we utilize SCANet to correct the assembly results of MEPNet. Experimental results demonstrate that SCANet can identify and correct MEPNet's misassembled results, significantly improving the correctness of assembly. Our code and dataset are available at https://github.com/Yaser-wyx/SCANet.
Comparative Evaluation of Learning Models for Bionic Robots: Non-Linear Transfer Function Identifications
The control and modeling of robot dynamics have increasingly adopted model-free control strategies using machine learning. Given the non-linear elastic nature of bionic robotic systems, learning-based methods provide reliable alternatives by utilizing numerical data to establish a direct mapping from actuation inputs to robot trajectories without complex kinematics models. However, for developers, the method of identifying an appropriate learning model for their specific bionic robots and further constructing the transfer function has not been thoroughly discussed. Thus, this research introduces a comprehensive evaluation strategy and framework for the application of model-free control, including data collection, learning model selection, comparative analysis, and transfer function identification to effectively deal with the multi-input multi-output (MIMO) robotic data.
comment: 12 pages, 21 figures, 1 table
Rethinking 6-Dof Grasp Detection: A Flexible Framework for High-Quality Grasping
Robotic grasping is a primitive skill for complex tasks and is fundamental to intelligence. For general 6-Dof grasping, most previous methods directly extract scene-level semantic or geometric information, while few of them consider the suitability for various downstream applications, such as target-oriented grasping. Addressing this issue, we rethink 6-Dof grasp detection from a grasp-centric view and propose a versatile grasp framework capable of handling both scene-level and target-oriented grasping. Our framework, FlexLoG, is composed of a Flexible Guidance Module and a Local Grasp Model. Specifically, the Flexible Guidance Module is compatible with both global (e.g., grasp heatmap) and local (e.g., visual grounding) guidance, enabling the generation of high-quality grasps across various tasks. The Local Grasp Model focuses on object-agnostic regional points and predicts grasps locally and intently. Experiment results reveal that our framework achieves over 18% and 23% improvement on unseen splits of the GraspNet-1Billion Dataset. Furthermore, real-world robotic tests in three distinct settings yield a 95% success rate.
comment: 8 pages, 8 figures
A Multimedia Framework for Continuum Robots: Systematic, Computational, and Control Perspectives
Continuum robots, which often rely on interdisciplinary and multimedia collaborations, have been increasingly recognized for their potential to revolutionize the field of human-computer interaction (HCI) in varied applications due to their adaptive, responsive, and flexible characteristics. Despite their promise, the lack of an integrated framework poses a significant limitation for both users and developers, resulting in inefficiency and complexity during preliminary development. Thus, this paper introduces a unified framework for continuum robotic systems that addresses these challenges by integrating system architecture, dynamics computation, and control strategy within a computer-aided design (CAD) platform. The proposed method allows for efficient modeling and a quick preview of the robot's performance, thus facilitating iterative design and implementation, with a view to enhancing the quality of robot development.
comment: 9 pages, 10 figures, 1 table
Deep Learning Innovations for Underwater Waste Detection: An In-Depth Analysis
Addressing the issue of submerged underwater trash is crucial for safeguarding aquatic ecosystems and preserving marine life. While identifying debris present on the surface of water bodies is straightforward, assessing submerged waste is a challenge due to the image distortions caused by factors such as light refraction, absorption, suspended particles, color shifts, and occlusion. This paper conducts a comprehensive review of state-of-the-art architectures and existing datasets to establish a baseline for submerged waste and trash detection. The primary goal remains to establish a benchmark of the object localization techniques to be leveraged by advanced underwater sensors and autonomous underwater vehicles. The ultimate objective is to explore the underwater environment and to identify and remove underwater debris. The absence of benchmarks (dataset or algorithm) in many studies emphasizes the need for a more robust algorithmic solution. Through this research, we aim to provide a comparative performance analysis of various underwater trash detection algorithms.
Simplex-enabled Safe Continual Learning Machine
This paper proposes the SeC-Learning Machine: Simplex-enabled safe continual learning for safety-critical autonomous systems. The SeC-learning machine is built on Simplex logic (that is, "using simplicity to control complexity") and physics-regulated deep reinforcement learning (Phy-DRL). The SeC-learning machine thus comprises an HP (high performance)-Student, an HA (high assurance)-Teacher, and a Coordinator. Specifically, the HP-Student is a pre-trained high-performance but not fully verified Phy-DRL, continuing to learn in a real plant to tune the action policy to be safe. In contrast, the HA-Teacher is a mission-reduced, physics-model-based, and verified design. As a complement, the HA-Teacher has two missions: backing up safety and correcting unsafe learning. The Coordinator triggers the interaction and the switch between HP-Student and HA-Teacher. Powered by the three interactive components, the SeC-learning machine can i) assure lifetime safety (i.e., safety guarantee in any continual-learning stage, regardless of HP-Student's success or convergence), ii) address the Sim2Real gap, and iii) learn to tolerate unknown unknowns in real plants. The experiments on a cart-pole system and a real quadruped robot demonstrate the distinguished features of the SeC-learning machine, compared with continual learning built on state-of-the-art safe DRL frameworks with approaches to addressing the Sim2Real gap.
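The switching role of the Coordinator can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state the triggering condition, so the sketch uses a hypothetical safety envelope on the largest state magnitude, with hysteresis so that control returns to the student only once the state is well inside the envelope.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

State = Sequence[float]
Controller = Callable[[State], float]


@dataclass
class Coordinator:
    """Hypothetical Simplex-style switch between two controllers."""
    student: Controller          # high-performance, unverified policy
    teacher: Controller          # verified, model-based fallback
    envelope: float              # assumed safety bound on max |state_i|
    teacher_active: bool = False

    def act(self, state: State) -> Tuple[float, str]:
        margin = max(abs(x) for x in state)
        if margin >= self.envelope:
            # State left the assumed safe set: hand control to the teacher.
            self.teacher_active = True
        elif margin < 0.5 * self.envelope:
            # Well inside the safe set again: return control to the student.
            self.teacher_active = False
        if self.teacher_active:
            return self.teacher(state), "HA-Teacher"
        return self.student(state), "HP-Student"
```

The hysteresis band (half the envelope) is a design choice of this sketch that avoids rapid toggling near the boundary; the paper's Coordinator may use a different rule.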
VoxAct-B: Voxel-Based Acting and Stabilizing Policy for Bimanual Manipulation
Bimanual manipulation is critical to many robotics applications. In contrast to single-arm manipulation, bimanual manipulation tasks are challenging due to higher-dimensional action spaces. Prior works leverage large amounts of data and primitive actions to address this problem, but may suffer from sample inefficiency and limited generalization across various tasks. To this end, we propose VoxAct-B, a language-conditioned, voxel-based method that leverages Vision Language Models (VLMs) to prioritize key regions within the scene and reconstruct a voxel grid. We provide this voxel grid to our bimanual manipulation policy to learn acting and stabilizing actions. This approach enables more efficient policy learning from voxels and is generalizable to different tasks. In simulation, we show that VoxAct-B outperforms strong baselines on fine-grained bimanual manipulation tasks. Furthermore, we demonstrate VoxAct-B on real-world Open Drawer and Open Jar tasks using two UR5s. Code, data, and videos are available at https://voxact-b.github.io.
comment: Accepted to the Conference on Robot Learning (CoRL) 2024
Multiagent Systems
The Role of Social Support and Influencers in Social Media Communities
How can individual agents coordinate their actions to achieve a shared objective in distributed systems? This challenge spans economic, technical, and sociological domains, each confronting scalability, heterogeneity, and conflicts between individual and collective goals. In economic markets, a common currency facilitates coordination, raising the question of whether such mechanisms can be applied in other contexts. This paper explores this idea within social media platforms, where social support (likes, shares, comments) acts as a currency that shapes content production and sharing. We investigate two key questions: (1) Can social support serve as an effective coordination tool, and (2) What role do influencers play in content creation and dissemination? Our formal analysis shows that social support can coordinate user actions similarly to money in economic markets. Influencers serve dual roles, aggregating content and acting as information proxies, guiding content producers in large markets. While imperfections in information lead to a "price of influence" and suboptimal outcomes, this price diminishes as markets grow, improving social welfare. These insights provide a framework for understanding coordination in distributed environments, with applications in both sociological systems and multi-agent AI systems.
Distributed Detection of Adversarial Attacks for Resilient Cooperation of Multi-Robot Systems with Intermittent Communication
This paper concerns the consensus and formation of a network of mobile autonomous agents in adversarial settings where a group of malicious (compromised) agents are subject to deception attacks. In addition, the communication network is arbitrarily time-varying and subject to intermittent connections, possibly imposed by denial-of-service (DoS) attacks. We provide explicit bounds for network connectivity in an integral sense, enabling the characterization of the system's resilience to specific classes of adversarial attacks. We also show that under the condition of connectivity in an integral sense uniformly in time, the system is finite-gain $\mathcal{L}_{p}$ stable and uniformly exponentially fast consensus and formation are achievable, provided malicious agents are detected and isolated from the network. We present a distributed and reconfigurable framework with theoretical guarantees for detecting malicious agents, allowing for the resilient cooperation of the remaining cooperative agents. Simulation studies are provided to illustrate the theoretical findings.
comment: to be published in IEEE
Exploring the Potential of Conversational Test Suite Based Program Repair on SWE-bench
Automatic program repair at project level may open as-yet-unseen opportunities in various fields of human activity. Since the SWE-Bench challenge was presented, we have seen numerous solutions. Patch generation is a part of program repair, and test suite-based conversational patch generation has proven its effectiveness. However, the potential of conversational patch generation has not yet been specifically estimated on SWE-Bench. This study reports experimental results aimed at evaluating the individual effectiveness of conversational patch generation on problems from SWE-Bench. The experiments show that a simple conversational pipeline based on LLaMA 3.1 70B can generate valid patches in 47% of cases, which is comparable to the state-of-the-art in program repair on SWE-Bench.
comment: 3 pages, 2 figures, 1 algorithm, appendix
GenSim: A General Social Simulation Platform with Large Language Model based Agents
With the rapid advancement of large language models (LLMs), recent years have witnessed many promising studies on leveraging LLM-based agents to simulate human social behavior. While prior work has demonstrated significant potential across various domains, much of it has focused on specific scenarios involving a limited number of agents and has lacked the ability to adapt when errors occur during simulation. To overcome these limitations, we propose a novel LLM-agent-based simulation platform called GenSim, which: (1) abstracts a set of general functions to simplify the simulation of customized social scenarios; (2) supports one hundred thousand agents to better simulate large-scale populations in real-world contexts; (3) incorporates error-correction mechanisms to ensure more reliable and long-term simulations. To evaluate our platform, we assess both the efficiency of large-scale agent simulations and the effectiveness of the error-correction mechanisms. To our knowledge, GenSim represents an initial step toward a general, large-scale, and correctable social simulation platform based on LLM agents, promising to further advance the field of social science.
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments
Agents based on large language models have shown great potential in accelerating scientific discovery by leveraging their rich background knowledge and reasoning capabilities. In this paper, we introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions. We demonstrate our agent on the problem of designing genetic perturbation experiments, where the aim is to find a small subset out of many possible genes that, when perturbed, result in a specific phenotype (e.g., cell growth). Utilizing its biological knowledge, BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model or explicitly design an acquisition function as in Bayesian optimization. Moreover, BioDiscoveryAgent, using Claude 3.5 Sonnet, achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets, and a 46% improvement in the harder task of non-essential gene perturbation, compared to existing Bayesian optimization baselines specifically trained for this task. Our evaluation includes one dataset that is unpublished, ensuring it is not part of the language model's training data. Additionally, BioDiscoveryAgent predicts gene combinations to perturb more than twice as accurately as a random baseline, a task so far not explored in the context of closed-loop experiment design. The agent also has access to tools for searching the biomedical literature, executing code to analyze biological datasets, and prompting another agent to critically evaluate its predictions. Overall, BioDiscoveryAgent is interpretable at every stage, representing an accessible new paradigm in the computational design of biological experiments with the potential to augment scientists' efficacy.
MARLadona -- Towards Cooperative Team Play Using Multi-Agent Reinforcement Learning
Robot soccer, in its full complexity, poses an unsolved research challenge. Current solutions heavily rely on engineered heuristic strategies, which lack robustness and adaptability. Deep reinforcement learning has gained significant traction in various complex robotics tasks such as locomotion, manipulation, and competitive games (e.g., AlphaZero, OpenAI Five), making it a promising solution to the robot soccer problem. This paper introduces MARLadona, a decentralized multi-agent reinforcement learning (MARL) training pipeline capable of producing agents with sophisticated team play behavior, addressing the shortcomings of heuristic methods. Further, we created an open-source multi-agent soccer environment based on Isaac Gym. Utilizing our MARL framework and a modified global entity encoder as our core architecture, our approach achieves a 66.8% win rate against the HELIOS agent, which employs a state-of-the-art heuristic strategy. Furthermore, we provide an in-depth analysis of the policy behavior and interpret the agent's intention using the critic network.
comment: Fixed redundant "-" in the title
Systems and Control (CS)
Distributed ADMM Approach for the Power Distribution Network Reconfiguration
The electrical network reconfiguration problem aims to minimize losses in a distribution system by adjusting switches while ensuring radial topology. The growing use of renewable energy and the complexity of managing modern power grids make solving the reconfiguration problem crucial. Distributed algorithms help optimize grid configurations, ensuring efficient adaptation to changing conditions and better utilization of renewable energy sources. This paper introduces a distributed algorithm designed to tackle the problem of power distribution network reconfiguration with a radiality constraint. This algorithm relies on ADMM (Alternating Direction Method of Multipliers), where each agent progressively updates its estimation based on the information exchanged with neighboring agents. We show that every agent is required to solve a linearly constrained convex quadratic programming problem and a Minimum Weight Rooted Arborescence Problem (MWRAP) with local weights during each iteration. Through numerical experiments, we demonstrate the performance of the proposed algorithm in various scenarios, including its application to a 33-bus test system and a real-world network.
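The ADMM update pattern referred to above can be sketched on a much simpler problem. The following is not the paper's algorithm: it is standard global-consensus ADMM on a toy quadratic objective, where each agent i holds a private target a_i and all agents must agree on a common value minimizing the sum of (x - a_i)^2 / 2. The radiality constraint, the MWRAP subproblem, and neighbor-only communication are all omitted; the function name and parameters are hypothetical.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Toy consensus ADMM: agents agree on the mean of their private targets."""
    n = len(targets)
    x = [0.0] * n   # local estimates, one per agent
    u = [0.0] * n   # scaled dual variables, one per agent
    z = 0.0         # shared consensus variable
    for _ in range(iterations):
        # Local step: each agent minimizes its own cost plus the
        # augmented-Lagrangian penalty (closed form for a quadratic cost).
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Consensus step: average the local estimates plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z, x
```

For this toy objective the consensus value converges to the mean of the targets, which makes the sketch easy to check by hand.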
Bisimulation metric for Model Predictive Control
Model-based reinforcement learning has shown promise for improving sample efficiency and decision-making in complex environments. However, existing methods face challenges in training stability, robustness to noise, and computational efficiency. In this paper, we propose Bisimulation Metric for Model Predictive Control (BS-MPC), a novel approach that incorporates bisimulation metric loss in its objective function to directly optimize the encoder. This time-step-wise direct optimization enables the learned encoder to extract intrinsic information from the original state space while discarding irrelevant details and preventing the gradients and errors from diverging. BS-MPC improves training stability, robustness against input noise, and computational efficiency by reducing training time. We evaluate BS-MPC on both continuous control and image-based tasks from the DeepMind Control Suite, demonstrating superior performance and robustness compared to state-of-the-art baseline methods.
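A bisimulation metric loss of the kind mentioned above can be sketched for a single pair of transitions. The abstract does not give the BS-MPC loss, so the form below is an assumption: it follows the common recipe in which the distance between two latent states is regressed onto the reward difference plus a discounted distance between the next latent states. The L1 distance, the deterministic next latents, and all names are illustrative choices.

```python
def l1_distance(a, b):
    """L1 distance between two equal-length latent vectors."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def bisimulation_loss(z_i, z_j, r_i, r_j, next_z_i, next_z_j, gamma=0.99):
    """Squared error between latent distance and an assumed bisimulation target."""
    latent_dist = l1_distance(z_i, z_j)
    # Target: immediate reward gap plus discounted gap between next latents.
    target = abs(r_i - r_j) + gamma * l1_distance(next_z_i, next_z_j)
    return (latent_dist - target) ** 2
```

In a training loop the target would normally be treated as a constant (no gradient), and the loss averaged over sampled pairs within a batch.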
Distributed Detection of Adversarial Attacks for Resilient Cooperation of Multi-Robot Systems with Intermittent Communication
This paper concerns the consensus and formation of a network of mobile autonomous agents in adversarial settings where a group of malicious (compromised) agents are subject to deception attacks. In addition, the communication network is arbitrarily time-varying and subject to intermittent connections, possibly imposed by denial-of-service (DoS) attacks. We provide explicit bounds for network connectivity in an integral sense, enabling the characterization of the system's resilience to specific classes of adversarial attacks. We also show that under the condition of connectivity in an integral sense uniformly in time, the system is finite-gain $\mathcal{L}_{p}$ stable and uniformly exponentially fast consensus and formation are achievable, provided malicious agents are detected and isolated from the network. We present a distributed and reconfigurable framework with theoretical guarantees for detecting malicious agents, allowing for the resilient cooperation of the remaining cooperative agents. Simulation studies are provided to illustrate the theoretical findings.
comment: to be published in IEEE
Distribution Grids May Be a Barrier To Residential Electrification
Replacing fossil-fueled appliances and vehicles with electric alternatives can reduce greenhouse gas emissions and air pollution in many settings. However, residential electrification can raise electricity demand beyond the safe limits of electrical infrastructure, increasing the risk of blackouts or requiring grid reinforcement that can be slow and expensive. Here, we estimate the physical and economic impacts on distribution grids of electrifying all housing and personal vehicles in each county of the lower 48 United States. We find that space heating is the main driver of grid impacts, with the coldest regions seeing demand peaks up to three times higher than today's peaks. Accommodating electrification of all housing and personal vehicles could require up to 312 GW of distribution grid reinforcement nationally, at a cost of $183 to $415 billion, or $1,500 to $3,400 per household (95% confidence intervals). However, demand-side management can mitigate demand peaks, reducing grid reinforcement costs by up to 92%.
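As a quick consistency check on the cost figures above, dividing each national total by the matching per-household figure should give roughly the same household count at both ends of the interval. The household count is a derived quantity here, not a number stated in the abstract.

```python
def implied_households(total_cost_usd, cost_per_household_usd):
    """Number of households implied by a national and a per-household cost."""
    return total_cost_usd / cost_per_household_usd


# Lower and upper ends of the reported 95% confidence interval.
low_end = implied_households(183e9, 1_500)
high_end = implied_households(415e9, 3_400)

print(f"lower end: {low_end / 1e6:.1f} million households")
print(f"upper end: {high_end / 1e6:.1f} million households")
```

Both ends imply about 122 million households, so the national and per-household ranges are mutually consistent.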
A Reinforcement Learning Engine with Reduced Action and State Space for Scalable Cyber-Physical Optimal Response
Numerous research studies have been conducted to enhance the resilience of cyber-physical systems (CPSs) by detecting potential cyber or physical disturbances. However, the development of scalable and optimal response measures under power system contingency based on fusing cyber-physical data is still in an early stage. To address this research gap, this paper introduces a power system response engine based on reinforcement learning (RL) and role and interaction discovery (RID) techniques. RL-RID-GridResponder is designed to automatically detect the contingency and assist with the decision-making process to ensure optimal power system operation. The RL-RID-GridResponder learns via an RL-based structure and achieves enhanced scalability by integrating an RID module with reduced action and state spaces. The applicability of RL-RID-GridResponder in providing scalable and optimal responses for CPSs is demonstrated on power systems in the context of Denial of Service (DoS) attacks. Moreover, simulations are conducted on a Volt-Var regulation problem using the augmented WSCC 9-bus and augmented IEEE 24-bus systems based on fused cyber and physical data sets. The results show that the proposed RL-RID-GridResponder can provide fast and accurate responses to ensure optimal power system operation under DoS and can extend to other system contingencies such as line outages and loss of loads.
Multi-Attribute Auctions for Efficient Operation of Non-Cooperative Relaying Systems
This paper studies the use of a multi-attribute auction in a communication system to bring about efficient relaying in a non-cooperative setting. We consider a system where a source seeks to offload data to an access point (AP) while balancing both the timeliness and energy-efficiency of the transmission. A deep fade in the communication channel (due to, e.g., a line-of-sight blockage) makes direct communication costly, and the source may alternatively rely on non-cooperative UEs to act as relays. We propose a multi-attribute auction to select a UE and to determine the duration and power of the transmission, with payments to the UE taking the form of energy sent via wireless power transfer (WPT). The quality of the channel from a UE to the AP constitutes private information, and bids consist of a transmission time and transmission power. We show that under a second-preferred-offer auction, truthful bidding by all candidate UEs forms a Nash Equilibrium. However, this auction is not incentive compatible, and we present a modified auction in which truthful bidding is in fact a dominant strategy. Extensive numerical experimentation illustrates the efficacy of our approach, which we compare to a cooperative baseline. We demonstrate that with as few as two candidates, our improved mechanism leads to as much as a 76% reduction in energy consumption, and that with as few as three candidates, the transmission time decreases by as much as 55%. Further, we see that as the number of candidates increases, the performance of our mechanism approaches that of the cooperative baseline. Overall, our findings highlight the potential of multi-attribute auctions to enhance the efficiency of data transfer in non-cooperative settings.
Data-driven Under Frequency Load Shedding Using Reinforcement Learning
Underfrequency load shedding (UFLS) is a critical control strategy in power systems aimed at maintaining system stability and preventing blackouts during severe frequency drops. Traditional UFLS schemes often rely on predefined rules and thresholds, which may not adapt effectively to the dynamic and complex nature of modern power grids. Reinforcement learning (RL) methods have been proposed to effectively handle the UFLS problem. However, training these RL agents is computationally burdensome due to solving multiple differential equations at each step of training. This computational burden also limits the effectiveness of the RL agents for use in real-time. To reduce the computational burden, a machine learning (ML) classifier is trained to capture the frequency response of the system to various disturbances. The RL agent is then trained using the classifier, thus avoiding multiple computations during each step of agent training. Key features of this approach include reduced training time, as well as faster real-time application compared to other RL agents, and its potential to improve system resilience by minimizing the amount of load shed while effectively stabilizing the frequency. Comparative studies with conventional UFLS schemes demonstrate that the RL-based strategy achieves superior performance while significantly reducing the time required. Simulation results on the IEEE 68-bus system validate the performance of the proposed RL method.
GreenLight-Gym: A Reinforcement Learning Benchmark Environment for Greenhouse Crop Production Control
Controlling greenhouse crop production systems is a complex task due to uncertain and non-linear dynamics between crops, indoor and outdoor climate, and economics. The declining number of skilled growers necessitates the development of autonomous greenhouse control systems. Reinforcement Learning (RL) is a promising approach that can learn a control policy to automate greenhouse management. RL optimises a control policy through interactions with a model of the greenhouse while guided by an economic-based reward function. However, its application to real-world systems is limited due to discrepancies between models and real-world dynamics. Moreover, RL controllers may struggle to maintain state constraints while optimising the primary objective, especially when models inadequately capture the adverse effects of constraint violations on crop growth. Also, the generalisation to novel states, for example, due to unseen weather trajectories, is underexplored in RL-based greenhouse control. This work addresses these challenges through three key contributions. First, we present GreenLight-Gym, the first open-source environment designed for training and evaluating RL algorithms on the state-of-the-art greenhouse model GreenLight. GreenLight-Gym enables the community to benchmark RL-based control methodologies. Second, we compare two reward-shaping approaches, using either a multiplicative or additive penalty, to enforce state boundaries. The additive penalty achieves more stable training while better adhering to state constraints, while the multiplicative penalty yields marginally higher profits. Finally, we evaluate RL performance on a disjoint training and testing weather dataset, demonstrating improved generalisation to unseen conditions. Our environment and experiment scripts are open-sourced, facilitating innovative research on learning-based greenhouse control.
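The difference between the two reward-shaping approaches can be illustrated with a minimal sketch. The exact penalty terms used in GreenLight-Gym are not given in the abstract, so the formulas, the normalization of the violation, and the `weight` parameter below are assumptions.

```python
def constraint_violation(value, lower, upper):
    """Distance outside [lower, upper], normalized by the width of the range."""
    width = upper - lower
    if value < lower:
        return (lower - value) / width
    if value > upper:
        return (value - upper) / width
    return 0.0


def additive_reward(profit, violation, weight=1.0):
    """Profit minus a penalty proportional to the violation."""
    return profit - weight * violation


def multiplicative_reward(profit, violation, weight=1.0):
    """Profit scaled down by the violation, with the scale clipped at zero."""
    scale = max(0.0, 1.0 - weight * violation)
    return profit * scale
```

For example, a temperature of 35 with bounds 10 to 30 gives a violation of 0.25; with a profit of 2.0 the additive form yields 1.75 and the multiplicative form 1.5. Note that with a negative profit this multiplicative form would shrink the loss toward zero, a case the sketch does not handle.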
Research on Enhancing C-V2X Communication via Danger-Aware Vehicular Networking
This paper presents a protocol that optimizes message dissemination in C-V2X technology, crucial for advancing intelligent transportation systems (ITS) aimed at enhancing road safety. As vehicle density and velocity rise, the volume of data requiring communication significantly increases. By considering the risk levels that vehicles encounter and using inter-vehicle proximity as a key indicator of potential hazards, the proposed protocol prioritizes communication, allowing vehicles facing higher risks to transmit their messages first. Our results show that this prioritization effectively reduces the number of concurrent transmissions, leading to improved performance metrics such as packet delivery ratio, throughput, latency, and lower probabilities of channel congestion and collision.
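The prioritization idea can be sketched as follows, assuming that a vehicle's risk is measured only by the distance to its nearest neighbor and that vehicles transmit in order of increasing distance. The protocol's actual risk metric and channel-access details are not given in the abstract; the function names and the 2D position format are hypothetical.

```python
import heapq
import math


def nearest_neighbor_distance(positions):
    """Map each vehicle id to the distance to its closest other vehicle."""
    distances = {}
    for vid, (x, y) in positions.items():
        distances[vid] = min(
            (math.hypot(x - ox, y - oy)
             for other, (ox, oy) in positions.items() if other != vid),
            default=math.inf,  # a lone vehicle has no proximity risk
        )
    return distances


def transmission_order(positions):
    """Vehicle ids ordered so that the closest (riskiest) pairs send first."""
    heap = [(dist, vid) for vid, dist in nearest_neighbor_distance(positions).items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

With vehicles A, B and C at 0 m, 50 m and 52 m along a road, B and C are 2 m apart and therefore transmit before A; ties are broken by vehicle id.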
Brain-Like Replay Naturally Emerges in Reinforcement Learning Agents
Replay is a powerful strategy to promote learning in artificial intelligence and the brain. However, the conditions to generate it and its functional advantages have not been fully recognized. In this study, we develop a modular reinforcement learning model that could generate replay. We prove that replay generated in this way helps complete the task. We also analyze the information contained in the representation and provide a mechanism for how replay makes a difference. Our design avoids complex assumptions and enables replay to emerge naturally within a task-optimized paradigm. Our model also reproduces key phenomena observed in biological agents. This research explores the structural biases in modular ANN to generate replay and its potential utility in developing efficient RL.
Visual collective behaviors on spherical robots
Implementations of collective motion traditionally disregard the limited sensing capabilities of an individual and instead assume an omniscient perception of the environment. This study implements a visual flocking model in a "robot-in-the-loop" approach to reproduce these behaviors with a flock composed of 10 independent spherical robots. The model achieves robotic collective motion using only the panoramic visual information of each robot, such as the retinal position, optical size, and optic flow of the neighboring robots. We introduce a virtual anchor to confine the collective robotic movements so as to avoid wall interactions. For the first time, a simple visual robot-in-the-loop approach succeeds in reproducing several collective motion phases, in particular swarming and milling. Another milestone achieved by this model is bridging the gap between simulation and physical experiments by demonstrating nearly identical behaviors in both environments with the same visual model. To conclude, we show that our minimal visual collective motion model is sufficient to recreate most collective behaviors on a robot-in-the-loop system that is scalable, behaves as numerical simulations predict, and is easily comparable to traditional models.
comment: 26 pages, 16 figures, journal bioinspired and biomimetics
Feedback-feedforward Signal Control with Exogenous Demand Estimation in Congested Urban Road Networks
To cope with uncertain traffic patterns and traffic models, traffic-responsive signal control strategies in the literature are designed to be robust to these uncertainties. These robust strategies still require sensing infrastructure to implement traffic-responsiveness. In this paper, we take a novel perspective and show that it is possible to use the already necessary sensing infrastructure to estimate the uncertain quantities in real time. Specifically, resorting to the store-and-forward model, we design a novel network-wide traffic-responsive strategy that estimates the occupancy and exogenous demand in each link, i.e., entering (exiting) vehicle flows at the origins (destinations) of the network or within links, in real time. Borrowing from optimal control theory, we design an optimal linear quadratic control scheme, consisting of a linear feedback term, of the occupancy of the road links, and a feedforward component, which accounts for the varying exogenous vehicle load on the network. Thereby, the resulting control scheme is a simple feedback-feedforward controller, which is fed with occupancy and exogenous demand estimates, and is suitable for real-time implementation. Numerical simulations for the urban traffic network of Chania, Greece, show that, for realistic surges in the exogenous demand, the proposed solution significantly outperforms tried-and-tested solutions that ignore the exogenous demand.
Distributed Optimal Coverage Control in Multi-agent Systems: Known and Unknown Environments
This paper introduces a novel approach to solve the coverage optimization problem in multi-agent systems. The proposed technique offers an optimal solution with a lower cost than conventional Voronoi-based techniques by using a ranking function to effectively handle the issue of agents remaining stationary in regions void of information. The proposed approach leverages a novel cost function for optimizing the agents' coverage, and this cost function eventually aligns with the conventional Voronoi-based cost function. Theoretical analyses are conducted to assure the asymptotic convergence of agents towards the optimal configuration. A distinguishing feature of this approach lies in its departure from the reliance on geometric methods that are characteristic of Voronoi-based approaches; hence, it can be implemented more simply. Remarkably, the technique is adaptive and applicable to various environments with both known and unknown information distributions. Lastly, the efficacy of the proposed method is demonstrated through simulations, and the obtained results are compared with those of Voronoi-based algorithms.
Comparative Evaluation of Learning Models for Bionic Robots: Non-Linear Transfer Function Identifications
The control and modeling of robot dynamics have increasingly adopted model-free control strategies using machine learning. Given the non-linear elastic nature of bionic robotic systems, learning-based methods provide reliable alternatives by utilizing numerical data to establish a direct mapping from actuation inputs to robot trajectories without complex kinematics models. However, for developers, the method of identifying an appropriate learning model for their specific bionic robots and further constructing the transfer function has not been thoroughly discussed. Thus, this research introduces a comprehensive evaluation strategy and framework for the application of model-free control, including data collection, learning model selection, comparative analysis, and transfer function identification to effectively deal with the multi-input multi-output (MIMO) robotic data.
comment: 12 pages, 21 figures, 1 table
Decentralized Robust Data-driven Predictive Control for Smoothing Mixed Traffic Flow
In mixed traffic where connected automated vehicles (CAVs) and human-driven vehicles (HDVs) coexist, data-driven predictive control of CAVs promises system-wide traffic performance improvements. Yet most existing approaches focus on a centralized setup, which is not computationally scalable and fails to protect data privacy. Robustness against unknown disturbances has not been well addressed either, raising safety concerns. In this paper, we propose a decentralized robust DeeP-LCC (Data-EnablEd Predictive Leading Cruise Control) approach for CAVs to smooth mixed traffic flow. In particular, each CAV computes its control input based on locally available data from its involved subsystem. Meanwhile, the interaction between neighboring subsystems is modeled as a bounded disturbance, for which appropriate estimation methods are proposed. Then, we formulate a robust optimization problem and present its tractable computational solutions. Compared with the centralized formulation, our method greatly reduces the computational burden and achieves better safety performance, while naturally preserving data privacy. Extensive traffic simulations validate its wave-dampening ability, safety performance, and computational benefits.
Systems and Control (EESS)
Distributed ADMM Approach for the Power Distribution Network Reconfiguration
The electrical network reconfiguration problem aims to minimize losses in a distribution system by adjusting switches while ensuring radial topology. The growing use of renewable energy and the complexity of managing modern power grids make solving the reconfiguration problem crucial. Distributed algorithms help optimize grid configurations, ensuring efficient adaptation to changing conditions and better utilization of renewable energy sources. This paper introduces a distributed algorithm designed to tackle the problem of power distribution network reconfiguration with a radiality constraint. This algorithm relies on ADMM (Alternating Direction Method of Multipliers), where each agent progressively updates its estimation based on the information exchanged with neighboring agents. We show that every agent is required to solve a linearly constrained convex quadratic programming problem and a Minimum Weight Rooted Arborescence Problem (MWRAP) with local weights during each iteration. Through numerical experiments, we demonstrate the performance of the proposed algorithm in various scenarios, including its application to a 33-bus test system and a real-world network.
Bisimulation metric for Model Predictive Control
Model-based reinforcement learning has shown promise for improving sample efficiency and decision-making in complex environments. However, existing methods face challenges in training stability, robustness to noise, and computational efficiency. In this paper, we propose Bisimulation Metric for Model Predictive Control (BS-MPC), a novel approach that incorporates bisimulation metric loss in its objective function to directly optimize the encoder. This time-step-wise direct optimization enables the learned encoder to extract intrinsic information from the original state space while discarding irrelevant details and preventing the gradients and errors from diverging. BS-MPC improves training stability, robustness against input noise, and computational efficiency by reducing training time. We evaluate BS-MPC on both continuous control and image-based tasks from the DeepMind Control Suite, demonstrating superior performance and robustness compared to state-of-the-art baseline methods.
Distributed Detection of Adversarial Attacks for Resilient Cooperation of Multi-Robot Systems with Intermittent Communication
This paper concerns the consensus and formation of a network of mobile autonomous agents in adversarial settings where a group of malicious (compromised) agents are subject to deception attacks. In addition, the communication network is arbitrarily time-varying and subject to intermittent connections, possibly imposed by denial-of-service (DoS) attacks. We provide explicit bounds for network connectivity in an integral sense, enabling the characterization of the system's resilience to specific classes of adversarial attacks. We also show that under the condition of connectivity in an integral sense uniformly in time, the system is finite-gain $\mathcal{L}_{p}$ stable and uniformly exponentially fast consensus and formation are achievable, provided malicious agents are detected and isolated from the network. We present a distributed and reconfigurable framework with theoretical guarantees for detecting malicious agents, allowing for the resilient cooperation of the remaining cooperative agents. Simulation studies are provided to illustrate the theoretical findings.
comment: to be published in IEEE
Distribution Grids May Be a Barrier To Residential Electrification
Replacing fossil-fueled appliances and vehicles with electric alternatives can reduce greenhouse gas emissions and air pollution in many settings. However, residential electrification can raise electricity demand beyond the safe limits of electrical infrastructure, increasing the risk of blackouts or requiring grid reinforcement that can be slow and expensive. Here, we estimate the physical and economic impacts on distribution grids of electrifying all housing and personal vehicles in each county of the lower 48 United States. We find that space heating is the main driver of grid impacts, with the coldest regions seeing demand peaks up to three times higher than today's peaks. Accommodating electrification of all housing and personal vehicles could require up to 312 GW of distribution grid reinforcement nationally, at a cost of $183 to $415 billion, or $1,500 to $3,400 per household (95% confidence intervals). However, demand-side management can mitigate demand peaks, reducing grid reinforcement costs by up to 92%.
A Reinforcement Learning Engine with Reduced Action and State Space for Scalable Cyber-Physical Optimal Response
Numerous research studies have been conducted to enhance the resilience of cyber-physical systems (CPSs) by detecting potential cyber or physical disturbances. However, the development of scalable and optimal response measures under power system contingency based on fusing cyber-physical data is still in an early stage. To address this research gap, this paper introduces a power system response engine based on reinforcement learning (RL) and role and interaction discovery (RID) techniques. RL-RID-GridResponder is designed to automatically detect the contingency and assist with the decision-making process to ensure optimal power system operation. The RL-RID-GridResponder learns via an RL-based structure and achieves enhanced scalability by integrating an RID module with reduced action and state spaces. The applicability of RL-RID-GridResponder in providing scalable and optimal responses for CPSs is demonstrated on power systems in the context of Denial of Service (DoS) attacks. Moreover, simulations are conducted on a Volt-Var regulation problem using the augmented WSCC 9-bus and augmented IEEE 24-bus systems based on fused cyber and physical data sets. The results show that the proposed RL-RID-GridResponder can provide fast and accurate responses to ensure optimal power system operation under DoS and can extend to other system contingencies such as line outages and loss of loads.
Multi-Attribute Auctions for Efficient Operation of Non-Cooperative Relaying Systems
This paper studies the use of a multi-attribute auction in a communication system to bring about efficient relaying in a non-cooperative setting. We consider a system where a source seeks to offload data to an access point (AP) while balancing both the timeliness and energy-efficiency of the transmission. A deep fade in the communication channel (due to, e.g., a line-of-sight blockage) makes direct communication costly, and the source may alternatively rely on non-cooperative UEs to act as relays. We propose a multi-attribute auction to select a UE and to determine the duration and power of the transmission, with payments to the UE taking the form of energy sent via wireless power transfer (WPT). The quality of the channel from a UE to the AP constitutes private information, and bids consist of a transmission time and transmission power. We show that under a second-preferred-offer auction, truthful bidding by all candidate UEs forms a Nash Equilibrium. However, this auction is not incentive compatible, and we present a modified auction in which truthful bidding is in fact a dominant strategy. Extensive numerical experimentation illustrates the efficacy of our approach, which we compare to a cooperative baseline. We demonstrate that with as few as two candidates, our improved mechanism leads to as much as a 76% reduction in energy consumption, and that with as few as three candidates, the transmission time decreases by as much as 55%. Further, we see that as the number of candidates increases, the performance of our mechanism approaches that of the cooperative baseline. Overall, our findings highlight the potential of multi-attribute auctions to enhance the efficiency of data transfer in non-cooperative settings.
Data-driven Under Frequency Load Shedding Using Reinforcement Learning
Underfrequency load shedding (UFLS) is a critical control strategy in power systems aimed at maintaining system stability and preventing blackouts during severe frequency drops. Traditional UFLS schemes often rely on predefined rules and thresholds, which may not adapt effectively to the dynamic and complex nature of modern power grids. Reinforcement learning (RL) methods have been proposed to effectively handle the UFLS problem. However, training these RL agents is computationally burdensome due to solving multiple differential equations at each step of training. This computational burden also limits the effectiveness of the RL agents for use in real-time. To reduce the computational burden, a machine learning (ML) classifier is trained to capture the frequency response of the system to various disturbances. The RL agent is then trained using the classifier, thus avoiding multiple computations during each step of agent training. Key features of this approach include reduced training time, as well as faster real-time application compared to other RL agents, and its potential to improve system resilience by minimizing the amount of load shed while effectively stabilizing the frequency. Comparative studies with conventional UFLS schemes demonstrate that the RL-based strategy achieves superior performance while significantly reducing the time required. Simulation results on the IEEE 68-bus system validate the performance of the proposed RL method.
Robotics
Vehicle-in-Virtual-Environment Method for ADAS and Connected and Automated Driving Function Development/Demonstration/Evaluation
The current approach to developing new Advanced Driver Assistance System (ADAS) and Connected and Automated Driving (CAD) functions involves a significant amount of public road testing. This is inefficient and very costly because of the number of miles that need to be driven for rare and extreme events to take place, and it is unsafe because the rest of the road users become involuntary test subjects. A method called VehicleInVirtualEnvironment (VVE) was recently introduced as a solution to this problem, enabling safe, efficient, and repeatable development, demonstration, and evaluation of ADAS and CAD functions. During VVE, the vehicle is operated in a large, empty, and flat area while its localization and perception sensor data are fed from the virtual environment, with other traffic and rare and extreme events being generated as needed. The virtual environment can be easily configured and modified to construct different testing scenarios on demand. This paper focuses on the VVE approach and introduces the coordinate transformations needed to sync pose (location and orientation) in the virtual and physical worlds, as well as the handling of localization and perception sensor data, using a highly realistic 3D simulation model of a recent autonomous shuttle deployment site in Columbus, Ohio as the virtual world. As a further example that uses multiple actors, the use of VVE for VehicleToVRU communication-based Vulnerable Road User (VRU) safety is presented using VVE experiments with real pedestrian(s) in a safe and repeatable manner. VVE experiments are used to demonstrate the efficacy of the method.
comment: 8 pages, 16 figures
PANav: Toward Privacy-Aware Robot Navigation via Vision-Language Models
Navigating robots discreetly in human work environments while considering the possible privacy implications of robotic tasks presents significant challenges. Such scenarios are increasingly common, for instance, when robots transport sensitive objects that demand high levels of privacy in spaces crowded with human activities. While extensive research has been conducted on robotic path planning and social awareness, current robotic systems still lack the functionality of privacy-aware navigation in public environments. To address this, we propose a new framework for mobile robot navigation that leverages vision-language models to incorporate privacy awareness into adaptive path planning. Specifically, all potential paths from the starting point to the destination are generated using the A* algorithm. Concurrently, the vision-language model is used to infer the optimal path for privacy-awareness, given the environmental layout and the navigational instruction. This approach aims to minimize the robot's exposure to human activities and preserve the privacy of the robot and its surroundings. Experimental results on the S3DIS dataset demonstrate that our framework significantly enhances mobile robots' privacy awareness of navigation in human-shared public environments. Furthermore, we demonstrate the practical applicability of our framework by successfully navigating a robotic platform through real-world office environments. The supplementary video and code can be accessed via the following link: https://sites.google.com/view/privacy-aware-nav.
comment: 7 pages, 6 figures, conference
Compositional Diffusion Models for Powered Descent Trajectory Generation with Flexible Constraints
This work introduces TrajDiffuser, a compositional diffusion-based flexible and concurrent trajectory generator for 6-degree-of-freedom powered descent guidance. TrajDiffuser is a statistical model that learns the multi-modal distributions of a dataset of simulated optimal trajectories, each subject to only one or a few constraints that may vary between trajectories. During inference, the trajectory is generated simultaneously over time, providing stable long-horizon planning, and constraints can be composed together, increasing the model's generalizability and decreasing the training data required. The generated trajectory is then used to initialize an optimizer, increasing its robustness and speed.
comment: Full manuscript submitted to IEEE Aerospace 2025 on 4-Oct-2024
Pareto Control Barrier Function for Inner Safe Set Maximization Under Input Constraints
This article introduces the Pareto Control Barrier Function (PCBF) algorithm to maximize the inner safe set of dynamical systems under input constraints. Traditional Control Barrier Functions (CBFs) ensure safety by maintaining system trajectories within a safe set but often fail to account for realistic input constraints. To address this problem, we leverage the Pareto multi-task learning framework to balance competing objectives of safety and safe set volume. The PCBF algorithm is applicable to high-dimensional systems and is computationally efficient. We validate its effectiveness through comparison with Hamilton-Jacobi reachability for an inverted pendulum and through simulations on a 12-dimensional quadrotor system. Results show that the PCBF consistently outperforms existing methods, yielding larger safe sets and ensuring safety under input constraints.
comment: Submitted to ACC 2025
Advancements in Robotics Process Automation: A Novel Model with Enhanced Empirical Validation and Theoretical Insights
Robotics Process Automation is revolutionizing business operations by significantly enhancing efficiency, productivity, and operational excellence across various industries. This manuscript delivers a comprehensive review of recent advancements in RPA technologies and proposes a novel model designed to elevate RPA capabilities.
comment: 9 pages
ETHcavation: A Dataset and Pipeline for Panoptic Scene Understanding and Object Tracking in Dynamic Construction Environments
Construction sites are challenging environments for autonomous systems due to their unstructured nature and the presence of dynamic actors, such as workers and machinery. This work presents a comprehensive panoptic scene understanding solution designed to handle the complexities of such environments by integrating 2D panoptic segmentation with 3D LiDAR mapping. Our system generates detailed environmental representations in real-time by combining semantic and geometric data, supported by Kalman Filter-based tracking for dynamic object detection. We introduce a fine-tuning method that adapts large pre-trained panoptic segmentation models for construction site applications using a limited number of domain-specific samples. For this use case, we release a first-of-its-kind dataset of 502 hand-labeled sample images with panoptic annotations from construction sites. In addition, we propose a dynamic panoptic mapping technique that enhances scene understanding in unstructured environments. As a case study, we demonstrate the system's application for autonomous navigation, utilizing real-time RRT* for reactive path planning in dynamic scenarios. The dataset (https://leggedrobotics.github.io/panoptic-scene-understanding.github.io/) and code (https://github.com/leggedrobotics/rsl_panoptic_mapping) for training and deployment are publicly available to support future research.
comment: 9 pages, 7 figures, 4 tables, submitted to 2024 Australasian Conference on Robotics and Automation (ACRA 2024)
A Framework for Reproducible Benchmarking and Performance Diagnosis of SLAM Systems IROS 2024
We propose SLAMFuse, an open-source SLAM benchmarking framework that provides consistent cross-platform environments for evaluating multi-modal SLAM algorithms, along with tools for data fuzzing, failure detection, and diagnosis across different datasets. Our framework introduces a fuzzing mechanism to test the resilience of SLAM algorithms against dataset perturbations. This enables the assessment of pose estimation accuracy under varying conditions and identifies critical perturbation thresholds. SLAMFuse improves diagnostics with failure detection and analysis tools, examining algorithm behaviour against dataset characteristics. SLAMFuse uses Docker to ensure reproducible testing conditions across diverse datasets and systems by streamlining dependency management. Emphasizing the importance of reproducibility and introducing advanced tools for algorithm evaluation and performance diagnosis, our work sets a new precedent for reliable benchmarking of SLAM systems. We provide ready-to-use Docker-compatible versions of the algorithms and datasets used in the experiments, together with guidelines for integrating and benchmarking new algorithms. Code is available at https://github.com/nikolaradulov/slamfuse
comment: 8 pages, 8 figures, Equal contribution of first two authors, Accepted at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Fast Object Detection with a Machine Learning Edge Device
This machine learning study investigates a low-cost edge device integrated with an embedded computer-vision system, which improves the inference time and precision of object detection and classification. A primary aim of the study was to reduce inference time and power consumption so that the embedded device of a competition-ready autonomous humanoid robot can support real-time object recognition, scene understanding, visual navigation, motion planning, and autonomous navigation. The study compares inference time across a central processing unit (CPU), a graphics processing unit (GPU), and a tensor processing unit (TPU), all of which can be used for machine learning tasks. In support of the humanoid-robot aim, the study also examined whether there is a significant difference between using a camera with monocular vision and one with stereo vision. In this study, TPU inference time was 25% lower than on the GPU and 87.5% lower than on the CPU. Much of the information in this paper contributed to the final selection of Google's Coral Edge TPU device. The Arduino Nano 33 BLE Sense Tiny ML Kit was also considered for comparison, but because of initial incompatibilities and limited time to complete the study, its review was deferred to a future experiment.
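The two percentages quoted in the abstract can be turned into speedup factors with a little arithmetic. The sketch below uses only those two reported reductions. The implied GPU-versus-CPU figure is derived from them and is not a number reported in the abstract, and the function names are illustrative.

```python
def speedup_from_reduction(reduction):
    """Convert a fractional time reduction (e.g. 0.875) into a speedup factor."""
    return 1.0 / (1.0 - reduction)


def implied_reduction(reduction_a_vs_c, reduction_a_vs_b):
    """Given A's reduction versus C and versus B, derive B's reduction versus C."""
    time_a_over_c = 1.0 - reduction_a_vs_c  # A's time as a fraction of C's
    time_a_over_b = 1.0 - reduction_a_vs_b  # A's time as a fraction of B's
    time_b_over_c = time_a_over_c / time_a_over_b
    return 1.0 - time_b_over_c


tpu_vs_cpu = speedup_from_reduction(0.875)    # TPU relative to CPU
tpu_vs_gpu = speedup_from_reduction(0.25)     # TPU relative to GPU
gpu_vs_cpu_reduction = implied_reduction(0.875, 0.25)
print(f"TPU vs CPU: {tpu_vs_cpu:.2f}x, TPU vs GPU: {tpu_vs_gpu:.2f}x")
print(f"Implied GPU reduction vs CPU: {gpu_vs_cpu_reduction:.1%}")
```

Read this way, an 87.5% reduction corresponds to an 8x speedup over the CPU, while a 25% reduction is roughly a 1.33x speedup over the GPU.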
Trajectory elongation strategies with minimum curvature discontinuities for a Dubins vehicle
In this paper, we present strategies for designing curvature-bounded trajectories of any desired length between any two given oriented points. The proposed trajectory is constructed by the concatenation of three circular arcs of varying radii. Such a trajectory guarantees a complete coverage of the maximum set of reachable lengths while minimising the number of changeover points in the trajectory to a maximum of two under all scenarios. Additionally, by using the notion of internally tangent circles, we expand the set of Circle-Circle-Circle trajectories to eight kinds, consisting of {LLL, LLR, LRR, LRL, RRL, RLL, RLR, RRR} paths. The paper presents a mathematical formulation of the proposed trajectory and the conditions for the existence and classification of each kind of trajectory. We also analyse the variation of the length of the trajectory using suitable elongation strategies and derive the set of reachable lengths for all pairs of oriented points. Finally, the results of this paper are illustrated using numerical simulations.
comment: Preprint submitted to Automatica
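As a small illustration of the Circle-Circle-Circle family described above, the sketch below enumerates the eight turn-direction words and computes the length of a three-arc path from assumed radii and swept angles. The existence conditions and elongation strategies from the paper are not reproduced, and the helper names are hypothetical.

```python
import math
from itertools import product


def ccc_words():
    """All left/right turn combinations for a three-arc trajectory."""
    return ["".join(word) for word in product("LR", repeat=3)]


def path_length(arcs):
    """Total length of concatenated circular arcs.

    arcs: list of (radius, swept_angle_in_radians) pairs.
    """
    return sum(radius * abs(angle) for radius, angle in arcs)


def max_curvature(arcs):
    """Largest curvature (1 / radius) along the path, for checking the bound."""
    return max(1.0 / radius for radius, _ in arcs)


words = ccc_words()
example = [(1.0, math.pi), (2.0, math.pi / 2), (1.0, math.pi)]
length = path_length(example)   # 1*pi + 2*(pi/2) + 1*pi = 3*pi
kappa = max_curvature(example)  # tightest arc has radius 1
```

Allowing the radii to vary, as in the example, is what lets the path length change while the curvature stays within its bound.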
High-Speed Stereo Visual SLAM for Low-Powered Computing Devices
We present an accurate and GPU-accelerated Stereo Visual SLAM design called Jetson-SLAM. It exhibits frame-processing rates above 60 FPS on NVIDIA's low-powered 10W Jetson-NX embedded computer and above 200 FPS on desktop-grade 200W GPUs, even in stereo configuration and in the multiscale setting. Our contributions are threefold: (i) A Bounded Rectification technique that prevents many non-corner points from being tagged as corners in FAST detection, improving SLAM accuracy. (ii) A novel Pyramidal Culling and Aggregation (PyCA) technique that yields robust features while suppressing redundant ones at high speeds by harnessing a GPU device. PyCA uses our new Multi-Location Per Thread culling strategy (MLPT) and Thread-Efficient Warp-Allocation (TEWA) scheme for GPU to enable Jetson-SLAM to achieve high accuracy and speed on embedded devices. (iii) The Jetson-SLAM library achieves resource efficiency through a data-sharing mechanism. Our experiments on three challenging datasets (KITTI, EuRoC, and KAIST-VIO) and two highly accurate SLAM backends (Full-BA and ICE-BA) show that Jetson-SLAM is the fastest available accurate and GPU-accelerated SLAM system (Fig. 1).
Kalman Filter Applied To A Differential Robot
This document studies the localization and trajectory-following problem for a robot. It focuses on applying the Kalman filter to estimate the location and trajectory of an autonomous mobile differential robot. The experimental data were obtained from tests using the two incremental encoders built into the differential robot. Data are transmitted from a PC, where control is carried out with the Matlab/Simulink software. The results are presented as graphs showing the path followed by the robot under PI control and the Kalman filter estimate in a real system.
comment: 7 pages, 13 figures
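To make the estimation idea concrete, here is a minimal sketch of encoder-based odometry for a differential-drive robot followed by a scalar Kalman filter step. It is a generic textbook formulation under assumed names and parameters (`wheel_base`, noise variances), not the model or the Matlab/Simulink implementation used in the paper.

```python
import math


def odometry_step(x, y, theta, d_left, d_right, wheel_base):
    """Advance the pose from left/right wheel displacements (from encoders)."""
    d_center = 0.5 * (d_left + d_right)
    d_theta = (d_right - d_left) / wheel_base
    # Midpoint integration of the heading over the step.
    x += d_center * math.cos(theta + 0.5 * d_theta)
    y += d_center * math.sin(theta + 0.5 * d_theta)
    return x, y, theta + d_theta


def kf_predict(estimate, variance, control, process_var):
    """Scalar Kalman prediction: shift by the control, inflate the variance."""
    return estimate + control, variance + process_var


def kf_update(estimate, variance, measurement, measurement_var):
    """Scalar Kalman update: blend prediction and measurement by the gain."""
    gain = variance / (variance + measurement_var)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


pose = odometry_step(0.0, 0.0, 0.0, d_left=1.0, d_right=1.0, wheel_base=0.5)
x_pred, p_pred = kf_predict(0.0, 0.5, control=pose[0], process_var=0.5)
x_est, p_est = kf_update(x_pred, p_pred, measurement=2.0, measurement_var=1.0)
```

A full pose estimator for a differential robot is nonlinear in the heading, so in practice an extended or unscented variant would typically replace the scalar filter shown here.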
Advancements in Robotics Process Automation: A Novel Model with Enhanced Empirical Validation and Theoretical Insights
Robotics Process Automation is revolutionizing business operations by significantly enhancing efficiency, productivity, and operational excellence across various industries. This manuscript delivers a comprehensive review of recent advancements in RPA technologies and proposes a novel model designed to elevate RPA capabilities.
comment: 9 pages. European Journal of Computer Science and Information Technology 2024
Sequential Gaussian Variational Inference for Nonlinear State Estimation applied to Robotic Applications
Probabilistic state estimation is essential for robots navigating uncertain environments. Accurately and efficiently managing uncertainty in estimated states is key to robust robotic operation. However, nonlinearities in robotic platforms pose significant challenges that require advanced estimation techniques. Gaussian variational inference (GVI) offers an optimization perspective on the estimation problem, providing analytically tractable solutions and efficiencies derived from the geometry of Gaussian space. We propose a Sequential Gaussian Variational Inference (S-GVI) method to address nonlinearity and provide efficient sequential inference processes. Our approach integrates sequential Bayesian principles into the GVI framework, which are addressed using statistical approximations and gradient updates on the information geometry. Validations through simulations and real-world experiments demonstrate significant improvements in state estimation over the Maximum A Posteriori (MAP) estimation method.
comment: 8 pages
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
Robots' ability to follow language instructions and execute diverse 3D tasks is vital in robot learning. Traditional imitation learning-based methods perform well on seen tasks but struggle with novel, unseen ones due to variability. Recent approaches leverage large foundation models to assist in understanding novel tasks, thereby mitigating this issue. However, these methods lack a task-specific learning process, which is essential for an accurate understanding of 3D environments, often leading to execution failures. In this paper, we introduce GravMAD, a sub-goal-driven, language-conditioned action diffusion framework that combines the strengths of imitation learning and foundation models. Our approach breaks tasks into sub-goals based on language instructions, allowing auxiliary guidance during both training and inference. During training, we introduce Sub-goal Keypose Discovery to identify key sub-goals from demonstrations. Inference differs from training, as there are no demonstrations available, so we use pre-trained foundation models to bridge the gap and identify sub-goals for the current task. In both phases, GravMaps are generated from sub-goals, providing flexible 3D spatial guidance compared to fixed 3D positions. Empirical evaluations on RLBench show that GravMAD significantly outperforms state-of-the-art methods, with a 28.63% improvement on novel tasks and a 13.36% gain on tasks encountered during training. These results demonstrate GravMAD's strong multi-task learning and generalization in 3D manipulation. Video demonstrations are available at: https://gravmad.github.io.
comment: Under review. The first two authors contributed equally
Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning CodeLLMs
Open-weight LLMs are particularly appealing choices to generate training data for fine-tuning Code LLMs on domain-specific service robot applications because they are cost-effective, customizable, and offer better privacy protection. However, unlike proprietary LLMs, open-weight models are more error-prone and often produce programs that violate domain-specific constraints. A promising solution is to incorporate a robot simulator with a well-defined environment to verify program correctness. Yet, these environments require pre-enumeration of relevant entities and their states, which limits the diversity of programs that can be effectively verified. In this work, we introduce ROBO-INSTRUCT, which preserves the diversity of programs generated by an LLM while providing the correctness of simulator-based checking. ROBO-INSTRUCT introduces ROBOSIM to dynamically synthesize consistent simulation environments for each generated program. Moreover, ROBO-INSTRUCT handles subtler instruction-program inconsistencies that do not result in a constraint violation via INSTALIGN, an LLM-aided instruction-program alignment process. Given domain-specific APIs and a few seed examples, ROBO-INSTRUCT can leverage an 8B Llama3 model to generate a training dataset for fine-tuning a 7B CodeLlama model. Our fine-tuned model achieves a 28.75% improvement in pass@1 over the original base model and a 13.75% improvement compared to its SELF-INSTRUCT-finetuned counterparts, even surpassing the performance of a few proprietary LLMs, such as GPT-3.5-Turbo and Gemini-Pro.
NOD-TAMP: Generalizable Long-Horizon Planning with Neural Object Descriptors
Solving complex manipulation tasks in household and factory settings remains challenging due to long-horizon reasoning, fine-grained interactions, and broad object and scene diversity. Learning skills from demonstrations can be an effective strategy, but such methods often have limited generalizability beyond training data and struggle to solve long-horizon tasks. To overcome this, we propose to synergistically combine two paradigms: Neural Object Descriptors (NODs) that produce generalizable object-centric features and Task and Motion Planning (TAMP) frameworks that chain short-horizon skills to solve multi-step tasks. We introduce NOD-TAMP, a TAMP-based framework that extracts short manipulation trajectories from a handful of human demonstrations, adapts these trajectories using NOD features, and composes them to solve broad long-horizon, contact-rich tasks. NOD-TAMP solves existing manipulation benchmarks with a handful of demonstrations and significantly outperforms prior NOD-based approaches on new tabletop manipulation tasks that require diverse generalization. Finally, we deploy NOD-TAMP on a number of real-world tasks, including tool-use and high-precision insertion. For more details, please visit https://nodtamp.github.io/.
3D Feature Distillation with Object-Centric Priors
Grounding natural language to the physical world is a ubiquitous topic with a wide range of applications in computer vision and robotics. Recently, 2D vision-language models such as CLIP have been widely popularized, due to their impressive capabilities for open-vocabulary grounding in 2D images. Recent works aim to elevate 2D CLIP features to 3D via feature distillation, but either learn neural fields that are scene-specific and hence lack generalization, or focus on indoor room scan data that require access to multiple camera views, which is not practical in robot manipulation scenarios. Additionally, related methods typically fuse features at pixel-level and assume that all camera views are equally informative. In this work, we show that this approach leads to sub-optimal 3D features, both in terms of grounding accuracy, as well as segmentation crispness. To alleviate this, we propose a multi-view feature fusion strategy that employs object-centric priors to eliminate uninformative views based on semantic information, and fuse features at object-level via instance segmentation masks. To distill our object-centric 3D features, we generate a large-scale synthetic multi-view dataset of cluttered tabletop scenes, spawning 15k scenes from over 3300 unique object instances, which we make publicly available. We show that our method reconstructs 3D CLIP features with improved grounding capacity and spatial consistency, while doing so from single-view RGB-D, thus departing from the assumption of multiple camera views at test time. Finally, we show that our approach can generalize to novel tabletop domains and be re-purposed for 3D instance segmentation without fine-tuning, and demonstrate its utility for language-guided robotic grasping in clutter.
Causality-Aware Transformer Networks for Robotic Navigation
Current research in Visual Navigation reveals opportunities for improvement. First, the direct adoption of RNNs and Transformers often overlooks the specific differences between Embodied AI and traditional sequential data modelling, potentially limiting its performance in Embodied AI tasks. Second, the reliance on task-specific configurations, such as pre-trained modules and dataset-specific logic, compromises the generalizability of these methods. We address these constraints by initially exploring the unique differences between Navigation tasks and other sequential data tasks through the lens of Causality, presenting a causal framework to elucidate the inadequacies of conventional sequential methods for Navigation. By leveraging this causal perspective, we propose Causality-Aware Transformer (CAT) Networks for Navigation, featuring a Causal Understanding Module to enhance the model's Environmental Understanding capability. Meanwhile, our method is devoid of task-specific inductive biases and can be trained in an End-to-End manner, which enhances the method's generalizability across various contexts. Empirical evaluations demonstrate that our methodology consistently surpasses benchmark performances across a spectrum of settings, tasks and simulation environments. Extensive ablation studies reveal that the performance gains can be attributed to the Causal Understanding Module, which demonstrates effectiveness and efficiency in both Reinforcement Learning and Supervised Learning settings.
Context-Conditional Navigation with a Learning-Based Terrain- and Robot-Aware Dynamics Model
In autonomous navigation settings, several quantities can be subject to variations. Terrain properties such as friction coefficients may vary over time depending on the location of the robot. Also, the dynamics of the robot may change due to, e.g., different payloads, changing the system's mass, or wear and tear, changing actuator gains or joint friction. An autonomous agent should thus be able to adapt to such variations. In this paper, we develop a novel probabilistic, terrain- and robot-aware forward dynamics model, termed TRADYN, which is able to adapt to the above-mentioned variations. It builds on recent advances in meta-learning forward dynamics models based on Neural Processes. We evaluate our method in a simulated 2D navigation setting with a unicycle-like robot and different terrain layouts with spatially varying friction coefficients. In our experiments, the proposed model exhibits lower prediction error for the task of long-horizon trajectory prediction, compared to non-adaptive ablation models. We also evaluate our model on the downstream task of navigation planning, which demonstrates improved performance in planning control-efficient paths by taking robot and terrain properties into account.
comment: © 2023 IEEE. Accepted for publication in European Conference on Mobile Robots (ECMR), 2023. Version including corrections (see p. 8)
Explore the Context: Optimal Data Collection for Context-Conditional Dynamics Models AISTATS
In this paper, we learn dynamics models for parametrized families of dynamical systems with varying properties. The dynamics models are formulated as stochastic processes conditioned on a latent context variable which is inferred from observed transitions of the respective system. The probabilistic formulation allows us to compute an action sequence which, for a limited number of environment interactions, optimally explores the given system within the parametrized family. This is achieved by steering the system through transitions being most informative for the context variable. We demonstrate the effectiveness of our method for exploration on a non-linear toy-problem and two well-known reinforcement learning environments.
comment: Accepted for publication at the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021, with supplementary material. Corrected version (see footnote on p. 6)
Latent-Conditioned Policy Gradient for Multi-Objective Deep Reinforcement Learning ICANN 2023
Sequential decision making in the real world often requires finding a good balance of conflicting objectives. In general, there exist a plethora of Pareto-optimal policies that embody different patterns of compromises between objectives, and it is technically challenging to obtain them exhaustively using deep neural networks. In this work, we propose a novel multi-objective reinforcement learning (MORL) algorithm that trains a single neural network via policy gradient to approximately obtain the entire Pareto set in a single run of training, without relying on linear scalarization of objectives. The proposed method works in both continuous and discrete action spaces with no design change of the policy network. Numerical experiments in benchmark environments demonstrate the practicality and efficacy of our approach in comparison to standard MORL baselines.
comment: 23 pages, 16 figures. Accepted at ICANN 2023
SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation
Autonomous self-improving robots that interact and improve with experience are key to the real-world deployment of robotic systems. In this paper, we propose an online learning method, SELFI, that leverages online robot experience to rapidly and efficiently fine-tune pre-trained control policies. SELFI applies online model-free reinforcement learning on top of offline model-based learning to bring out the best parts of both learning paradigms. Specifically, SELFI stabilizes the online learning process by incorporating the same model-based learning objective from offline pre-training into the Q-values learned with online model-free reinforcement learning. We evaluate SELFI in multiple real-world environments and report improvements in terms of collision avoidance, as well as more socially compliant behavior, measured by a human user study. SELFI enables us to quickly learn useful robotic behaviors with fewer human interventions, such as pre-emptive behavior around pedestrians, collision avoidance for small and transparent objects, and avoiding travel on uneven floor surfaces. We provide supplementary videos to demonstrate the performance of our fine-tuned policy on our project page.
comment: 20 pages, 12 figures, 2 tables, Conference on Robot Learning 2024
Multiagent Systems
Coalescing Force of Group Pressure: Consensus in Nonlinear Opinion Dynamics
This work extends the recent opinion dynamics model from Cheng et al., emphasizing the role of group pressure in consensus formation. We generalize the findings to incorporate social influence algorithms with general time-varying, opinion-dependent weights and multidimensional opinions, beyond bounded confidence dynamics. We demonstrate that, with uniformly positive conformity levels, group pressure consistently drives consensus and provide a tighter estimate for the convergence rate. Unlike previous models, the common public opinion in our framework can assume arbitrary forms within the convex hull of current opinions, offering flexibility applicable to real-world scenarios such as opinion polls with random participant selection. This analysis provides deeper insights into how group pressure mechanisms foster consensus under diverse conditions.
Large Language Models can Achieve Social Balance
Social balance is a concept in sociology which states that if every three individuals in a population achieve certain structures of positive or negative interactions, then the whole population ends up in one faction of positive interactions or divided between two or more antagonistic factions. In this paper, we consider a group of interacting large language models (LLMs) and study how, after continuous interactions, they can achieve social balance. Across three different LLM models, we found that social balance depends on (i) whether interactions are updated based on "relationships", "appraisals", or "opinions"; (ii) whether agents update their interactions based on homophily or influence from their peers; and (iii) the number of simultaneous interactions the LLMs consider. When social balance is achieved, its particular structure of positive or negative interactions depends on these three conditions and differs across LLM models and sizes. The stability of interactions and the justification for their update also vary across models. Thus, social balance is driven by the pre-training and alignment particular to each LLM model.
YOLO-MARL: You Only LLM Once for Multi-agent Reinforcement Learning
Advancements in deep multi-agent reinforcement learning (MARL) have positioned it as a promising approach for decision-making in cooperative games. However, it remains challenging for MARL agents to learn cooperative strategies in some game environments. Recently, large language models (LLMs) have demonstrated emergent reasoning capabilities, making them promising candidates for enhancing coordination among the agents. However, due to the model size of LLMs, it can be expensive to frequently query LLMs for the actions that agents can take. In this work, we propose You Only LLM Once for MARL (YOLO-MARL), a novel framework that leverages the high-level task planning capabilities of LLMs to improve the policy learning process of multiple agents in cooperative games. Notably, for each game environment, YOLO-MARL requires only a one-time interaction with LLMs in the proposed strategy generation, state interpretation, and planning function generation modules, before the MARL policy training process. This avoids the ongoing costs and computational time associated with frequent LLM API calls during training. Moreover, the trained decentralized normal-sized neural network-based policies operate independently of the LLM. We evaluate our method across three different environments and demonstrate that YOLO-MARL outperforms traditional MARL algorithms.
Systems and Control (CS)
Vehicle-in-Virtual-Environment Method for ADAS and Connected and Automated Driving Function Development/Demonstration/Evaluation
The current approach to developing new Advanced Driver Assistance System (ADAS) and Connected and Automated Driving (CAD) functions involves a significant amount of public road testing. This is inefficient because of the number of miles that must be driven before rare and extreme events occur, which also makes it very costly, and it is unsafe because other road users become involuntary test subjects. A method called VehicleInVirtualEnvironment (VVE) was recently introduced as a solution to this problem, enabling safe, efficient, and repeatable development, demonstration, and evaluation of ADAS and CAD functions. During VVE, the vehicle is operated in a large, empty, and flat area while its localization and perception sensor data are fed from the virtual environment, with other traffic and rare and extreme events generated as needed. The virtual environment can be easily configured and modified to construct different testing scenarios on demand. This paper focuses on the VVE approach and introduces the coordinate transformations needed to sync pose (location and orientation) in the virtual and physical worlds, as well as the handling of localization and perception sensor data, using a highly realistic 3D simulation model of a recent autonomous shuttle deployment site in Columbus, Ohio as the virtual world. As a further example that uses multiple actors, the paper presents the use of VVE for VehicleToVRU communication-based Vulnerable Road User (VRU) safety, with VVE experiments involving real pedestrian(s) conducted in a safe and repeatable manner. VVE experiments are used to demonstrate the efficacy of the method.
comment: 8 pages, 16 figures
Coalescing Force of Group Pressure: Consensus in Nonlinear Opinion Dynamics
This work extends the recent opinion dynamics model from Cheng et al., emphasizing the role of group pressure in consensus formation. We generalize the findings to incorporate social influence algorithms with general time-varying, opinion-dependent weights and multidimensional opinions, beyond bounded confidence dynamics. We demonstrate that, with uniformly positive conformity levels, group pressure consistently drives consensus and provide a tighter estimate for the convergence rate. Unlike previous models, the common public opinion in our framework can assume arbitrary forms within the convex hull of current opinions, offering flexibility applicable to real-world scenarios such as opinion polls with random participant selection. This analysis provides deeper insights into how group pressure mechanisms foster consensus under diverse conditions.
Decentralized Equitable Energy Access in Energy Communities
We address the issue of equitable energy access within an energy community consisting of members with diverse socioeconomic backgrounds, including varying income levels and differing capacities to access distributed energy resources such as solar power and storage systems. While optimal energy consumption scheduling is well-studied, integrating equity into decentralized real-time energy access remains under-explored. This paper formulates Equity-regarding Welfare Maximization (EqWM)--a welfare optimization energy scheduling subject to equity constraints. We further develop a decentralized implementation (D-EqWM) as a bi-level optimization, where a non-profit operator designs a community pricing policy aimed at maximizing overall welfare, subject to constraints that ensure equitable access. Community members, in turn, optimize their individual consumption based on these prices. We present the optimal pricing policy along with its key properties.
Compositional Diffusion Models for Powered Descent Trajectory Generation with Flexible Constraints
This work introduces TrajDiffuser, a compositional diffusion-based flexible and concurrent trajectory generator for 6-degree-of-freedom powered descent guidance. TrajDiffuser is a statistical model that learns the multi-modal distributions of a dataset of simulated optimal trajectories, each subject to only one or a few constraints that may vary for different trajectories. During inference, the trajectory is generated simultaneously over time, providing stable long-horizon planning, and constraints can be composed together, increasing the model's generalizability and decreasing the training data required. The generated trajectory is then used to initialize an optimizer, increasing its robustness and speed.
comment: Full manuscript submitted to IEEE Aerospace 2025 on 4-Oct-2024
A Two-Stage Optimization Method for Real-Time Parameterization of PV-Farm Digital Twin
Digital twins (DTs) are high-fidelity virtual models of physical systems. This paper details a novel two-stage optimization method for real-time parameterization of photovoltaic digital twins (PVDTs) using field measurements. Initially, the method estimates equivalent irradiance from PV power, voltage, and current data, eliminating the need for direct irradiance sensors. This is crucial for tuning the DT's parameters to actual environmental conditions, thereby improving power prediction accuracy. The second stage focuses on refining these parameters by minimizing discrepancies between measured and predicted outputs. This optimization utilizes the estimated equivalent irradiance as a model input, maintaining synchronization with real-world conditions. Parameter updates are event-triggered, launched when deviations exceed predefined thresholds. This strategy optimizes prediction accuracy and manages communication loads efficiently. Validated with extensive data from a PV farm, this approach outperforms existing methodologies in predictive accuracy and operational efficiency, significantly improving the performance of DTs in real-time grid operations.
comment: 11 pages, 12 figures, 4 tables
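The event-triggered update rule described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `event_triggered_updates`, the absolute-deviation test, and the sample values are all assumptions made for illustration.

```python
def event_triggered_updates(measured, predicted, threshold):
    """Return the sample indices at which a parameter update would be launched.

    Assumption: an update is triggered when the absolute deviation between
    the measured and predicted output exceeds a fixed threshold.
    """
    return [
        i
        for i, (m, p) in enumerate(zip(measured, predicted))
        if abs(m - p) > threshold
    ]


# Hypothetical power readings (kW) and digital-twin predictions.
measured = [100.0, 98.0, 90.0, 101.0]
predicted = [100.0, 99.0, 100.0, 100.5]
triggers = event_triggered_updates(measured, predicted, threshold=5.0)
print(triggers)
```

Only the samples that breach the threshold cause an update, which is how an event-triggered scheme can limit communication compared with updating at every sample.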
Trajectory elongation strategies with minimum curvature discontinuities for a Dubins vehicle
In this paper, we present strategies for designing curvature-bounded trajectories of any desired length between any two given oriented points. The proposed trajectory is constructed by the concatenation of three circular arcs of varying radii. Such a trajectory guarantees a complete coverage of the maximum set of reachable lengths while minimising the number of changeover points in the trajectory to a maximum of two under all scenarios. Additionally, by using the notion of internally tangent circles, we expand the set of Circle-Circle-Circle trajectories to eight kinds, consisting of {LLL, LLR, LRR, LRL, RRL, RLL, RLR, RRR} paths. The paper presents a mathematical formulation of the proposed trajectory and the conditions for the existence and classification of each kind of trajectory. We also analyse the variation of the length of the trajectory using suitable elongation strategies and derive the set of reachable lengths for all pairs of oriented points. Finally, the results of this paper are illustrated using numerical simulations.
comment: Preprint submitted to Automatica
Development of a Mouse for Individuals Without Upper Limbs Using Arduino Technology
This project focuses on the design and construction of a prototype mouse based on the Arduino platform, intended for individuals without upper limbs to use computers more effectively. The prototype comprises a microcontroller responsible for processing signals from the MPU-6050 sensor, used as a reference for cursor position, and foot-operated buttons for right and left-click functions. Its design enables cursor control through head movements, providing users with an easy and intuitive way to interact with the computer's graphical interface. Feasibility testing was conducted through experimental trials, resulting in ideal accuracy and precision. These trials indicate that the device is viable for use in individuals without upper limbs.
comment: 6 pages, 9 figures
Kalman Filter Applied To A Differential Robot
This document studies the problem of estimating the location of a robot and the trajectory it must follow. It focuses on applying the Kalman filter to achieve location and trajectory estimation in an autonomous mobile differential robot. The experimental data were obtained from tests using two incremental encoders that are part of the construction of the differential robot. Data are transmitted from a PC, where the control is carried out with the Matlab/Simulink software. The results are presented as graphs showing the path followed by the robot under PI control and the Kalman filter estimate in a real system.
comment: 7 pages, 13 figures
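For readers unfamiliar with the Kalman filter mentioned in the abstract above, one predict/update cycle for a scalar state can be sketched as follows. This is a generic textbook form, not the paper's model: the 1-D position state, the encoder-derived displacement `u`, the separate position measurement `z`, and the noise variances `q` and `r` are assumptions chosen for illustration.

```python
def kalman_step(x, p, u, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    u    : displacement since the last step (assumed to come from encoders)
    z    : noisy position measurement (hypothetical sensor)
    q, r : process and measurement noise variances
    """
    # Predict: propagate the estimate with the odometry increment.
    x_pred = x + u
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


x_est, p_est = kalman_step(x=0.0, p=1.0, u=1.0, z=1.2, q=0.0, r=1.0)
print(x_est, p_est)
```

With equal prediction and measurement variances the gain is 0.5, so the new estimate lies halfway between the predicted position and the measurement.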
Compositional Planning for Logically Constrained Multi-Agent Markov Decision Processes
Designing control policies for large, distributed systems is challenging, especially in the context of critical, temporal logic based specifications (e.g., safety) that must be met with high probability. Compositional methods for such problems are needed for scalability, yet relying on worst-case assumptions for decomposition tends to be overly conservative. In this work, we use the framework of Constrained Markov Decision Processes (CMDPs) to provide an assume-guarantee based decomposition for synthesizing decentralized control policies, subject to logical constraints in a multi-agent setting. The returned policies are guaranteed to satisfy the constraints with high probability and provide a lower bound on the achieved objective reward. We empirically find the returned policies to achieve near-optimal rewards while enjoying an order of magnitude reduction in problem size and execution time.
comment: 6 pages, 1 figure, accepted for publication at the 63rd IEEE Conf. on Decision and Control (2024)
Smart Air Quality Monitoring for Automotive Workshop Environments
Air quality monitoring in automotive workshops is crucial for occupational health and regulatory compliance. This study presents the development of an environmental monitoring system based on Internet of Things (IoT) and Artificial Intelligence (AI) technologies. DHT-11 and MQ-135 sensors were employed to measure temperature, humidity, and toxic gas concentrations, with real-time data transmission to the ThingSpeak platform via the MQTT protocol. Machine learning algorithms, including Linear Regression, Decision Trees, and SVM, were applied to analyze the data and compute an air salubrity index based on Gaussian functions. The system proved effective in detecting pollutant peaks and issuing automatic alerts, significantly improving worker health and safety. Workshops that implemented the system reported greater regulatory compliance and reduced occupational risks. The study concludes that the combination of IoT and AI provides an efficient and replicable solution for environmental monitoring in industrial settings.
comment: 9 pages
Predicting DC-Link Capacitor Current Ripple in AC-DC Rectifier Circuits Using Fine-Tuned Large Language Models
Foundational Large Language Models (LLMs) such as GPT-3.5-turbo allow users to refine the model based on newer information, known as "fine-tuning". This paper leverages this ability to analyze AC-DC converter behaviors, focusing on the ripple current in DC-link capacitors. Capacitors degrade faster under high ripple currents, complicating life monitoring and necessitating preemptive replacements. Using minimally invasive, noisy hardware measurements from a full bridge rectifier and 90W Power Factor Correction (PFC) boost converter, an LLM-based model to predict ripple content in DC-link currents was developed, which demonstrated the LLMs' ability for near-accurate predictions. This study also highlights data requirements for precise nonlinear power electronic circuit parameter predictions to predict component degradation without any additional sensors. Furthermore, the proposed framework could be extended to any non-linear function mapping problem as well as estimating the capacitor Equivalent Series Resistance (ESR).
comment: 6 pages, 12 figures, conference
Algorithm for globally identifiable reparametrizations of ODEs
Structural global parameter identifiability indicates whether one can determine a parameter's value in an ODE model from given inputs and outputs. If a given model has parameters for which there is exactly one value, such parameters are called globally identifiable. Given an ODE model involving parameters that are not globally identifiable, we first transform the system into one with locally identifiable parameters. As a main contribution of this paper, we then present a procedure for replacing, if possible, the ODE model with an equivalent one that has globally identifiable parameters. We first derive this as an algorithm for one-dimensional ODE models and then reuse this approach for higher-dimensional models.
Symbolic-numeric algorithm for parameter estimation in discrete-time models with $\exp$
Dynamic models describe phenomena across scientific disciplines, yet to make these models useful in application the unknown parameter values of the models must be determined. Discrete-time dynamic models are widely used to model biological processes, but it is often difficult to determine these parameters. In this paper, we propose a symbolic-numeric approach for parameter estimation in discrete-time models that involve univariate non-algebraic (locally) analytic functions such as exp. We illustrate the performance (precision) of our approach by applying our approach to two archetypal discrete-time models in biology (the flour beetle 'LPA' model and discrete Lotka-Volterra competition model). Unlike optimization-based methods, our algorithm guarantees to find all solutions of the parameter values up to a specified precision given time-series data for the measured variables provided that there are finitely many parameter values that fit the data and that the used polynomial system solver can find all roots of the associated polynomial system with interval coefficients.
Optimal control of port-Hamiltonian systems: energy, entropy, and exergy
We consider irreversible and coupled reversible-irreversible nonlinear port-Hamiltonian systems and the respective sets of thermodynamic equilibria. In particular, we are concerned with optimal state transitions and output stabilization on finite-time horizons. We analyze a class of optimal control problems, where the performance functional can be interpreted as a linear combination of energy supply, entropy generation, or exergy supply. Our results establish the integral turnpike property towards the set of thermodynamic equilibria providing a rigorous connection of optimal system trajectories to optimal steady states. Throughout the paper, we illustrate our findings by means of two examples: a network of heat exchangers and a gas-piston system.
comment: 24 pages, 5 figures
Explore the Context: Optimal Data Collection for Context-Conditional Dynamics Models AISTATS
In this paper, we learn dynamics models for parametrized families of dynamical systems with varying properties. The dynamics models are formulated as stochastic processes conditioned on a latent context variable which is inferred from observed transitions of the respective system. The probabilistic formulation allows us to compute an action sequence which, for a limited number of environment interactions, optimally explores the given system within the parametrized family. This is achieved by steering the system through transitions being most informative for the context variable. We demonstrate the effectiveness of our method for exploration on a non-linear toy-problem and two well-known reinforcement learning environments.
comment: Accepted for publication at the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021, with supplementary material. Corrected version (see footnote on p. 6)
Optimal Covariance Steering for Discrete-Time Linear Stochastic Systems
In this paper, we study the optimal control problem for steering the state covariance of a discrete-time linear stochastic system over a finite time horizon. First, we establish the existence and uniqueness of the optimal control law for a quadratic cost function. Then, we show the separation of the optimal mean and the covariance steering problems. We also develop efficient computational methods to solve for the optimal control law, which is identified as the solution to a semi-definite program. The effectiveness of the proposed approach is demonstrated through numerical examples. In the process, we also obtain some novel theoretical results for a matrix Riccati difference equation, which may be of independent interest.
Systems and Control (EESS)
Vehicle-in-Virtual-Environment Method for ADAS and Connected and Automated Driving Function Development/Demonstration/Evaluation
The current approach for new Advanced Driver Assistance System (ADAS) and Connected and Automated Driving (CAD) function development involves a significant amount of public road testing, which is inefficient because of the number of miles that need to be driven for rare and extreme events to take place, and is therefore very costly, and which is also unsafe because the rest of the road users become involuntary test subjects. A new development, evaluation and demonstration method for safe, efficient, and repeatable development, demonstration and evaluation of ADAS and CAD functions called VehicleInVirtualEnvironment (VVE) was recently introduced as a solution to this problem. The vehicle is operated in a large, empty, and flat area during VVE while its localization and perception sensor data is fed from the virtual environment with other traffic and rare and extreme events being generated as needed. The virtual environment can be easily configured and modified to construct different testing scenarios on demand. This paper focuses on the VVE approach and introduces the coordinate transformations needed to sync pose (location and orientation) in the virtual and physical worlds and handling of localization and perception sensor data using the highly realistic 3D simulation model of a recent autonomous shuttle deployment site in Columbus, Ohio as the virtual world. As a further example that uses multiple actors, the use of VVE for VehicleToVRU communication based Vulnerable Road User (VRU) safety is presented in the paper using VVE experiments and real pedestrian(s) in a safe and repeatable manner. VVE experiments are used to demonstrate the efficacy of the method.
comment: 8 pages, 16 figures
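The abstract above mentions coordinate transformations that sync pose between the physical and virtual worlds. A minimal planar sketch of such a mapping is given below. It assumes a 2-D rigid transform (a rotation `rot` plus a translation `tx`, `ty`), which is a simplification chosen for illustration and not the paper's actual formulation; the function name and the numeric values are hypothetical.

```python
import math


def physical_to_virtual(x, y, yaw, tx, ty, rot):
    """Map a planar pose from the physical test area into the virtual world.

    Assumption: the two frames differ by a rotation `rot` (radians) about
    the origin followed by a translation (tx, ty).
    """
    c, s = math.cos(rot), math.sin(rot)
    x_v = c * x - s * y + tx
    y_v = s * x + c * y + ty
    # Wrap the heading into [-pi, pi).
    yaw_v = (yaw + rot + math.pi) % (2.0 * math.pi) - math.pi
    return x_v, y_v, yaw_v


# Hypothetical example: vehicle 1 m along the physical x-axis, heading 0,
# virtual frame rotated by 90 degrees and offset by (10, 5).
pose_v = physical_to_virtual(1.0, 0.0, 0.0, tx=10.0, ty=5.0, rot=math.pi / 2)
print(pose_v)
```

Applying the same transform to every localization sample keeps the virtual vehicle aligned with the real one as it moves through the empty test area.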
Coalescing Force of Group Pressure: Consensus in Nonlinear Opinion Dynamics
This work extends the recent opinion dynamics model from Cheng et al., emphasizing the role of group pressure in consensus formation. We generalize the findings to incorporate social influence algorithms with general time-varying, opinion-dependent weights and multidimensional opinions, beyond bounded confidence dynamics. We demonstrate that, with uniformly positive conformity levels, group pressure consistently drives consensus and provide a tighter estimate for the convergence rate. Unlike previous models, the common public opinion in our framework can assume arbitrary forms within the convex hull of current opinions, offering flexibility applicable to real-world scenarios such as opinion polls with random participant selection. This analysis provides deeper insights into how group pressure mechanisms foster consensus under diverse conditions.
Decentralized Equitable Energy Access in Energy Communities
We address the issue of equitable energy access within an energy community consisting of members with diverse socioeconomic backgrounds, including varying income levels and differing capacities to access distributed energy resources such as solar power and storage systems. While optimal energy consumption scheduling is well-studied, integrating equity into decentralized real-time energy access remains under-explored. This paper formulates Equity-regarding Welfare Maximization (EqWM)--a welfare optimization energy scheduling subject to equity constraints. We further develop a decentralized implementation (D-EqWM) as a bi-level optimization, where a non-profit operator designs a community pricing policy aimed at maximizing overall welfare, subject to constraints that ensure equitable access. Community members, in turn, optimize their individual consumption based on these prices. We present the optimal pricing policy along with its key properties.
Robotics
Learning Humanoid Locomotion over Challenging Terrain
Humanoid robots can, in principle, use their legs to go almost anywhere. Developing controllers capable of traversing diverse terrains, however, remains a considerable challenge. Classical controllers are hard to generalize broadly while the learning-based methods have primarily focused on gentle terrains. Here, we present a learning-based approach for blind humanoid locomotion capable of traversing challenging natural and man-made terrain. Our method uses a transformer model to predict the next action based on the history of proprioceptive observations and actions. The model is first pre-trained on a dataset of flat-ground trajectories with sequence modeling, and then fine-tuned on uneven terrain using reinforcement learning. We evaluate our model on a real humanoid robot across a variety of terrains, including rough, deformable, and sloped surfaces. The model demonstrates robust performance, in-context adaptation, and emergent terrain representations. In real-world case studies, our humanoid robot successfully traversed over 4 miles of hiking trails in Berkeley and climbed some of the steepest streets in San Francisco.
comment: Project page: https://humanoid-challenging-terrain.github.io
GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs
Robotic simulation today remains challenging to scale up due to the human effort required to create diverse simulation tasks and scenes. Simulation-trained policies also face scalability issues as many sim-to-real methods focus on a single task. To address these challenges, this work proposes GenSim2, a scalable framework that leverages coding LLMs with multi-modal and reasoning capabilities for complex and realistic simulation task creation, including long-horizon tasks with articulated objects. To automatically generate demonstration data for these tasks at scale, we propose planning and RL solvers that generalize within object categories. The pipeline can generate data for up to 100 articulated tasks with 200 objects and reduce the required human effort. To utilize such data, we propose an effective multi-task language-conditioned policy architecture, dubbed proprioceptive point-cloud transformer (PPT), that learns from the generated demonstrations and exhibits strong sim-to-real zero-shot transfer. Combining the proposed pipeline and the policy architecture, we show a promising use of GenSim2: the generated data can be used for zero-shot transfer or for co-training with real-world collected data, which enhances the policy performance by 20% compared with training exclusively on limited real data.
comment: CoRL 2024. Project website: https://gensim2.github.io/
LeLaN: Learning A Language-Conditioned Navigation Policy from In-the-Wild Videos
The world is filled with a wide variety of objects. For robots to be useful, they need the ability to find arbitrary objects described by people. In this paper, we present LeLaN (Learning Language-conditioned Navigation policy), a novel approach that consumes unlabeled, action-free egocentric data to learn scalable, language-conditioned object navigation. Our framework, LeLaN, leverages the semantic knowledge of large vision-language models, as well as robotic foundation models, to label in-the-wild data from a variety of indoor and outdoor environments. We label over 130 hours of data collected in real-world indoor and outdoor environments, including robot observations, YouTube video tours, and human walking data. Extensive experiments with over 1000 real-world trials show that our approach enables training a policy from unlabeled action-free videos that outperforms state-of-the-art robot navigation methods, while being capable of inference at 4 times their speed on edge compute. We open-source our models and datasets and provide supplementary videos on our project page (https://learning-language-navigation.github.io/).
comment: 23 pages, 9 figures, 5 tables, Conference on Robot Learning 2024
Enhancing Autonomous Navigation by Imaging Hidden Objects using Single-Photon LiDAR
Robust autonomous navigation in environments with limited visibility remains a critical challenge in robotics. We present a novel approach that leverages Non-Line-of-Sight (NLOS) sensing using single-photon LiDAR to improve visibility and enhance autonomous navigation. Our method enables mobile robots to "see around corners" by utilizing multi-bounce light information, effectively expanding their perceptual range without additional infrastructure. We propose a three-module pipeline: (1) Sensing, which captures multi-bounce histograms using SPAD-based LiDAR; (2) Perception, which estimates occupancy maps of hidden regions from these histograms using a convolutional neural network; and (3) Control, which allows a robot to follow safe paths based on the estimated occupancy. We evaluate our approach through simulations and real-world experiments on a mobile robot navigating an L-shaped corridor with hidden obstacles. Our work represents the first experimental demonstration of NLOS imaging for autonomous navigation, paving the way for safer and more efficient robotic systems operating in complex environments. We also contribute a novel dynamics-integrated transient rendering framework for simulating NLOS scenarios, facilitating future research in this domain.
comment: Project webpage: https://github.com/camera-culture/nlos-aided-autonomous-navigation
Loading Ceramics: Visualising Possibilities of Robotics in Ceramics
This article introduces an artistic research project that utilises artist-in-residency and exhibition as methods for exploring the possibilities of robotic 3D printing and ceramics. The interdisciplinary project unites artists and architects to collaborate on a proposed curatorial concept and Do-It-With-Others (DIWO) technological development. Constraints include material, specifically local clay, production technique, namely 3D printing with a robotic arm, and kiln size, as well as an exhibition concept that is further elaborated in the next chapter. The pictorial presents four projects as case studies demonstrating how the creatives integrate these constraints into their processes. This integration leads to the subsequent refinement and customization of the robotic-ceramics interface, aligning with the practitioners' requirements through software development. The project's focus extends beyond artistic outcomes, aiming also to advance the pipeline of 3D robotic printing in clay, employing a digitally controlled material press that has been developed in-house, with its functionality refined through practice.
HMT-Grasp: A Hybrid Mamba-Transformer Approach for Robot Grasping in Cluttered Environments
Robot grasping, whether handling isolated objects, cluttered items, or stacked objects, plays a critical role in industrial and service applications. However, current visual grasp detection methods based on Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) struggle to adapt across various grasping scenarios due to the imbalance between local and global feature extraction. In this paper, we propose a novel hybrid Mamba-Transformer approach to address these challenges. Our method improves robotic visual grasping by effectively capturing both global and local information through the integration of Vision Mamba and parallel convolutional-transformer blocks. This hybrid architecture significantly improves adaptability, precision, and flexibility across various robotic tasks. To ensure a fair evaluation, we conducted extensive experiments on the Cornell, Jacquard, and OCID-Grasp datasets, ranging from simple to complex scenarios. Additionally, we performed both simulated and real-world robotic experiments. The results demonstrate that our method not only surpasses state-of-the-art techniques on standard grasping datasets but also delivers strong performance in both simulation and real-world robot applications.
GAP-RL: Grasps As Points for RL Towards Dynamic Object Grasping
Dynamic grasping of moving objects in complex, continuous motion scenarios remains challenging. Reinforcement Learning (RL) has been applied in various robotic manipulation tasks, benefiting from its closed-loop property. However, existing RL-based methods do not fully explore the potential for enhancing visual representations. In this letter, we propose a novel framework called Grasps As Points for RL (GAP-RL) to effectively and reliably grasp moving objects. By implementing a fast region-based grasp detector, we build a Grasp Encoder by transforming 6D grasp poses into Gaussian points and extracting grasp features as a higher-level abstraction than the original object point features. Additionally, we develop a Graspable Region Explorer for real-world deployment, which searches for consistent graspable regions, enabling smoother grasp generation and stable policy execution. To assess the performance fairly, we construct a simulated dynamic grasping benchmark involving objects with various complex motions. Experiment results demonstrate that our method effectively generalizes to novel objects and unseen dynamic motions compared to other baselines. Real-world experiments further validate the framework's sim-to-real transferability.
comment: Accepted by RA-L for further publication, may be unavailable or updated in the future
MO-DDN: A Coarse-to-Fine Attribute-based Exploration Agent for Multi-object Demand-driven Navigation NeurIPS 2024
Satisfying daily demands is a fundamental aspect of human life. With the advancement of embodied AI, robots are increasingly capable of satisfying human demands. Demand-driven navigation (DDN) is a task in which an agent must locate an object to satisfy a specified demand instruction, such as "I am thirsty." Previous studies typically assume that each demand instruction requires only one object to be fulfilled and do not consider individual preferences. However, realistic human demands may involve multiple objects. In this paper, we introduce the Multi-object Demand-driven Navigation (MO-DDN) benchmark, which addresses these nuanced aspects, including multi-object search and personal preferences, thus making the MO-DDN task more reflective of real-life scenarios compared to DDN. Building upon previous work, we employ the concept of "attribute" to tackle this new task. However, instead of solely relying on attribute features in an end-to-end manner like DDN, we propose a modular method that involves constructing a coarse-to-fine attribute-based exploration agent (C2FAgent). Our experimental results illustrate that this coarse-to-fine exploration strategy capitalizes on the advantages of attributes at various decision-making levels, resulting in superior performance compared to baseline methods. Code and video can be found at https://sites.google.com/view/moddn.
comment: Accepted at NeurIPS 2024; 39 pages, 11 figures
STREAMS: An Assistive Multimodal AI Framework for Empowering Biosignal Based Robotic Controls
End-effector based assistive robots face persistent challenges in generating smooth and robust trajectories when controlled by noisy and unreliable human biosignals such as muscle activities and brainwaves. The produced endpoint trajectories are often too jerky and imprecise to perform complex tasks such as stable robotic grasping. We propose STREAMS (Self-Training Robotic End-to-end Adaptive Multimodal Shared autonomy), a novel framework that leverages deep reinforcement learning to tackle this challenge in biosignal based robotic control systems. STREAMS blends environmental information and synthetic user input into a Deep Q Learning Network (DQN) pipeline for an interactive end-to-end and self-training mechanism to produce smooth trajectories for the control of end-effector based robots. The proposed framework achieved a success rate of 98% in simulation with dynamic target estimation and acquisition without any pre-existing datasets. In a zero-shot sim-to-real user study with five participants controlling a physical robotic arm with noisy head movements, STREAMS (as an assistive mode) demonstrated significant improvements in trajectory stabilization, user satisfaction, and task performance, with a success rate of 83% compared to 44% in manual mode without any task support. STREAMS seeks to improve biosignal based assistive robotic controls by offering an interactive, end-to-end solution that stabilizes end-effector trajectories, enhancing task performance and accuracy.
S2C2A: A Flexible Task Space Planning and Control Strategy for Modular Soft Robot Arms
Modular soft robot arms (MSRAs) are composed of multiple independent modules connected in a sequence. Due to their modular structure and high degrees of freedom (DOFs), these modules can simultaneously bend at different angles in various directions, enabling complex deformation. This capability allows MSRAs to perform more intricate tasks than single module robots. However, the modular structure also induces challenges in accurate planning, modeling, and control. Nonlinearity, hysteresis, and gravity complicate the physical model, while the modular structure and increased DOFs further lead to accumulative errors along the sequence. To address these challenges, we propose a flexible task space planning and control strategy for MSRAs, named S2C2A (State to Configuration to Action). Our approach formulates an optimization problem, S2C (State to Configuration planning), which integrates various loss functions and a forward MSRA model to generate configuration trajectories based on target MSRA states. Given the model complexity, we leverage a biLSTM network as the forward model. Subsequently, a configuration controller C2A (Configuration to Action control) is implemented to follow the planned configuration trajectories, leveraging only inaccurate internal sensing feedback. Both a biLSTM network and a physical model are utilized for configuration control. We validated our strategy using a cable-driven MSRA, demonstrating its ability to perform diverse offline tasks such as position control, orientation control, and obstacle avoidance. Furthermore, our strategy endows MSRA with online interaction capability with targets and obstacles. Future work will focus on addressing MSRA challenges, such as developing more accurate physical models and reducing configuration estimation errors along the module sequence.
comment: 13 pages, 14 figures, 4 tables
A Compact, Low-cost Force and Torque Sensor for Robot Fingers with LED-based Displacement Sensing
Force/torque sensing is an important modality for robotic manipulation, but commodity solutions, generally developed with other applications in mind, do not generally fit the needs of robot hands. This paper introduces a novel method for six-axis force/torque sensing, using LEDs to sense the displacement between two plates connected by a transparent elastomer. Our method allows for finger-size packaging with no amplification electronics, low cost manufacturing, and easy integration into a complete hand. On test forces between 0-2 N, our prototype sensor exhibits a mean error between 0.05 and 0.07 N across the three force directions, suggesting future applicability to fine manipulation tasks.
Collision-Aware Traversability Analysis for Autonomous Vehicles in the Context of Agricultural Robotics
In this paper, we introduce a novel method for safe navigation in agricultural robotics. As global environmental challenges intensify, robotics offers a powerful solution to reduce chemical usage while meeting the increasing demands for food production. However, significant challenges remain in ensuring the autonomy and resilience of robots operating in unstructured agricultural environments. Obstacles such as crops and tall grass, which are deformable, must be identified as safely traversable, compared to rigid obstacles. To address this, we propose a new traversability analysis method based on a 3D spectral map reconstructed using a LIDAR and a multispectral camera. This approach enables the robot to distinguish between safe and unsafe collisions with deformable obstacles. We perform a comprehensive evaluation of multispectral metrics for vegetation detection and incorporate these metrics into an augmented environmental map. Utilizing this map, we compute a physics-based traversability metric that accounts for the robot's weight and size, ensuring safe navigation over deformable obstacles.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
A Service Robot in the Wild: Analysis of Users Intentions, Robot Behaviors, and Their Impact on the Interaction
We consider a service robot that offers chocolate treats to people passing in its proximity: it can predict in advance a person's intention to interact and actuate an "offering" gesture, subtly extending the tray of chocolates towards a given target. We run the system for more than 5 hours across 3 days and two different crowded public locations; the system implements three possible behaviors that are randomly toggled every few minutes: passive (i.e., never performing the offering gesture); or active, triggered by either a naive distance-based rule, or a smart approach that relies on various behavioral cues of the user. We collect a real-world dataset that includes information on 1777 users with several spontaneous human-robot interactions and study the influence of robot actions on people's behavior. Our comprehensive analysis suggests that users are more prone to engage with the robot when it proactively starts the interaction. We release the dataset and provide insights to make our work reproducible for the community. Also, we report qualitative observations collected during the acquisition campaign and identify future challenges and research directions in the domain of social human-robot interaction.
Dynamic Curvature Constrained Path Planning
Effective path planning is a pivotal challenge across various domains, from robotics to logistics and beyond. This research is centred on the development and evaluation of the Dynamic Curvature-Constrained Path Planning Algorithm (DCCPPA) within two-dimensional space. DCCPPA is designed to navigate constrained environments, optimising path solutions while accommodating curvature constraints. The study goes beyond algorithm development and conducts a comparative analysis with two established path planning methodologies: Rapidly Exploring Random Trees (RRT) and Probabilistic Roadmaps (PRM). These comparisons provide insights into the performance and adaptability of path planning algorithms across a range of applications. This research underscores the versatility of DCCPPA as a path planning algorithm tailored for 2D space, demonstrating its potential for addressing real-world path planning challenges across various domains. Index Terms: Path Planning, PRM, RRT, Optimal Path, 2D Path Planning.
comment: 6 Pages, 3 figures, 3 tables
Latent Action Priors From a Single Gait Cycle Demonstration for Online Imitation Learning ICRA 2025
Deep Reinforcement Learning (DRL) in simulation often results in brittle and unrealistic learning outcomes. To push the agent towards more desirable solutions, prior information can be injected in the learning process through, for instance, reward shaping, expert data, or motion primitives. We propose an additional inductive bias for robot learning: latent actions learned from expert demonstration as priors in the action space. We show that these action priors can be learned from only a single open-loop gait cycle using a simple autoencoder. Using these latent action priors combined with established style rewards for imitation in DRL achieves above expert demonstration level of performance and leads to more desirable gaits. Further, action priors substantially improve the performance on transfer tasks, even leading to gait transitions for higher target speeds. Videos and code are available at https://sites.google.com/view/latent-action-priors.
comment: Submitted to ICRA 2025
Sampling-Based Model Predictive Control for Volumetric Ablation in Robotic Laser Surgery ICRA 2025
Laser-based surgical ablation relies heavily on surgeon involvement, restricting precision to the limits of human error. The interaction between laser and tissue is governed by various laser parameters that control the laser irradiance on the tissue, including the laser power, distance, spot size, orientation, and exposure time. This complex interaction lends itself to robotic automation, allowing the surgeon to focus on high-level tasks, such as choosing the region and method of ablation, while the lower-level ablation plan can be handled autonomously. This paper describes a sampling-based model predictive control (MPC) scheme to plan ablation sequences for arbitrary tissue volumes. Using a steady-state point ablation model to simulate a single laser-tissue interaction, a random search technique explores the reachable state space while preserving sensitive tissue regions. The sampled MPC strategy provides an ablation sequence that accounts for parameter uncertainty without violating constraints, such as avoiding critical nerve bundles or blood vessels.
comment: 7 pages, 6 figures, submitted to IEEE ICRA 2025
Analysis and Detection of Differences in Spoken User Behaviors between Autonomous and Wizard-of-Oz Systems
This study examined users' behavioral differences in a large corpus of Japanese human-robot interactions, comparing interactions between a tele-operated robot and an autonomous dialogue system. We analyzed user spoken behaviors in both attentive listening and job interview dialogue scenarios. Results revealed significant differences in metrics such as speech length, speaking rate, fillers, backchannels, disfluencies, and laughter between operator-controlled and autonomous conditions. Furthermore, we developed predictive models to distinguish between operator and autonomous system conditions. Our models demonstrated higher accuracy and precision compared to the baseline model, with several models also achieving a higher F1 score than the baseline.
comment: Accepted and will be presented at the 27th conference of the Oriental COCOSDA (O-COCOSDA 2024)
Autoregressive Action Sequence Learning for Robotic Manipulation
Autoregressive models have demonstrated remarkable success in natural language processing. In this work, we design a simple yet effective autoregressive architecture for robotic manipulation tasks. We propose the Chunking Causal Transformer (CCT), which extends the next-single-token prediction of causal transformers to support multi-token prediction in a single pass. Further, we design a novel attention interleaving strategy that allows CCT to be trained efficiently with teacher-forcing. Based on CCT, we propose the Autoregressive Policy (ARP) model, which learns to generate action sequences autoregressively. We find that action sequence learning enables better leverage of the underlying causal relationships in robotic tasks. We evaluate ARP across diverse robotic manipulation environments, including Push-T, ALOHA, and RLBench, and show that it outperforms the state-of-the-art methods in all tested environments, while being more efficient in computation and parameter sizes. Video demonstrations, our source code, and the models of ARP can be found at http://github.com/mlzxy/arp.
Design and Evaluation of a Compliant Quasi Direct Drive End-effector for Safe Robotic Ultrasound Imaging
Robot-assisted ultrasound scanning promises to advance autonomous and accessible medical imaging. However, ensuring patient safety and compliant human-robot interaction (HRI) during probe contact poses a significant challenge. Most existing systems either have high mechanical stiffness or are compliant but lack sufficient force and precision. This paper presents a novel single-degree-of-freedom end-effector for safe and accurate robotic ultrasound imaging, using a quasi-direct drive actuator to achieve both passive mechanical compliance and precise active force regulation, even during motion. The end-effector demonstrates an effective force control bandwidth of 100 Hz and can apply forces ranging from 2.5N to 15N. To validate the end-effector's performance, we developed a novel ex vivo actuating platform, enabling compliance testing of the end-effector on simulated abdominal breathing and sudden patient movements. Experiments demonstrate that the end-effector can maintain consistent probe contact during simulated respiratory motion at 2.5N, 5N, 10N, and 15N, with an average force tracking RMS error of 0.83N compared to 4.70N on a UR3e robot arm using conventional force control. This system represents the first compliant ultrasound end-effector tested on a tissue platform simulating dynamic movement. The proposed solution provides a novel approach for designing and evaluating compliant robotic ultrasound systems, advancing the path for more compliant and patient-friendly robotic ultrasound systems in clinical settings.
Partial-to-Full Registration based on Gradient-SDF for Computer-Assisted Orthopedic Surgery
In computer-assisted orthopedic surgery (CAOS), accurate pre-operative to intra-operative bone registration is an essential requirement for providing navigational guidance. This registration process is challenging since the intra-operative 3D points are sparse, only partially overlap with the pre-operative model, and are disturbed by noise and outliers. The method commonly used in current state-of-the-art orthopedic robotic systems is registration based on bony landmarks, but it is very time-consuming for surgeons. To address these issues, we propose a novel partial-to-full registration framework based on gradient-SDF for CAOS. Simulation experiments using bone models from publicly available datasets and phantom experiments performed under both optical tracking and electromagnetic tracking systems demonstrate that the proposed method provides more accurate results than standard benchmarks and is robust to 90% outliers. Importantly, our method achieves convergence in less than 1 second in real scenarios and mean target registration error values as low as 2.198 mm for the entire bone model. Finally, it only requires random acquisition of points for registration by moving a surgical probe over the bone surface without correspondence with any specific bony landmarks, thus showing significant potential clinical value.
Residual Policy Learning for Perceptive Quadruped Control Using Differentiable Simulation
First-order Policy Gradient (FoPG) algorithms such as Backpropagation through Time and Analytical Policy Gradients leverage local simulation physics to accelerate policy search, significantly improving sample efficiency in robot control compared to standard model-free reinforcement learning. However, FoPG algorithms can exhibit poor learning dynamics in contact-rich tasks like locomotion. Previous approaches address this issue by alleviating contact dynamics via algorithmic or simulation innovations. In contrast, we propose guiding the policy search by learning a residual over a simple baseline policy. For quadruped locomotion, we find that the role of residual policy learning in FoPG-based training (FoPG RPL) is primarily to improve asymptotic rewards, compared to improving sample efficiency for model-free RL. Additionally, we provide insights on applying FoPGs to pixel-based local navigation, training a point-mass robot to convergence within seconds. Finally, we showcase the versatility of FoPG RPL by using it to train locomotion and perceptive navigation end-to-end on a quadruped in minutes.
Multi-Robot Motion Planning with Diffusion Models ICLR 2025
Diffusion models have recently been successfully applied to a wide range of robotics applications for learning complex multi-modal behaviors from data. However, prior works have mostly been confined to single-robot and small-scale environments due to the high sample complexity of learning multi-robot diffusion models. In this paper, we propose a method for generating collision-free multi-robot trajectories that conform to underlying data distributions while using only single-robot data. Our algorithm, Multi-robot Multi-model planning Diffusion (MMD), does so by combining learned diffusion models with classical search-based techniques -- generating data-driven motions under collision constraints. Scaling further, we show how to compose multiple diffusion models to plan in large environments where a single diffusion model fails to generalize well. We demonstrate the effectiveness of our approach in planning for dozens of robots in a variety of simulated scenarios motivated by logistics environments. View video demonstrations in our supplementary material, and our code at: https://github.com/yoraish/mmd.
comment: The first three authors contributed equally to this work. Under review for ICLR 2025
Hybrid Classical/RL Local Planner for Ground Robot Navigation
Local planning is an optimization process within a mobile robot navigation stack that searches for the best velocity vector, given the robot and environment state. Depending on how the optimization criteria and constraints are defined, some planners may be better than others in specific situations. We consider two conceptually different planners. The first planner explores the velocity space in real-time and has superior path-tracking and motion smoothness performance. The second planner was trained using reinforcement learning methods to produce the best velocity based on its training "experience". It is better at avoiding dynamic obstacles but at the expense of motion smoothness. We propose a simple yet effective meta-reasoning approach that takes advantage of both approaches by switching between planners based on the surroundings. We demonstrate the superiority of our hybrid planner, both qualitatively and quantitatively, over the individual planners on a live robot in different scenarios, achieving an improvement of 26% in the navigation time.
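The switching idea in this abstract can be illustrated with a minimal sketch. The abstract does not state the actual switching criterion, so the rule below (use the learned planner whenever a moving obstacle is within a fixed radius) is purely an assumption, and the names `Obstacle`, `select_planner`, `radius`, and `min_speed` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Velocity = Tuple[float, float]  # (linear m/s, angular rad/s)


@dataclass
class Obstacle:
    distance: float  # metres from the robot
    speed: float     # m/s; 0.0 for a static obstacle


def select_planner(
    obstacles: Sequence[Obstacle],
    classical: Callable[[], Velocity],
    learned: Callable[[], Velocity],
    radius: float = 2.0,     # assumed switching radius, not from the paper
    min_speed: float = 0.1,  # assumed threshold for "dynamic"
) -> Callable[[], Velocity]:
    """Pick the learned planner near moving obstacles, else the classical one."""
    for ob in obstacles:
        if ob.distance <= radius and ob.speed >= min_speed:
            return learned
    return classical
```

A caller would invoke the returned planner once per control cycle. A real system would likely add hysteresis so the robot does not flip between planners at the boundary.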
CLIP-Clique: Graph-based Correspondence Matching Augmented by Vision Language Models for Object-based Global Localization
This letter proposes a method of global localization on a map with semantic object landmarks. One of the most promising approaches for localization on object maps is to use semantic graph matching using landmark descriptors calculated from the distribution of surrounding objects. These descriptors are vulnerable to misclassification and partial observations. Moreover, many existing methods rely on inlier extraction using RANSAC, which is stochastic and sensitive to a high outlier rate. To address the former issue, we augment the correspondence matching using Vision Language Models (VLMs). Landmark discriminability is improved by VLM embeddings, which are independent of surrounding objects. In addition, inliers are estimated deterministically using a graph-theoretic approach. We also incorporate pose calculation using the weighted least squares considering correspondence similarity and observation completeness to improve the robustness. We confirmed improvements in matching and pose estimation accuracy through experiments on ScanNet and TUM datasets.
comment: IEEE Robotics and Automation Letters
ROS2-Based Simulation Framework for Cyberphysical Security Analysis of UAVs
We present a new simulator of Uncrewed Aerial Vehicles (UAVs) that is tailored to the needs of testing cyber-physical security attacks and defenses. Recent investigations into UAV safety have unveiled various attack surfaces and some defense mechanisms. However, due to escalating regulations imposed by aviation authorities on security research on real UAVs, and the substantial costs associated with hardware test-bed configurations, there arises a necessity for a simulator capable of substituting for hardware experiments, and/or narrowing down their scope to the strictly necessary. The study of different attack mechanisms requires specific features in a simulator. We propose a simulation framework based on ROS2, leveraging some of its key advantages, including modularity, replicability, customization, and the utilization of open-source tools such as Gazebo. Our framework has a built-in motion planner, controller, communication models and attack models. We share examples of research use cases that our framework can enable, demonstrating its utility.
A Feasibility Study of a Soft, Low-Cost, 6-Axis Load Cell for Haptics
Haptic devices have been shown to be valuable in supplementing surgical training, especially when providing haptic feedback based on user performance metrics such as the wrench applied by the user on the tool. However, current 6-axis force/torque sensors are prohibitively expensive. This paper presents the design and calibration of a low-cost, six-axis force/torque sensor specially designed for laparoscopic haptic training applications. The proposed design uses Hall-effect sensors to measure the change in the position of magnets embedded in a silicone layer that results from a wrench applied to the device. Preliminary experimental validation demonstrates that these sensors can achieve an accuracy of 0.45 N and 0.014 Nm, and a theoretical XY range of +/-50 N, Z range of +/-20 N, and torque range of +/-0.2 Nm. This study indicates that the proposed low-cost 6-axis force/torque sensor can accurately measure user force and provide useful feedback during laparoscopic training on a haptic device.
Online Control-Informed Learning
This paper proposes an Online Control-Informed Learning (OCIL) framework, which synthesizes the well-established control theories to solve a broad class of learning and control tasks in real time. This novel integration effectively handles practical issues in machine learning such as noisy measurement data, online learning, and data efficiency. By considering any robot as a tunable optimal control system, we propose an online parameter estimator based on extended Kalman filter (EKF) to incrementally tune the system in real time, enabling it to complete designated learning or control tasks. The proposed method also improves robustness in learning by effectively managing noise in the data. Theoretical analysis is provided to demonstrate the convergence and regret of OCIL. Three learning modes of OCIL, i.e. Online Imitation Learning, Online System Identification, and Policy Tuning On-the-fly, are investigated via experiments, which validate their effectiveness.
Learning Object Properties Using Robot Proprioception via Differentiable Robot-Object Interaction
Differentiable simulation has become a powerful tool for system identification. While prior work has focused on identifying robot properties using robot-specific data or object properties using object-specific data, our approach calibrates object properties by using information from the robot, without relying on data from the object itself. Specifically, we utilize robot joint encoder information, which is commonly available in standard robotic systems. Our key observation is that by analyzing the robot's reactions to manipulated objects, we can infer properties of those objects, such as inertia and softness. Leveraging this insight, we develop differentiable simulations of robot-object interactions to inversely identify the properties of the manipulated objects. Our approach relies solely on proprioception -- the robot's internal sensing capabilities -- and does not require external measurement tools or vision-based tracking systems. This general method is applicable to any articulated robot and requires only joint position information. We demonstrate the effectiveness of our method on a low-cost robotic platform, achieving accurate mass and elastic modulus estimations of manipulated objects with just a few seconds of computation on a laptop.
Multi-Objective Risk Assessment Framework for Exploration Planning Using Terrain and Traversability Analysis ICRA 2025
Exploration of unknown, unstructured environments, such as in search and rescue, cave exploration, and planetary missions, presents significant challenges due to their unpredictable nature. This unpredictability can lead to inefficient path planning and potential mission failures. We propose a multi-objective risk assessment method for exploration planning in such unconstrained environments. Our approach dynamically adjusts the weight of various risk factors to prevent the robot from undertaking lethal actions too early in the mission. By gradually increasing the allowable risk as the mission progresses, our method enables more efficient exploration. We evaluate risk based on environmental terrain properties, including elevation, slope, roughness, and traversability, and account for factors like battery life, mission duration, and travel distance. Our method is validated through experiments in various subterranean simulated cave environments. The results demonstrate that our approach ensures consistent exploration without incurring lethal actions, while introducing minimal computational overhead to the planning process.
comment: 7 pages, 8 figures, submitted to ICRA 2025
Improving Efficiency of Sampling-based Motion Planning via Message-Passing Monte Carlo
Sampling-based motion planning methods, while effective in high-dimensional spaces, often suffer from inefficiencies due to irregular sampling distributions, leading to suboptimal exploration of the configuration space. In this paper, we propose an approach that enhances the efficiency of these methods by utilizing low-discrepancy distributions generated through Message-Passing Monte Carlo (MPMC). MPMC leverages Graph Neural Networks (GNNs) to generate point sets that uniformly cover the space, with uniformity assessed using the $\mathcal{L}_p$-discrepancy measure, which quantifies the irregularity of sample distributions. By improving the uniformity of the point sets, our approach significantly reduces computational overhead and the number of samples required for solving motion planning problems. Experimental results demonstrate that our method outperforms traditional sampling techniques in terms of planning efficiency.
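To make the discrepancy measure in the abstract above concrete, the following is a minimal sketch of the squared $\mathcal{L}_2$ star discrepancy of a point set in the unit cube, computed with Warnock's closed-form formula. It is an illustration only and not the authors' code: the function name `l2_star_discrepancy_sq` is hypothetical, and the choice of $p = 2$ with anchored boxes is an assumption made because that case has a simple closed form.

```python
def l2_star_discrepancy_sq(points):
    """Squared L2 star discrepancy of points in [0, 1]^d (Warnock's formula).

    `points` is a list of equal-length coordinate lists. Lower values mean
    the points cover the unit cube more uniformly.
    """
    n = len(points)
    d = len(points[0])

    # Term 1: integral of the squared box volume over all anchored boxes.
    term1 = 3.0 ** (-d)

    # Term 2: cross term between box volume and the empirical point count.
    term2 = 0.0
    for x in points:
        prod = 1.0
        for xk in x:
            prod *= (1.0 - xk * xk) / 2.0
        term2 += prod
    term2 *= 2.0 / n

    # Term 3: pairwise term from the squared empirical point count.
    term3 = 0.0
    for x in points:
        for y in points:
            prod = 1.0
            for xk, yk in zip(x, y):
                prod *= 1.0 - max(xk, yk)
            term3 += prod
    term3 /= n * n

    return term1 - term2 + term3


# Evenly spaced interval midpoints versus a clustered set of the same size.
midpoints = [[(2 * i + 1) / 8] for i in range(4)]
clustered = [[0.10], [0.11], [0.12], [0.13]]
print(l2_star_discrepancy_sq(midpoints) < l2_star_discrepancy_sq(clustered))
```

The pairwise term makes this quadratic in the number of points, which is acceptable for scoring a fixed sample set offline but would need vectorization for large sets.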
Safe Reference Tracking and Collision Avoidance for Taxiing Aircraft Using an MPC-CBF Framework
In this paper, we develop a framework for the automatic taxiing of aircraft between hangar and take-off given a graph-based model of an airport. We implement a high-level path-planning algorithm that models taxiway intersections as nodes in an undirected graph, algorithmically constructs a directed graph according to the physical limitations of the aircraft, and finds the shortest valid taxi path through the directed graph using Dijkstra's algorithm. We then use this shortest path to construct a reference trajectory for the aircraft to follow that considers the turning capabilities of a given aircraft. Using high-order control barrier functions (HOCBFs), we construct safety conditions for multi-obstacle avoidance and safe reference tracking for simple 2D unicycle dynamics with acceleration control inputs. We then use these safety conditions to design an MPC-CBF framework that tracks the reference trajectory while adhering to the safety constraints. We compare the performance of our MPC-CBF controller with a PID-CBF control method via simulations.
comment: This work is under review to be presented at the 2025 American Control Conference
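The high-level planning step described in the abstract above (shortest path through a directed taxiway graph using Dijkstra's algorithm) can be sketched as follows. This is a generic textbook implementation under assumed inputs, not the paper's planner: the adjacency-list format, the function name `shortest_taxi_path`, and the node names in the example are all hypothetical, and the construction of the directed graph from aircraft turning limits is not modeled.

```python
import heapq


def shortest_taxi_path(graph, start, goal):
    """Dijkstra's algorithm on a directed graph with non-negative edge lengths.

    `graph` maps each node to a list of (neighbor, length) pairs.
    Returns (path, total_length), or (None, inf) if the goal is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()

    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == goal:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if goal not in dist:
        return None, float("inf")

    # Walk predecessor links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]


# Hypothetical airport: directed edges only exist where a turn is feasible.
airport = {
    "hangar": [("A", 100.0), ("B", 250.0)],
    "A": [("B", 100.0), ("runway", 400.0)],
    "B": [("runway", 150.0)],
}
print(shortest_taxi_path(airport, "hangar", "runway"))
```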
Collaborative Safety-Critical Formation Control with Obstacle Avoidance
This work explores a collaborative method for ensuring safety in multi-agent formation control problems. We formulate a control barrier function (CBF) based safety filter control law for a generic distributed formation controller and extend our previously developed collaborative safety framework to an obstacle avoidance problem for agents with acceleration control inputs. We then incorporate multi-obstacle collision avoidance into the collaborative safety framework. This framework includes a method for computing the maximum capability of agents to satisfy their individual safety requirements. We analyze the convergence rate of our collaborative safety algorithm, and prove the linear-time convergence of cooperating agents to a jointly feasible safe action for all agents under the special case of a tree-structured communication network with a single obstacle for each agent. We illustrate the analytical results via simulation on a mass-spring kinematics-based formation controller and demonstrate the finite-time convergence of the collaborative safety algorithm in the simple proven case, the more general case of a fully-connected system with multiple static obstacles, and with dynamic obstacles.
comment: This work is under review for publication in Automatica. arXiv admin note: text overlap with arXiv:2311.11156
CyberCortex.AI: An AI-based Operating System for Autonomous Robotics and Complex Automation
The underlying frameworks for controlling autonomous robots and complex automation applications are Operating Systems (OS) capable of scheduling perception-and-control tasks, as well as providing real-time data communication to other robotic peers and remote cloud computers. In this paper, we introduce CyberCortex.AI, a robotics OS designed to enable heterogeneous AI-based robotics and complex automation applications. CyberCortex.AI is a decentralized distributed OS which enables robots to talk to each other, as well as to High Performance Computers (HPC) in the cloud. Sensory and control data from the robots is streamed towards HPC systems with the purpose of training AI algorithms, which are afterwards deployed on the robots. Each functionality of a robot (e.g. sensory data acquisition, path planning, motion control, etc.) is executed within a so-called DataBlock of Filters shared through the internet, where each filter is computed either locally on the robot itself, or remotely on a different robotic system. The data is stored and accessed via a so-called Temporal Addressable Memory (TAM), which acts as a gateway between each filter's input and output. CyberCortex.AI has two main components: i) the CyberCortex.AI inference system, which is a real-time implementation of the DataBlock running on the robots' embedded hardware, and ii) the CyberCortex.AI dojo, which runs on an HPC computer in the cloud and is used to design, train and deploy AI algorithms. We present a quantitative and qualitative performance analysis of the proposed approach using two collaborative robotics applications: i) a forest fire prevention system based on a Unitree A1 legged robot and an Anafi Parrot 4K drone, as well as ii) an autonomous driving system which uses CyberCortex.AI for collaborative perception and motion control.
Retrieval-Augmented Hierarchical in-Context Reinforcement Learning and Hindsight Modular Reflections for Task Planning with LLMs
Large Language Models (LLMs) have demonstrated remarkable abilities in various language tasks, making them promising candidates for decision-making in robotics. Inspired by Hierarchical Reinforcement Learning (HRL), we propose Retrieval-Augmented Hierarchical in-context reinforcement Learning (RAHL), a novel framework in which an LLM-based high-level policy decomposes a complex task into sub-tasks on-the-fly. The sub-tasks, defined by goals, are assigned to the low-level policy to complete. To improve the agent's performance in multi-episode execution, we propose Hindsight Modular Reflection (HMR), where, instead of reflecting on the full trajectory, we let the agent reflect on shorter sub-trajectories to improve reflection efficiency. We evaluated the decision-making ability of the proposed RAHL in three benchmark environments--ALFWorld, Webshop, and HotpotQA. The results show that RAHL achieves performance improvements of 9%, 42%, and 10% over strong baselines within 5 episodes of execution. Furthermore, we also implemented RAHL on the Boston Dynamics SPOT robot. The experiment shows that the robot can scan the environment, find entrances, and navigate to new rooms controlled by the LLM policy.
Motion Primitives Planning For Center-Articulated Vehicles
Autonomous navigation across unstructured terrains, including forests and construction areas, faces unique challenges due to intricate obstacles and the element of the unknown. Lacking pre-existing maps, these scenarios necessitate a motion planning approach that combines agility with efficiency. Critically, it must also incorporate the robot's kinematic constraints to navigate more effectively through complex environments. This work introduces a novel planning method for center-articulated vehicles (CAV), leveraging motion primitives within a receding horizon planning framework using onboard sensing. The approach commences with the offline creation of motion primitives, generated through forward simulations that reflect the distinct kinematic model of center-articulated vehicles. These primitives undergo evaluation through a heuristic-based scoring function, facilitating the selection of the most suitable path for real-time navigation. To account for disturbances, we develop a pose-stabilizing controller, tailored to the kinematic specifications of center-articulated vehicles. During experiments, our method demonstrates a $67\%$ improvement in SPL (Success Rate weighted by Path Length) performance over existing strategies. Furthermore, its efficacy was validated through real-world experiments conducted with a tree harvester vehicle - SAHA.
comment: 8 pages, 9 figures
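For readers unfamiliar with the SPL metric cited in the abstract above, the sketch below uses the commonly used definition: each successful episode scores the ratio of the shortest path length to the path length actually travelled (capped at 1), failed episodes score 0, and the results are averaged. Whether the paper uses exactly this variant is an assumption, and the function name `spl` and the episode tuple layout are hypothetical.

```python
def spl(episodes):
    """Success weighted by Path Length over a list of episodes.

    Each episode is a tuple (success, shortest_length, travelled_length),
    with shortest_length > 0. Returns a value in [0, 1].
    """
    total = 0.0
    for success, shortest, travelled in episodes:
        if success:
            # Efficiency is 1.0 for an optimal path and shrinks with detours.
            total += shortest / max(travelled, shortest)
    return total / len(episodes)


# One optimal success, one success with a 2x detour, one failure.
episodes = [(True, 10.0, 10.0), (True, 10.0, 20.0), (False, 10.0, 5.0)]
print(spl(episodes))  # 0.5
```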
One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion
Deep Reinforcement Learning techniques are achieving state-of-the-art results in robust legged locomotion. While there exists a wide variety of legged platforms such as quadrupeds, humanoids, and hexapods, the field is still missing a single learning framework that can control all these different embodiments easily and effectively and possibly transfer, zero or few-shot, to unseen robot embodiments. We introduce URMA, the Unified Robot Morphology Architecture, to close this gap. Our framework brings the end-to-end Multi-Task Reinforcement Learning approach to the realm of legged robots, enabling the learned policy to control any type of robot morphology. The key idea of our method is to allow the network to learn an abstract locomotion controller that can be seamlessly shared between embodiments thanks to our morphology-agnostic encoders and decoders. This flexible architecture can be seen as a potential first step in building a foundation model for legged robot locomotion. Our experiments show that URMA can learn a locomotion policy on multiple embodiments that can be easily transferred to unseen robot platforms in simulation and the real world.
Performance assessment of ADAS in a representative subset of critical traffic situations
As a variety of automated collision prevention systems gain presence within personal vehicles, rating and differentiating the automated safety performance of car models has become increasingly important for consumers, manufacturers, and insurers. In 2023, Swiss Re and partners initiated an eight-month long vehicle testing campaign conducted on a recognized UNECE type approval authority and Euro NCAP accredited proving ground in Germany. The campaign exposed twelve mass-produced vehicle models and one prototype vehicle fitted with collision prevention systems to a selection of safety-critical traffic scenarios representative of the United States and European Union accident landscapes. In this paper, we compare and evaluate the relative safety performance of these thirteen collision prevention systems (hardware and software stack) as demonstrated by this testing campaign. We first introduce a new scoring system which represents a test system's predicted impact on overall real-world collision frequency and reduction of collision impact energy, weighted based on the real-world relevance of the test scenario. Next, we introduce a novel metric that quantifies the realism of the protocol and confirm that our test protocol is a plausible representation of real-world driving. Finally, we find that the prototype system in its pre-release state outperforms the mass-produced (post-consumer-release) vehicles in the majority of the tested scenarios on the test track.
Diffusing in Someone Else's Shoes: Robotic Perspective Taking with Diffusion
Humanoid robots can benefit from their similarity to the human shape by learning from humans. When humans teach other humans how to perform actions, they often demonstrate the actions, and the learning human imitates the demonstration to get an idea of how to perform the action. Being able to mentally transfer from a demonstration seen from a third-person perspective to how it should look from a first-person perspective is fundamental for this ability in humans. As this is a challenging task, it is often simplified for robots by creating demonstrations from the first-person perspective. Creating these demonstrations allows for an easier imitation but requires more effort. Therefore, we introduce a novel diffusion model that enables the robot to learn from the third-person demonstrations directly by learning to generate the first-person perspective from the third-person perspective. The model translates the size and rotations of objects and the environment between the two perspectives. This allows us to utilise the benefits of easy-to-produce third-person demonstrations and easy-to-imitate first-person demonstrations.
comment: Submitted to Humanoids
Topology-Driven Parallel Trajectory Optimization in Dynamic Environments
Ground robots navigating in complex, dynamic environments must compute collision-free trajectories to avoid obstacles safely and efficiently. Nonconvex optimization is a popular method to compute a trajectory in real-time. However, these methods often converge to locally optimal solutions and frequently switch between different local minima, leading to inefficient and unsafe robot motion. In this work, we propose a novel topology-driven trajectory optimization strategy for dynamic environments that plans multiple distinct evasive trajectories to enhance the robot's behavior and efficiency. A global planner iteratively generates trajectories in distinct homotopy classes. These trajectories are then optimized by local planners working in parallel. While each planner shares the same navigation objectives, they are locally constrained to a specific homotopy class, meaning each local planner attempts a different evasive maneuver. The robot then executes the feasible trajectory with the lowest cost in a receding horizon manner. We demonstrate, on a mobile robot navigating among pedestrians, that our approach leads to faster and safer trajectories than existing planners.
comment: Accepted for publication in IEEE Transactions on Robotics
Artificial consciousness. Some logical and conceptual preliminaries
Is artificial consciousness theoretically possible? Is it plausible? If so, is it technically feasible? To make progress on these questions, it is necessary to lay some groundwork clarifying the logical and empirical conditions for artificial consciousness to arise and the meaning of relevant terms involved. Consciousness is a polysemic word: researchers from different fields, including neuroscience, Artificial Intelligence, robotics, and philosophy, among others, sometimes use different terms in order to refer to the same phenomena or the same terms to refer to different phenomena. In fact, if we want to pursue artificial consciousness, a proper definition of the key concepts is required. Here, after some logical and conceptual preliminaries, we argue for the necessity of using dimensions and profiles of consciousness for a balanced discussion about their possible instantiation or realisation in artificial systems. Our primary goal in this paper is to review the main theoretical questions that arise in the domain of artificial consciousness. On the basis of this review, we propose to assess the issue of artificial consciousness within a multidimensional account. The theoretical possibility of artificial consciousness is already presumed within some theoretical frameworks; however, empirical possibility cannot simply be deduced from these frameworks but needs independent empirical validation. We break down the complexity of consciousness by identifying constituents, components, and dimensions, and reflect pragmatically about the general challenges confronting the creation of artificial consciousness. Despite these challenges, we outline a research strategy for showing how "awareness" as we propose to understand it could plausibly be realised in artificial systems.
RobMOT: Robust 3D Multi-Object Tracking by Observational Noise and State Estimation Drift Mitigation on LiDAR PointCloud
This work addresses limitations in recent 3D tracking-by-detection methods, focusing on identifying legitimate trajectories and addressing state estimation drift in Kalman filters. Current methods rely heavily on threshold-based filtering of false positive detections using detection scores to prevent ghost trajectories. However, this approach is inadequate for distant and partially occluded objects, where detection scores tend to drop, potentially leading to false positives exceeding the threshold. Additionally, the literature generally treats detections as precise localizations of objects. Our research reveals that noise in detections impacts localization information, causing trajectory drift for occluded objects and hindering recovery. To this end, we propose a novel online track validity mechanism that temporally distinguishes between legitimate and ghost tracks, along with a multi-stage observational gating process for incoming observations. This mechanism significantly improves tracking performance, with a $6.28\%$ increase in HOTA and a $17.87\%$ increase in MOTA. We also introduce a refinement to the Kalman filter that enhances noise mitigation in trajectory drift, leading to more robust state estimation for occluded objects. Our framework, RobMOT, outperforms state-of-the-art methods, including deep learning approaches, across various detectors, achieving up to a $4\%$ margin in HOTA and $6\%$ in MOTA. RobMOT excels under challenging conditions, such as prolonged occlusions and tracking distant objects, with up to a $59\%$ improvement in processing latency.
Safe and Efficient Trajectory Optimization for Autonomous Vehicles using B-spline with Incremental Path Flattening
Gradient-based trajectory optimization with B-spline curves is widely used for unmanned aerial vehicles (UAVs) due to its fast convergence and continuous trajectory generation. However, the application of B-spline curves for path-velocity coupled trajectory planning in autonomous vehicles (AVs) has been highly limited because it is challenging to reduce the over-approximation of the vehicle shape and to create a collision-free trajectory using B-spline curves while satisfying kinodynamic constraints. To address these challenges, this paper proposes novel disc-type swept volume (SV), incremental path flattening (IPF), and kinodynamic feasibility penalty methods. The disc-type SV estimation method is a new technique to reduce SV over-approximation and is used to find collision points for IPF. In IPF, the collision points are used to push the trajectory away from obstacles and to iteratively increase the curvature weight, thereby reducing SV and generating a collision-free trajectory. Additionally, to satisfy kinodynamic constraints for AVs using B-spline curves, we apply a clamped B-spline curvature penalty along with longitudinal and lateral velocity and acceleration penalties. Our experimental results demonstrate that our method outperforms state-of-the-art baselines in various simulated environments. We also conducted a real-world experiment using an AV, and our results validate the simulated tracking performance of the proposed approach.
comment: 16 pages, 21 figures, 5 tables, 3 algorithms
M2Distill: Multi-Modal Distillation for Lifelong Imitation Learning ICRA2025
Lifelong imitation learning for manipulation tasks poses significant challenges due to distribution shifts that occur in incremental learning steps. Existing methods often focus on unsupervised skill discovery to construct an ever-growing skill library or distillation from multiple policies, which can lead to scalability issues as diverse manipulation tasks are continually introduced and may fail to ensure a consistent latent space throughout the learning process, leading to catastrophic forgetting of previously learned skills. In this paper, we introduce M2Distill, a multi-modal distillation-based method for lifelong imitation learning focusing on preserving consistent latent space across vision, language, and action distributions throughout the learning process. By regulating the shifts in latent representations across different modalities from previous to current steps, and reducing discrepancies in Gaussian Mixture Model (GMM) policies between consecutive learning steps, we ensure that the learned policy retains its ability to perform previously learned tasks while seamlessly integrating new skills. Extensive evaluations on the LIBERO lifelong imitation learning benchmark suites, including LIBERO-OBJECT, LIBERO-GOAL, and LIBERO-SPATIAL, demonstrate that our method consistently outperforms prior state-of-the-art methods across all evaluated metrics.
comment: Submitted to ICRA2025
Flow as the Cross-Domain Manipulation Interface
We present Im2Flow2Act, a scalable learning framework that enables robots to acquire real-world manipulation skills without the need for real-world robot training data. The key idea behind Im2Flow2Act is to use object flow as the manipulation interface, bridging domain gaps between different embodiments (i.e., human and robot) and training environments (i.e., real-world and simulated). Im2Flow2Act comprises two components: a flow generation network and a flow-conditioned policy. The flow generation network, trained on human demonstration videos, generates object flow from the initial scene image, conditioned on the task description. The flow-conditioned policy, trained on simulated robot play data, maps the generated object flow to robot actions to realize the desired object movements. By using flow as input, this policy can be directly deployed in the real world with a minimal sim-to-real gap. By leveraging real-world human videos and simulated robot play data, we bypass the challenges of teleoperating physical robots in the real world, resulting in a scalable system for diverse tasks. We demonstrate Im2Flow2Act's capabilities in a variety of real-world tasks, including the manipulation of rigid, articulated, and deformable objects.
comment: Conference on Robot Learning 2024
Sample-efficient Imitative Multi-token Decision Transformer for Real-world Driving
Recent advancements in autonomous driving technologies involve the capability to effectively process and learn from extensive real-world driving data. Current imitation learning and offline reinforcement learning methods have shown remarkable promise in autonomous systems, harnessing the power of offline datasets to make informed decisions in open-loop (non-reactive agents) settings. However, learning-based agents face significant challenges when transferring knowledge from open-loop to closed-loop (reactive agents) environments. Performance is significantly affected by data distribution shift, sample efficiency, and the complexity of uncovering hidden world models and physics. To address these issues, we propose Sample-efficient Imitative Multi-token Decision Transformer (SimDT). SimDT introduces multi-token prediction, an online imitative learning pipeline, and prioritized experience replay to sequence-modelling reinforcement learning. The performance is evaluated through empirical experiments, and the results exceed popular imitation and reinforcement learning algorithms in both open-loop and closed-loop settings on the Waymax benchmark. SimDT exhibits a 41% reduction in collision rate and an 18% improvement in reaching the destination compared with the baseline method.
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
LLMs with visual inputs, i.e., Vision Language Models (VLMs), have the capacity to process state information as visual-textual prompts and respond with policy decisions in text. We propose LLaRA: Large Language and Robotics Assistant, a framework that formulates robot action policy as conversations and provides improved action outputs when trained with auxiliary data that complements policy learning. We first introduce an automated pipeline to generate conversation-style instruction tuning data from existing behavior cloning data. Then we enrich the dataset in a self-supervised fashion by formulating six auxiliary tasks. A VLM finetuned with the resulting collection of datasets can generate meaningful robot action policy decisions. Our experiments across multiple simulated and real-world environments demonstrate the state-of-the-art performance of the proposed LLaRA framework. The code, datasets, and pretrained models are available at https://github.com/LostXine/LLaRA.
Roadmaps with Gaps over Controllers: Achieving Efficiency in Planning under Dynamics IROS
This paper aims to improve the computational efficiency of motion planning for mobile robots with non-trivial dynamics through the use of learned controllers. Offline, a system-specific controller is first trained in an empty environment. Then, for the target environment, the approach constructs a data structure, a "Roadmap with Gaps," to approximately learn how to solve planning queries using the learned controller. The roadmap nodes correspond to local regions. Edges correspond to applications of the learned controller that approximately connect these regions. Gaps arise as the controller does not perfectly connect pairs of individual states along edges. Online, given a query, a tree sampling-based motion planner uses the roadmap so that the tree's expansion is informed towards the goal region. The tree expansion selects local subgoals given a wavefront on the roadmap that guides towards the goal. When the controller cannot reach a subgoal region, the planner resorts to random exploration to maintain probabilistic completeness and asymptotic optimality. The accompanying experimental evaluation shows that the approach significantly improves the computational efficiency of motion planning on various benchmarks, including physics-based vehicular models on uneven and varying friction terrains as well as a quadrotor under air pressure effects.
comment: To be presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024. Website: https://prx-kinodynamic.github.io/projects/rogue
Quantifying Aleatoric and Epistemic Dynamics Uncertainty via Local Conformal Calibration
Whether learned, simulated, or analytical, approximations of a robot's dynamics can be inaccurate when encountering novel environments. Many approaches have been proposed to quantify the aleatoric uncertainty of such methods, i.e. uncertainty resulting from stochasticity; however, these estimates alone are not enough to properly estimate the uncertainty of a model in a novel environment, where the actual dynamics can change. Such changes can induce epistemic uncertainty, i.e. uncertainty due to a lack of information/data. Accounting for both epistemic and aleatoric dynamics uncertainty in a theoretically-grounded way remains an open problem. We introduce Local Uncertainty Conformal Calibration (LUCCa), a conformal prediction-based approach that calibrates the aleatoric uncertainty estimates provided by dynamics models to generate probabilistically-valid prediction regions of the system's state. We account for both epistemic and aleatoric uncertainty non-asymptotically, without strong assumptions about the form of the true dynamics or how it changes. The calibration is performed locally in the state-action space, leading to uncertainty estimates that are useful for planning. We validate our method by constructing probabilistically-safe plans for a double-integrator under significant changes in dynamics.
comment: Accepted to the 16th International Workshop on the Algorithmic Foundations of Robotics (WAFR) 2024
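As background for the conformal calibration mentioned in the abstract above, the sketch below shows the standard split conformal step: given nonconformity scores from a held-out calibration set, it picks the threshold that a new score falls under with probability at least 1 - alpha, assuming exchangeable data. This is the generic global procedure, not LUCCa itself; the paper's local, state-action-dependent calibration is not reproduced here, and the function name `conformal_threshold` and the example scores are hypothetical.

```python
import math


def conformal_threshold(scores, alpha):
    """Split conformal threshold for miscoverage level `alpha`.

    `scores` are nonconformity scores on a calibration set (for example,
    prediction errors scaled by the model's own uncertainty estimate).
    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when the calibration set is too small to support the requested level.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")
    return sorted(scores)[k - 1]


# Hypothetical calibration scores from nine held-out transitions.
calibration_scores = [3, 1, 4, 1, 5, 9, 2, 6, 5]
print(conformal_threshold(calibration_scores, alpha=0.5))
```

A prediction region is then formed by keeping every candidate state whose score does not exceed the returned threshold. Returning infinity for very small calibration sets keeps the coverage guarantee valid at the cost of an uninformative region.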
Bayesian Online Learning for Human-assisted Target Localization
We consider a human-assisted autonomy sensor fusion for dynamic target localization in a Bayesian framework. Autonomous sensor-based tracking systems can suffer from observability and target detection failure. Humans possess valuable qualitative information derived from their past knowledge and rapid situational awareness that can give them an advantage over machine perception in many scenarios. To compensate for the shortcomings of an autonomous tracking system, we propose to collect spatial sensing information from human operators who visually monitor the target and can provide target localization information in the form of free sketches encircling the area where the target is located. However, human inputs cannot be taken deterministically and trusted absolutely due to their inherent subjectivity and variability. Our focus in this paper is to construct an adaptive probabilistic model for human-provided inputs where the adaptation terms capture the level of reliability of the human inputs. The next contribution of this paper is a novel joint Bayesian learning method to fuse human and autonomous sensor inputs in a manner that the dynamic changes in human detection reliability are also captured and accounted for. Unlike deep learning frameworks, a unique aspect of this Bayesian modeling framework is its analytical closed-form update equations. This feature provides computational efficiency and allows for online learning from limited data sets. Simulations demonstrate our results, underscoring the value of human-machine collaboration in autonomous systems.
comment: 7 figures
Solving Robotics Problems in Zero-Shot with Vision-Language Models
We introduce Wonderful Team, a multi-agent Vision Large Language Model (VLLM) framework designed to solve robotics problems in a zero-shot regime. In our context, zero-shot means that for a novel environment, we provide a VLLM with an image of the robot's surroundings and a task description, and the VLLM outputs the sequence of actions necessary for the robot to complete the task. Unlike prior work that requires fine-tuning parts of the pipeline -- such as adjusting an LLM on robot-specific data or training separate vision encoders -- our approach demonstrates that with careful engineering, a single off-the-shelf VLLM can autonomously handle all aspects of a robotics task, from high-level planning to low-level location extraction and action execution. Crucially, compared to using GPT-4o alone, Wonderful Team is self-corrective and capable of iteratively fixing its own mistakes, enabling it to solve challenging long-horizon tasks. We validate our framework through extensive experiments, both in simulated environments using VIMABench and in real-world settings. Our system showcases the ability to handle diverse tasks such as manipulation, goal-reaching, and visual reasoning -- all in a zero-shot manner. These results underscore a key point: vision-language models have progressed rapidly in the past year and should be strongly considered as a backbone for many robotics problems moving forward.
comment: aka Wonderful Team
ClutterGen: A Cluttered Scene Generator for Robot Learning
We introduce ClutterGen, a physically compliant simulation scene generator capable of producing highly diverse, cluttered, and stable scenes for robot learning. Generating such scenes is challenging as each object must adhere to physical laws like gravity and collision. As the number of objects increases, finding valid poses becomes more difficult, necessitating significant human engineering effort, which limits the diversity of the scenes. To overcome these challenges, we propose a reinforcement learning method that can be trained with physics-based reward signals provided by the simulator. Our experiments demonstrate that ClutterGen can generate cluttered object layouts with up to ten objects on confined table surfaces. Additionally, our policy design explicitly encourages the diversity of the generated scenes for open-ended generation. Our real-world robot results show that ClutterGen can be directly used for clutter rearrangement and stable placement policy training.
comment: Accepted by 8th Annual Conference on Robot Learning
Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks ICRA 2025
Mastering complex sequential tasks continues to pose a significant challenge in robotics. While there has been progress in learning long-horizon manipulation tasks, most existing approaches lack rigorous mathematical guarantees for ensuring reliable and successful execution. In this paper, we extend previous work on learning long-horizon tasks and stable policies, focusing on improving task success rates while reducing the amount of training data needed. Our approach introduces a novel method that (1) segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals, and (2) learns globally stable dynamical system policies to guide the robot to each subgoal, even in the face of sensory noise and random disturbances. We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms. Code is available at https://github.com/Alestaubin/stable-imitation-policy-with-waypoints
comment: 7 pages, submitted to ICRA 2025
Multiagent Systems
Distributed Networked Multi-task Learning
We consider a distributed multi-task learning scheme that accounts for multiple linear model estimation tasks with heterogeneous and/or correlated data streams. We assume that nodes can be partitioned into groups corresponding to different learning tasks and communicate according to a directed network topology. Each node estimates a linear model asynchronously and is subject to local (within-group) regularization and global (across groups) regularization terms targeting noise reduction and generalization performance improvement respectively. We provide a finite-time characterization of convergence of the estimators and task relation and illustrate the scheme's general applicability in two examples: random field temperature estimation and modeling student performance from different academic districts.
Multi-Robot Motion Planning with Diffusion Models ICLR 2025
Diffusion models have recently been successfully applied to a wide range of robotics applications for learning complex multi-modal behaviors from data. However, prior works have mostly been confined to single-robot and small-scale environments due to the high sample complexity of learning multi-robot diffusion models. In this paper, we propose a method for generating collision-free multi-robot trajectories that conform to underlying data distributions while using only single-robot data. Our algorithm, Multi-robot Multi-model planning Diffusion (MMD), does so by combining learned diffusion models with classical search-based techniques -- generating data-driven motions under collision constraints. Scaling further, we show how to compose multiple diffusion models to plan in large environments where a single diffusion model fails to generalize well. We demonstrate the effectiveness of our approach in planning for dozens of robots in a variety of simulated scenarios motivated by logistics environments. View video demonstrations in our supplementary material, and our code at: https://github.com/yoraish/mmd.
comment: The first three authors contributed equally to this work. Under review for ICLR 2025
Offline congestion games: How feedback type affects data coverage requirement
This paper investigates when one can efficiently recover an approximate Nash Equilibrium (NE) in offline congestion games. The existing dataset coverage assumption in offline general-sum games inevitably incurs a dependency on the number of actions, which can be exponentially large in congestion games. We consider three different types of feedback with decreasing revealed information. Starting from the facility-level (a.k.a., semi-bandit) feedback, we propose a novel one-unit deviation coverage condition and give a pessimism-type algorithm that can recover an approximate NE. For the agent-level (a.k.a., bandit) feedback setting, interestingly, we show the one-unit deviation coverage condition is not sufficient. On the other hand, we convert the game to multi-agent linear bandits and show that with a generalized data coverage assumption in offline linear bandits, we can efficiently recover the approximate NE. Lastly, we consider a novel type of feedback, the game-level feedback where only the total reward from all agents is revealed. Again, we show the coverage assumption for the agent-level feedback setting is insufficient in the game-level feedback setting, and with a stronger version of the data coverage assumption for linear bandits, we can recover an approximate NE. Together, our results constitute the first study of offline congestion games and imply formal separations between different types of feedback.
comment: 20 pages, 3 figures
Systems and Control (CS)
On the Cost of Consecutive Estimation Error: Significance-Aware Non-linear Aging
This paper considers the semantics-aware remote state estimation of an asymmetric Markov chain with prioritized states. Due to resource constraints, the sensor needs to trade between estimation quality and communication cost. The aim is to exploit the significance of information through the history of system realizations to determine the optimal timing of transmission, thereby reducing the amount of uninformative data transmitted in the network. To this end, we introduce a new metric, the significance-aware Age of Consecutive Error (AoCE), that captures two semantic attributes: the significance of estimation error and the cost of consecutive error. Different costs and non-linear age functions are assigned to different estimation errors to account for their relative importance to system performance. We identify the optimal transmission problem as a countably infinite state Markov decision process (MDP) with unbounded costs. We first give sufficient conditions on the age functions, source pattern, and channel reliability so that an optimal policy exists to have bounded average costs. We show that the optimal policy exhibits a switching structure. That is, the sensor triggers a transmission only when the system has been trapped in an error for a certain number of consecutive time slots. We also provide sufficient conditions under which the switching policy degenerates into a simple threshold policy, i.e., featuring identical thresholds for all estimation errors. Furthermore, we exploit the structural properties and develop a structured policy iteration (SPI) algorithm that considerably reduces computation overhead. Numerical results show that the optimal policy outperforms the classic rule-, distortion- and age-based policies. An important takeaway is that the more semantic attributes we utilize, the fewer transmissions are needed.
comment: This paper has been submitted for possible publication
HiL Demonstration of Online Battery Capacity and Impedance Estimation with Minimal a Priori Parametrization Effort
Uncertainty in the aging of batteries in battery electric vehicles impacts both the daily driving range as well as the expected economic lifetime. This paper presents a method to determine online the capacity and internal resistance of a battery cell based on real-world data. The method, based on a Joint Extended Kalman Filter combined with Recursive Least Squares, is computationally efficient and does not a priori require a fully characterized cell model. Offline simulation of the algorithm on data from differently aged cells shows convergence of the algorithm and indicates that capacity and resistance follow the expected trends. Furthermore, the algorithm is tested online on a Hardware-in-the-Loop setup to demonstrate real-time parameter updates in a realistic driving scenario.
comment: 6 pages, 9 figures, to be presented at VPPC 2024
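As a rough illustration of the Recursive Least Squares ingredient mentioned in the abstract above, the following sketch estimates a single resistance from current and voltage-drop samples. It is a generic scalar RLS with a forgetting factor, not the authors' joint EKF/RLS formulation; the linear model (voltage drop = R * current), the function name `rls_scalar`, and all parameter values are assumptions made for this example.

```python
def rls_scalar(regressors, outputs, forgetting=0.99, theta0=0.0, p0=1e3):
    """Scalar recursive least squares for the model y = phi * theta.

    regressors: sequence of phi values (here: measured currents, in A)
    outputs:    sequence of y values (here: measured voltage drops, in V)
    forgetting: forgetting factor in (0, 1]; values below 1 discount old data
    theta0, p0: initial estimate and initial covariance (assumed values)
    """
    theta, p = theta0, p0
    for phi, y in zip(regressors, outputs):
        # Gain weighs the new sample against the current confidence p.
        gain = p * phi / (forgetting + phi * p * phi)
        # Correct the estimate with the prediction error.
        theta += gain * (y - phi * theta)
        # Shrink the covariance, then inflate it by the forgetting factor.
        p = (p - gain * phi * p) / forgetting
    return theta


# Hypothetical noiseless data: a 20 milliohm resistance.
true_resistance = 0.02
currents = [1.0, 2.0, -1.5, 3.0, 0.5] * 4
voltage_drops = [true_resistance * i for i in currents]
estimate = rls_scalar(currents, voltage_drops)
```

Each update costs a handful of scalar operations, which is consistent with the abstract's emphasis on computational efficiency; a real cell model would need more than one parameter.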
Attainable Force Approximation and Full-Pose Tracking Control of an Over-Actuated Thrust-Vectoring Modular Team UAV
Traditional vertical take-off and landing (VTOL) aircraft cannot achieve optimal efficiency for various payload weights and have limited mobility due to their under-actuation. With the thrust-vectoring mechanism, the proposed modular team UAV is fully actuated at certain attitudes. However, the attainable force space (AFS) differs according to the team configuration, which makes the controller design difficult. We propose an approximation to the AFS and a full-pose tracking controller with an attitude planner and a force projection, which guarantees the control force is feasible. The proposed approach can be applied to UAVs having multiple thrust-vectoring effectors with homogeneous agents. The simulation and experiment demonstrate a tilting motion during hovering for a 4-agent team.
A 9T4R RRAM-Based ACAM for Analogue Template Matching at the Edge
The continuous shift of computational bottlenecks to memory access and data transfer, especially for AI applications, creates an urgent need to re-engineer the fundamentals of computer architecture. Many edge computing applications, like wearable and implantable medical devices, pose increasing challenges to conventional computing systems due to the strict area and power requirements at the edge. Emerging technologies, like Resistive RAM (RRAM), have shown promising momentum in developing neuro-inspired analogue computing paradigms capable of achieving high classification capabilities alongside high energy efficiency. In this work, we present a novel RRAM-based Analogue Content Addressable Memory (ACAM) for on-line analogue template matching applications. This ACAM-based template matching architecture aims to achieve energy-efficient classification where low energy is of utmost importance. We showcase a highly tuneable novel RRAM-based ACAM pixel implemented using a commercial 180nm CMOS technology and in-house RRAM technology, exhibiting low energy dissipation of approximately 0.036pJ and 0.16pJ for mismatch and match, respectively, at 66MHz with a 3V supply voltage. A proof-of-concept system-level implementation based on this novel pixel design is also implemented in 180nm.
Strategic Utilization of Cellular Operator Energy Storages for Smart Grid Frequency Regulation
The innovative use of cellular operator energy storage enhances smart grid resilience and efficiency. Traditionally used to ensure uninterrupted operation of cellular base stations (BSs) during grid outages, these storages can now dynamically participate in the energy flexibility market. This dual utilization enhances the economic viability of BS storage systems and supports sustainable energy management. In this paper, we explore the potential of BS storages for supporting grid ancillary services by allocating a portion of their capacity while ensuring that Ultra-Reliable Low-Latency Communication (URLLC) requirements on delay and reliability are met. This includes feeding BS stored energy back into the grid during high-demand periods or powering BSs to regulate grid frequency. We investigate the impacts of URLLC requirements on grid frequency regulation, formulating a joint resource allocation problem. This problem maximizes total revenues of cellular networks, considering both the total sum rate in the communication network and BS storages' participation in frequency regulation, while considering battery aging and cycling constraints. Simulation results show that a network with 1500 BSs can increase power vacancy compensation from 31% to 46% by reducing reliability from 10^(-8) to 10^(-3). For a power vacancy of -30 MW, this varies from 9.3 MW to 13.5 MW, exceeding a wind turbine's capacity.
Large Synthetic Datasets for Machine Learning Applications in Power Systems
With the ongoing energy transition, power grids are evolving fast. They operate more and more often close to their technical limit, under more and more volatile conditions. Fast, essentially real-time computational approaches to evaluate their operational safety, stability and reliability are therefore highly desirable. Machine Learning methods have been advocated to solve this challenge; however, they are heavy consumers of training and testing data, while historical operational data for real-world power grids are hard if not impossible to access. This manuscript describes an algorithmic approach for generating large datasets of power injections in electric power grids. The method allows one to generate arbitrarily large time series from the knowledge of the grid -- the admittance of its lines as well as the location, type and capacity of its power generators -- and aggregated power consumption data, such as the national load data given by ENTSO-E. The obtained datasets are statistically validated against real-world data.
comment: 15 pages, 8 figures, 2 tables. Dataset available at https://zenodo.org/records/13378476
Enhanced Transformer architecture for in-context learning of dynamical systems
Recently introduced by some of the authors, the in-context identification paradigm aims at estimating, offline and based on synthetic data, a meta-model that describes the behavior of a whole class of systems. Once trained, this meta-model is fed with an observed input/output sequence (context) generated by a real system to predict its behavior in a zero-shot learning fashion. In this paper, we enhance the original meta-modeling framework through three key innovations: by formulating the learning task within a probabilistic framework; by managing non-contiguous context and query windows; and by adopting recurrent patching to effectively handle long context sequences. The efficacy of these modifications is demonstrated through a numerical example focusing on the Wiener-Hammerstein system class, highlighting the model's enhanced performance and scalability.
Simulated Eyeblink Artifact Removal with ICA: Effect of Measurement Uncertainty
Independent Component Analysis (ICA) is commonly used in electroencephalogram (EEG) signal processing to remove non-cerebral artifacts from cerebral data. Despite the ubiquity of ICA, the effect of measurement uncertainty on the artifact removal process has not been thoroughly investigated. We first characterize the measurement uncertainty distribution of a common ADC and show that it quantitatively conforms to a Gaussian distribution. We then evaluate the effect of measurement uncertainty on the artifact identification process through several computer simulations. These computer simulations evaluate the performance of two different ICA algorithms, FastICA and Infomax, in removing eyeblink artifacts from five different electrode configurations with varying levels of measurement uncertainty. FastICA and Infomax show similar performance in identifying the eyeblink artifacts for a given uncertainty level and electrode configuration. We quantify the correlation performance degradation with respect to SNR and show that in general, an SNR of greater than 15 dB results in less than a 5% degradation in performance. The biggest difference in performance between the two algorithms is in their execution time. FastICA's execution time is dependent on the amount of measurement uncertainty, with a 50% to 85% reduction in execution time over an SNR range of 20 dB. This contrasts with Infomax's execution time, which is unaffected by measurement uncertainty.
comment: 8 pages, 9 figures
Online Bandit Nonlinear Control with Dynamic Batch Length and Adaptive Learning Rate
This paper is concerned with the online bandit nonlinear control, which aims to learn the best stabilizing controller from a pool of stabilizing and destabilizing controllers of unknown types for a given nonlinear dynamical system. We develop an algorithm, named Dynamic Batch length and Adaptive learning Rate (DBAR), and study its stability and regret. Unlike the existing Exp3 algorithm requiring an exponentially stabilizing controller, DBAR only needs a significantly weaker notion of controller stability, in which case substantial time may be required to certify the system stability. Dynamic batch length in DBAR effectively addresses this issue and enables the system to attain asymptotic stability, where the algorithm behaves as if there were no destabilizing controllers. Moreover, adaptive learning rate in DBAR only uses the state norm information to achieve a tight regret bound even when none of the stabilizing controllers in the pool are exponentially stabilizing.
comment: 38 pages, 7 figures
Optimized Topology Control for IoT Networks using Graph-based Localization
The key research question we address in this paper is how local distance information can be integrated into global structure determination, in the form of network graph realization for IoT networks. IoT networks will be pervading every walk of life over the next few years with the aim of improving quality of life and enhancing surrounding living conditions, while balancing available resources, like energy and computational power. As we deal with a massive number of heterogeneous devices contributing to each IoT network, it is of paramount importance that the IoT network topology can be designed and controlled in such a way that coverage and throughput can be maximized using a minimum number of devices, while tackling challenges like poor link quality and interference. We tackle the above-mentioned problem of topology design and control through our designed graph-realization concept. End-nodes and gateways are identified and placed within neighborhood sub-graphs and their own coordinate system, which are stitched together to form the global graph. The stitching is done in a way that transmit power and information rate are optimized while reducing error probability.
Optimal Control in Both Steady State and Transient Process with Unknown Disturbances
The scheme of online optimization as a feedback controller is widely used to steer the states of a physical system to the optimal solution of a predefined optimization problem. Such methods focus on regulating the physical states to the optimal solution in the steady state, without considering the performance during the transient process. In this paper, we simultaneously consider the performance in both the steady state and the transient process of a linear time-invariant system with unknown disturbances. The performance of the transient process is illustrated by the concept of overtaking optimality. An overtaking optimal controller with known disturbances is derived to achieve the transient overtaking optimality while guaranteeing steady-state performance. Then, we propose a disturbance independent near-optimal controller, which can achieve optimal steady-state performance and approach the overtaking optimal performance in the transient process. The system performance gap between the overtaking optimal controller and the proposed controller proves to be inversely proportional to the control gains. A case study on a power system with four buses is used to validate the effectiveness of the two controllers.
A Policy Iteration Algorithm for N-player General-Sum Linear Quadratic Dynamic Games
We present a policy iteration algorithm for infinite-horizon N-player general-sum deterministic linear quadratic dynamic games and compare it to policy gradient methods. We demonstrate that the proposed policy iteration algorithm is distinct from the Gauss-Newton policy gradient method in the N-player game setting, in contrast to the single-player setting where, under a suitable choice of step size, they are equivalent. We illustrate in numerical experiments that the convergence rate of the proposed policy iteration algorithm significantly surpasses that of the Gauss-Newton policy gradient method and other policy gradient variations. Furthermore, our numerical results indicate that, compared to policy gradient methods, the convergence performance of the proposed policy iteration algorithm is less sensitive to the initial policy and changes in the number of players.
Optimization Proxies using Limited Labeled Data and Training Time -- A Semi-Supervised Bayesian Neural Network Approach
Constrained optimization problems arise in various engineering system operations such as inventory management and electric power grids. However, the requirement to repeatedly solve such optimization problems with uncertain parameters poses a significant computational challenge. This work introduces a learning scheme using Bayesian Neural Networks (BNNs) to solve constrained optimization problems under limited labeled data and restricted model training times. We propose a semi-supervised BNN for this practical but complex regime, wherein training commences in a sandwiched fashion, alternating between a supervised learning step (using labeled data) for minimizing cost, and an unsupervised learning step (using unlabeled data) for enforcing constraint feasibility. Both supervised and unsupervised steps use a Bayesian approach, where Stochastic Variational Inference is employed for approximate Bayesian inference. We show that the proposed semi-supervised learning method outperforms conventional BNN and deep neural network (DNN) architectures on important non-convex constrained optimization problems from energy network operations, achieving up to a tenfold reduction in expected maximum equality gap and halving the optimality and inequality (feasibility) gaps, without requiring any correction or projection step. By leveraging the BNN's ability to provide posterior samples at minimal computational cost, we demonstrate that a Selection via Posterior (SvP) scheme can further reduce equality gaps by more than 10%. We also provide tight and practically meaningful probabilistic confidence bounds that can be constructed using a low number of labeled testing data and readily adapted to other applications.
LEGO: QEC Decoding System Architecture for Dynamic Circuits
Quantum error correction (QEC) is a critical component of fault-tolerant quantum computing (FTQC); the QEC decoder is an important part of Classical Computing for Quantum, or C4Q. Recent years have seen fast development in real-time QEC decoders. Existing efforts to build real-time decoders have yet to achieve a critical milestone: decoding dynamic logical circuits with error-corrected readout and feed forward. Achieving this requires significant engineering effort to adapt and reconfigure the decoders during runtime, depending on the branching of the logical circuit. We present a QEC decoder architecture called LEGO, with the ambitious goal of supporting dynamic logical operations. LEGO employs a novel abstraction called the decoding block to describe the decoding problem of a dynamic logical circuit. Moreover, decoding blocks can be combined with three other ideas to improve the efficiency, accuracy and latency of the decoder. First, they provide data and task parallelism when combined with fusion-based decoding. Second, they can exploit the pipeline parallelism inside multi-stage decoders. Finally, they serve as basic units of work for computational resource management. Using decoding blocks, LEGO can be easily reconfigured to support all QEC settings and to easily accommodate innovations in three interdependent fields: code, logical operations and qubit hardware. In contrast, existing decoders are highly specialized to a specific QEC setting, which leads to redundant research and engineering efforts, slows down innovation, and further fragments the nascent quantum computing industry.
Geometric Collaborative Filtering with Convergence
Latent variable collaborative filtering methods have been a standard approach to modelling user-click interactions due to their simplicity and effectiveness. However, there is limited work on analyzing the mathematical properties of these methods, in particular on preventing overfitting towards the identity, and such methods typically utilize loss functions that overlook the geometry between items. In this work, we introduce a notion of generalization gap in collaborative filtering and analyze this with respect to latent collaborative filtering models. We present a geometric upper bound that gives rise to loss functions, and a way to meaningfully utilize the geometry of item-metadata to improve recommendations. We show how these losses can be minimized and give the recipe for a new latent collaborative filtering algorithm, which we refer to as GeoCF, due to the geometric nature of our results. We then show experimentally that our proposed GeoCF algorithm can outperform all other existing methods on the Movielens20M and Netflix datasets, as well as two large-scale internal datasets. In summary, our work proposes a theoretically sound method which paves the way to a better understanding of generalization in collaborative filtering at large.
comment: 13 pages, 1 figure, 3 tables
Online Control-Informed Learning
This paper proposes an Online Control-Informed Learning (OCIL) framework, which synthesizes well-established control theories to solve a broad class of learning and control tasks in real time. This novel integration effectively handles practical issues in machine learning such as noisy measurement data, online learning, and data efficiency. By considering any robot as a tunable optimal control system, we propose an online parameter estimator based on the extended Kalman filter (EKF) to incrementally tune the system in real time, enabling it to complete designated learning or control tasks. The proposed method also improves robustness in learning by effectively managing noise in the data. Theoretical analysis is provided to demonstrate the convergence and regret of OCIL. Three learning modes of OCIL, i.e., Online Imitation Learning, Online System Identification, and Policy Tuning On-the-fly, are investigated via experiments, which validate their effectiveness.
A Machine Learning-Based Reference Governor for Nonlinear Systems With Application to Automotive Fuel Cells
The prediction-based nonlinear reference governor (PRG) is an add-on algorithm to enforce constraints on pre-stabilized nonlinear systems by modifying, whenever necessary, the reference signal. The implementation of PRG carries a heavy computational burden, as it may require multiple numerical simulations of the plant model at each sample time. To this end, this paper proposes an alternative approach based on machine learning, where we first use a regression neural network (NN) to approximate the input-output map of the PRG from a set of training data. During the real-time operation, at each sample time, we use the trained NN to compute a nominal reference command, which may not be constraint admissible due to training errors and limited data. We adopt a novel sensitivity-based approach to minimally adjust the nominal reference while ensuring constraint enforcement. We thus refer to the resulting control strategy as the modified neural network reference governor (MNN-RG), which is significantly more computationally efficient than the PRG. The computational and theoretical properties of MNN-RG are presented. Finally, the effectiveness and limitations of the proposed method are studied by applying it as a load governor for constraint management in automotive fuel cell systems through simulation-based case studies.
Safe Reference Tracking and Collision Avoidance for Taxiing Aircraft Using an MPC-CBF Framework
In this paper, we develop a framework for the automatic taxiing of aircraft between hangar and take-off given a graph-based model of an airport. We implement a high-level path-planning algorithm that models taxiway intersections as nodes in an undirected graph, algorithmically constructs a directed graph according to the physical limitations of the aircraft, and finds the shortest valid taxi path through the directed graph using Dijkstra's algorithm. We then use this shortest path to construct a reference trajectory for the aircraft to follow that considers the turning capabilities of a given aircraft. Using high-order control barrier functions (HOCBFs), we construct safety conditions for multi-obstacle avoidance and safe reference tracking for simple 2D unicycle dynamics with acceleration control inputs. We then use these safety conditions to design an MPC-CBF framework that tracks the reference trajectory while adhering to the safety constraints. We compare the performance of our MPC-CBF controller with a PID-CBF control method via simulations.
comment: This work is under review to be presented at the 2025 American Control Conference
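The high-level planning step in the abstract above (shortest valid taxi path through a directed graph using Dijkstra's algorithm) can be sketched as follows. This is a textbook Dijkstra implementation on a small, made-up graph; the node names, edge costs, and the function name `shortest_path` are assumptions for illustration and do not come from the paper.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a directed graph with non-negative costs.

    graph: dict mapping node -> list of (neighbor, cost) directed edges.
    Returns (path, cost), or (None, inf) if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == target:
            break
        for neighbor, cost in graph.get(node, []):
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None, float("inf")
    # Walk predecessor links back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Hypothetical directed taxiway graph; edges encode only the turns
# an aircraft could physically make, costs are distances.
taxiways = {
    "hangar": [("A", 2.0), ("B", 5.0)],
    "A": [("B", 1.0), ("runway", 6.0)],
    "B": [("runway", 2.0)],
    "runway": [],
}
route, length = shortest_path(taxiways, "hangar", "runway")
```

Encoding turning limits as missing directed edges keeps the search itself unchanged, which matches the abstract's description of building the directed graph first and then running a standard shortest-path search.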
Collaborative Safety-Critical Formation Control with Obstacle Avoidance
This work explores a collaborative method for ensuring safety in multi-agent formation control problems. We formulate a control barrier function (CBF) based safety filter control law for a generic distributed formation controller and extend our previously developed collaborative safety framework to an obstacle avoidance problem for agents with acceleration control inputs. We then incorporate multi-obstacle collision avoidance into the collaborative safety framework. This framework includes a method for computing the maximum capability of agents to satisfy their individual safety requirements. We analyze the convergence rate of our collaborative safety algorithm, and prove the linear-time convergence of cooperating agents to a jointly feasible safe action for all agents under the special case of a tree-structured communication network with a single obstacle for each agent. We illustrate the analytical results via simulation on a mass-spring kinematics-based formation controller and demonstrate the finite-time convergence of the collaborative safety algorithm in the simple proven case, the more general case of a fully-connected system with multiple static obstacles, and with dynamic obstacles.
comment: This work is under review for publication in Automatica. arXiv admin note: text overlap with arXiv:2311.11156
Universal Global State Estimation for Inertial Navigation Systems
This paper addresses the problem of accurate pose estimation (position, velocity, and orientation) for a rigid body. By utilizing generic exteroceptive measurements in combination with an Inertial Measurement Unit (IMU), we reformulate the vehicle's dynamics and outputs to fit within a linear time-varying (LTV) framework. This transformation enables the application of a linear continuous-time Kalman filter, thereby avoiding the complexities of nonlinear estimators and local Kalman-type filtering methods (e.g., EKF). We perform a complete uniform observability analysis for key benchmark problems (e.g., GPS-INS and Landmark-INS) and derive sufficient conditions for ensuring global uniform exponential stability. Simulations are conducted for two practical applications: stereo-aided inertial navigation systems (INS) with both constant and time-varying gains, as well as GPS-aided INS. The proposed approach notably simplifies observer design for INS.
comment: 8 pages
Sim-to-Real Multirotor Controller Single-shot Learning
This paper demonstrates the sim-to-real capabilities of retrospective cost optimization-based adaptive control for multirotor stabilization and trajectory-tracking problems. First, a continuous-time version of the widely used discrete-time retrospective cost adaptive control algorithm is developed. Next, a computationally inexpensive 12-degree-of-freedom model of a multirotor is used to learn the control system in a simulation environment with a single trajectory. Finally, the performance of the learned controller is verified in a complex and realistic multirotor model in simulation and with a physical quadcopter in a waypoint command and a helical trajectory command.
Enhanced Digital Twin for Human-Centric and Integrated Lighting Asset Management in Public Libraries: From Corrective to Predictive Maintenance
Lighting asset management in public libraries has traditionally been reactive, focusing on corrective maintenance, addressing issues only when failures occur. Although standards now encourage preventive measures, such as incorporating a maintenance factor, the broader goal of human centric, sustainable lighting systems requires a shift toward predictive maintenance strategies. This study introduces an enhanced digital twin model designed for the proactive management of lighting assets in public libraries. By integrating descriptive, diagnostic, predictive, and prescriptive analytics, the model enables a comprehensive, multilevel view of asset health. The proposed framework supports both preventive and predictive maintenance strategies, allowing for early detection of issues and the timely resolution of potential failures. In addition to the specific application for lighting systems, the design is adaptable for other building assets, providing a scalable solution for integrated asset management in various public spaces.
Probabilistic forecasting of power system imbalance using neural network-based ensembles
Keeping the balance between electricity generation and consumption is becoming increasingly challenging and costly, mainly due to the rising share of renewables, electric vehicles and heat pumps and electrification of industrial processes. Accurate imbalance forecasts, along with reliable uncertainty estimations, enable transmission system operators (TSOs) to dispatch appropriate reserve volumes, reducing balancing costs. Further, market parties can use these probabilistic forecasts to design strategies that exploit asset flexibility to help balance the grid, generating revenue with known risks. Despite its importance, literature regarding system imbalance (SI) forecasting is limited. Further, existing methods do not focus on situations with high imbalance magnitude, which are crucial to forecast accurately for both TSOs and market parties. Hence, we propose an ensemble of C-VSNs, which are our adaptation of variable selection networks (VSNs). Each minute, our model predicts the imbalance of the current and upcoming two quarter-hours, along with uncertainty estimations on these forecasts. We evaluate our approach by forecasting the imbalance of Belgium, where high imbalance magnitude is defined as $|\mathrm{SI}| > 500\,\mathrm{MW}$ (occurs 1.3% of the time in Belgium). For high imbalance magnitude situations, our model outperforms the state-of-the-art by 23.4% (in terms of continuous ranked probability score (CRPS), which evaluates probabilistic forecasts), while also attaining a 6.5% improvement in overall CRPS. Similar improvements are achieved in terms of root-mean-squared error. Additionally, we developed a fine-tuning methodology to effectively include new inputs with limited history in our model. This work was performed in collaboration with Elia (the Belgian TSO) to further improve their imbalance forecasts, demonstrating the relevance of our work.
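For readers unfamiliar with the continuous ranked probability score used in the abstract above, the sketch below computes the standard sample-based CRPS estimate for an ensemble forecast, CRPS ≈ E|X − y| − ½ E|X − X′|. How the paper itself computes CRPS is not stated in the abstract; the function name `crps_ensemble` and the example numbers are assumptions for illustration.

```python
def crps_ensemble(samples, observation):
    """Sample-based CRPS estimate for an ensemble forecast (lower is better).

    samples:     forecast ensemble members (e.g. imbalance in MW)
    observation: the value that was actually observed
    """
    n = len(samples)
    # Mean absolute error of the ensemble members against the observation.
    accuracy = sum(abs(x - observation) for x in samples) / n
    # Half the mean absolute difference between all pairs of members.
    spread = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return accuracy - spread


# Hypothetical two-member ensemble bracketing the observed value.
score = crps_ensemble([0.0, 2.0], 1.0)

# With a single member the score reduces to the absolute error.
point_score = crps_ensemble([3.0], 5.0)
```

The double sum is quadratic in the ensemble size, which is acceptable for small ensembles; sorting-based formulas exist for larger ones.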
Parallelized Robust Distributed Model Predictive Control in the Presence of Coupled State Constraints
In this paper, we present a robust distributed model predictive control (DMPC) scheme for dynamically decoupled nonlinear systems which are subject to state constraints, coupled state constraints and input constraints. In the proposed control scheme, all subsystems solve their local optimization problem in parallel and neighbor-to-neighbor communication suffices. The approach relies on consistency constraints which define a neighborhood around each subsystem's reference trajectory in which the state of the subsystem is guaranteed to stay. Reference trajectories and consistency constraints are known to neighboring subsystems. Contrary to other relevant approaches, the reference trajectories are improved consecutively. The presented approach allows the formulation of convex optimization problems for systems with linear dynamics even in the presence of non-convex state constraints. Additionally, we employ tubes in order to ensure the controller's robustness against bounded uncertainties. In the end, we briefly comment on an iterative extension of the DMPC scheme. The effectiveness of the proposed DMPC scheme and its iterative extension is demonstrated in simulations.
comment: 16 pages, 5 figures, preprint to be published in Automatica
Data-Enabled Policy Optimization for Direct Adaptive Learning of the LQR
Direct data-driven design methods for the linear quadratic regulator (LQR) mainly use offline or episodic data batches, and their online adaptation has been acknowledged as an open problem. In this paper, we propose a direct adaptive method to learn the LQR from online closed-loop data. First, we propose a new policy parameterization based on the sample covariance to formulate a direct data-driven LQR problem, which is shown to be equivalent to the certainty-equivalence LQR with optimal non-asymptotic guarantees. Second, we design a novel data-enabled policy optimization (DeePO) method to directly update the policy, where the gradient is explicitly computed using only a batch of persistently exciting (PE) data. Third, we establish its global convergence via a projected gradient dominance property. Importantly, we efficiently use DeePO to adaptively learn the LQR by performing only one-step projected gradient descent per sample of the closed-loop system, which also leads to an explicit recursive update of the policy. Under PE inputs and for bounded noise, we show that the average regret of the LQR cost is upper-bounded by two terms signifying a sublinear decrease in time $\mathcal{O}(1/\sqrt{T})$ plus a bias scaling inversely with signal-to-noise ratio (SNR), which are independent of the noise statistics. Finally, we perform simulations to validate the theoretical results and demonstrate the computational and sample efficiency of our method.
comment: Submitted to IEEE TAC
Distributed Data-driven Unknown-input Observers for State Estimation
Unknown inputs related to, e.g., sensor aging, modeling errors, or device bias, represent a major concern in wireless sensor networks, as they degrade the state estimation performance. To improve the performance, unknown-input observers (UIOs) have been proposed. Most of the results available to design UIOs are based on explicit system models, which can be difficult or impossible to obtain in real-world applications. Data-driven techniques, on the other hand, have become a viable alternative for the design and analysis of unknown systems using only data. In this context, a novel data-driven distributed unknown-input observer (D-DUIO) for unknown continuous-time linear time-invariant (LTI) systems is developed, which requires only some data collected offline, without any prior knowledge of the system matrices. In the paper, first, a model-based approach to the design of a DUIO is presented. A sufficient condition for the existence of such a DUIO is recalled, and a new one is proposed that is amenable to a data-driven adaptation. Moving to a data-driven approach, it is shown that under suitable assumptions on the input/output/state data collected from the continuous-time system, it is possible both to claim the existence of a D-DUIO and to derive its matrices in terms of the matrices of pre-collected data. Finally, the efficacy of the D-DUIO is illustrated by means of numerical examples.
Quantifying Aleatoric and Epistemic Dynamics Uncertainty via Local Conformal Calibration
Whether learned, simulated, or analytical, approximations of a robot's dynamics can be inaccurate when encountering novel environments. Many approaches have been proposed to quantify the aleatoric uncertainty of such methods, i.e., uncertainty resulting from stochasticity. However, these estimates alone are not enough to properly estimate the uncertainty of a model in a novel environment, where the actual dynamics can change. Such changes can induce epistemic uncertainty, i.e., uncertainty due to a lack of information/data. Accounting for both epistemic and aleatoric dynamics uncertainty in a theoretically-grounded way remains an open problem. We introduce Local Uncertainty Conformal Calibration (LUCCa), a conformal prediction-based approach that calibrates the aleatoric uncertainty estimates provided by dynamics models to generate probabilistically-valid prediction regions of the system's state. We account for both epistemic and aleatoric uncertainty non-asymptotically, without strong assumptions about the form of the true dynamics or how it changes. The calibration is performed locally in the state-action space, leading to uncertainty estimates that are useful for planning. We validate our method by constructing probabilistically-safe plans for a double-integrator under significant changes in dynamics.
comment: Accepted to the 16th International Workshop on the Algorithmic Foundations of Robotics (WAFR) 2024
SustainDC -- Benchmarking for Sustainable Data Center Control NeurIPS 2024
Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for the effects of each other. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements. Our results highlight significant opportunities for improvement of data center operations using MARL algorithms. Given the increasing use of DC due to AI, SustainDC provides a crucial platform for the development and benchmarking of advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
comment: Under review at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
Systems and Control (EESS)
On the Cost of Consecutive Estimation Error: Significance-Aware Non-linear Aging
This paper considers the semantics-aware remote state estimation of an asymmetric Markov chain with prioritized states. Due to resource constraints, the sensor needs to trade between estimation quality and communication cost. The aim is to exploit the significance of information through the history of system realizations to determine the optimal timing of transmission, thereby reducing the amount of uninformative data transmitted in the network. To this end, we introduce a new metric, the significance-aware Age of Consecutive Error (AoCE), that captures two semantic attributes: the significance of estimation error and the cost of consecutive error. Different costs and non-linear age functions are assigned to different estimation errors to account for their relative importance to system performance. We identify the optimal transmission problem as a countably infinite state Markov decision process (MDP) with unbounded costs. We first give sufficient conditions on the age functions, source pattern, and channel reliability so that an optimal policy exists to have bounded average costs. We show that the optimal policy exhibits a switching structure. That is, the sensor triggers a transmission only when the system has been trapped in an error for a certain number of consecutive time slots. We also provide sufficient conditions under which the switching policy degenerates into a simple threshold policy, i.e., featuring identical thresholds for all estimation errors. Furthermore, we exploit the structural properties and develop a structured policy iteration (SPI) algorithm that considerably reduces computation overhead. Numerical results show that the optimal policy outperforms the classic rule-, distortion- and age-based policies. An important takeaway is that the more semantic attributes we utilize, the fewer transmissions are needed.
comment: This paper has been submitted for possible publication
HiL Demonstration of Online Battery Capacity and Impedance Estimation with Minimal a Priori Parametrization Effort
Uncertainty in the aging of batteries in battery electric vehicles impacts both the daily driving range as well as the expected economic lifetime. This paper presents a method to determine online the capacity and internal resistance of a battery cell based on real-world data. The method, based on a Joint Extended Kalman Filter combined with Recursive Least Squares, is computationally efficient and does not a priori require a fully characterized cell model. Offline simulation of the algorithm on data from differently aged cells shows convergence of the algorithm and indicates that capacity and resistance follow the expected trends. Furthermore, the algorithm is tested online on a Hardware-in-the-Loop setup to demonstrate real-time parameter updates in a realistic driving scenario.
comment: 6 pages, 9 figures, to be presented at VPPC 2024
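To make the recursive least squares ingredient named in the abstract above more concrete, here is a minimal scalar sketch. It is not the paper's Joint Extended Kalman Filter combined with RLS: it assumes, purely for illustration, that the internal resistance R is identified from pairs of current and voltage drop satisfying drop = R * current, and the function name, forgetting factor and initial covariance are hypothetical choices.

```python
def rls_resistance(currents, voltage_drops, forgetting=0.99, p0=1e3):
    """Scalar recursive least squares estimate of R in drop = R * current.

    forgetting: forgetting factor in (0, 1]; values below 1 discount old data.
    p0: initial covariance; a large value means little trust in the initial guess.
    """
    theta = 0.0  # current resistance estimate (ohm), assumed initial guess
    p = p0       # scalar covariance of the estimate
    for i_k, y_k in zip(currents, voltage_drops):
        # Gain: how strongly the new sample corrects the estimate.
        gain = p * i_k / (forgetting + i_k * p * i_k)
        # Correct the estimate using the prediction error of this sample.
        theta += gain * (y_k - i_k * theta)
        # Shrink the covariance, then inflate it slightly via the forgetting factor.
        p = (p - gain * i_k * p) / forgetting
    return theta


# Hypothetical noise-free data generated with R = 0.05 ohm.
currents = [1.0, 2.0, -1.5, 3.0]
drops = [0.05 * i for i in currents]
print(rls_resistance(currents, drops))
```

Each update costs a handful of scalar operations and needs no stored history, which is the property that makes recursive estimators attractive for online use on embedded hardware.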
Attainable Force Approximation and Full-Pose Tracking Control of an Over-Actuated Thrust-Vectoring Modular Team UAV
Traditional vertical take-off and landing (VTOL) aircraft cannot achieve optimal efficiency for various payload weights and have limited mobility due to their under-actuation. With the thrust-vectoring mechanism, the proposed modular team UAV is fully actuated at certain attitudes. However, the attainable force space (AFS) differs according to the team configuration, which makes the controller design difficult. We propose an approximation to the AFS and a full-pose tracking controller with an attitude planner and a force projection, which guarantees that the control force is feasible. The proposed approach can be applied to UAVs having multiple thrust-vectoring effectors with homogeneous agents. The simulation and experiment demonstrate a tilting motion during hovering for a 4-agent team.
A 9T4R RRAM-Based ACAM for Analogue Template Matching at the Edge
The continuous shift of computational bottlenecks to memory access and data transfer, especially for AI applications, creates an urgent need to re-engineer the fundamentals of computer architecture. Many edge computing applications, like wearable and implantable medical devices, pose increasing challenges to conventional computing systems due to the strict area and power requirements at the edge. Emerging technologies, like Resistive RAM (RRAM), have shown promising momentum in developing neuro-inspired analogue computing paradigms capable of achieving high classification capabilities alongside high energy efficiency. In this work, we present a novel RRAM-based Analogue Content Addressable Memory (ACAM) for on-line analogue template matching applications. This ACAM-based template matching architecture aims to achieve energy-efficient classification where low energy is of utmost importance. We showcase a highly tuneable novel RRAM-based ACAM pixel implemented using a commercial 180nm CMOS technology and in-house RRAM technology, exhibiting low energy dissipation of approximately 0.036pJ and 0.16pJ for mismatch and match, respectively, at 66MHz with a 3V supply voltage. A proof-of-concept system-level implementation based on this novel pixel design is also implemented in 180nm.
Strategic Utilization of Cellular Operator Energy Storages for Smart Grid Frequency Regulation
The innovative use of cellular operator energy storage enhances smart grid resilience and efficiency. Traditionally used to ensure uninterrupted operation of cellular base stations (BSs) during grid outages, these storages can now dynamically participate in the energy flexibility market. This dual utilization enhances the economic viability of BS storage systems and supports sustainable energy management. In this paper, we explore the potential of BS storages for supporting grid ancillary services by allocating a portion of their capacity while ensuring Ultra-Reliable Low-Latency Communication (URLLC) requirements, i.e., delay and reliability requirements. This includes feeding BS stored energy back into the grid during high-demand periods or powering BSs to regulate grid frequency. We investigate the impacts of URLLC requirements on grid frequency regulation, formulating a joint resource allocation problem. This problem maximizes total revenues of cellular networks, considering both the total sum rate in the communication network and the participation of BS storages in frequency regulation, while accounting for battery aging and cycling constraints. Simulation results show that a network with 1500 BSs can increase power vacancy compensation from 31% to 46% by reducing reliability from 10^(-8) to 10^(-3). For a power vacancy of -30 MW, this varies from 9.3 MW to 13.5 MW, exceeding a wind turbine's capacity.
Large Synthetic Datasets for Machine Learning Applications in Power Systems
With the ongoing energy transition, power grids are evolving fast. They operate more and more often close to their technical limit, under more and more volatile conditions. Fast, essentially real-time computational approaches to evaluate their operational safety, stability and reliability are therefore highly desirable. Machine Learning methods have been advocated to solve this challenge. However, they are heavy consumers of training and testing data, while historical operational data for real-world power grids are hard if not impossible to access. This manuscript describes an algorithmic approach for generating large datasets of power injections in electric power grids. The method allows one to generate arbitrarily large time series from the knowledge of the grid -- the admittance of its lines as well as the location, type and capacity of its power generators -- and aggregated power consumption data, such as the national load data given by ENTSO-E. The obtained datasets are statistically validated against real-world data.
comment: 15 pages, 8 figures, 2 tables. Dataset available at https://zenodo.org/records/13378476
Enhanced Transformer architecture for in-context learning of dynamical systems
Recently introduced by some of the authors, the in-context identification paradigm aims at estimating, offline and based on synthetic data, a meta-model that describes the behavior of a whole class of systems. Once trained, this meta-model is fed with an observed input/output sequence (context) generated by a real system to predict its behavior in a zero-shot learning fashion. In this paper, we enhance the original meta-modeling framework through three key innovations: by formulating the learning task within a probabilistic framework; by managing non-contiguous context and query windows; and by adopting recurrent patching to effectively handle long context sequences. The efficacy of these modifications is demonstrated through a numerical example focusing on the Wiener-Hammerstein system class, highlighting the model's enhanced performance and scalability.
Simulated Eyeblink Artifact Removal with ICA: Effect of Measurement Uncertainty
Independent Component Analysis (ICA) is commonly used in electroencephalogram (EEG) signal processing to remove non-cerebral artifacts from cerebral data. Despite the ubiquity of ICA, the effect of measurement uncertainty on the artifact removal process has not been thoroughly investigated. We first characterize the measurement uncertainty distribution of a common ADC and show that it quantitatively conforms to a Gaussian distribution. We then evaluate the effect of measurement uncertainty on the artifact identification process through several computer simulations. These computer simulations evaluate the performance of two different ICA algorithms, FastICA and Infomax, in removing eyeblink artifacts from five different electrode configurations with varying levels of measurement uncertainty. FastICA and Infomax show similar performance in identifying the eyeblink artifacts for a given uncertainty level and electrode configuration. We quantify the correlation performance degradation with respect to SNR and show that in general, an SNR of greater than 15 dB results in less than a 5% degradation in performance. The biggest difference in performance between the two algorithms is in their execution time. FastICA's execution time is dependent on the amount of measurement uncertainty, with a 50% to 85% reduction in execution time over an SNR range of 20 dB. This contrasts with Infomax's execution time, which is unaffected by measurement uncertainty.
comment: 8 pages, 9 figures
Online Bandit Nonlinear Control with Dynamic Batch Length and Adaptive Learning Rate
This paper is concerned with online bandit nonlinear control, which aims to learn the best stabilizing controller from a pool of stabilizing and destabilizing controllers of unknown types for a given nonlinear dynamical system. We develop an algorithm, named Dynamic Batch length and Adaptive learning Rate (DBAR), and study its stability and regret. Unlike the existing Exp3 algorithm, which requires an exponentially stabilizing controller, DBAR only needs a significantly weaker notion of controller stability, in which case substantial time may be required to certify the system stability. Dynamic batch length in DBAR effectively addresses this issue and enables the system to attain asymptotic stability, where the algorithm behaves as if there were no destabilizing controllers. Moreover, adaptive learning rate in DBAR only uses the state norm information to achieve a tight regret bound even when none of the stabilizing controllers in the pool are exponentially stabilizing.
comment: 38 pages, 7 figures
Optimized Topology Control for IoT Networks using Graph-based Localization
The key research question we address in this paper is how local distance information can be integrated into global structure determination, in the form of network graph realization for IoT networks. IoT networks will pervade every walk of life over the next few years with the aim of improving quality of life and enhancing surrounding living conditions, while balancing available resources, like energy and computational power. As a massive number of heterogeneous devices contribute to each IoT network, it is of paramount importance that the IoT network topology can be designed and controlled in such a way that coverage and throughput are maximized using a minimum number of devices, while tackling challenges like poor link quality and interference. We tackle this problem of topology design and control through our graph-realization concept. End-nodes and gateways are identified and placed within neighborhood sub-graphs, each with its own coordinate system, which are stitched together to form the global graph. The stitching is done in a way that optimizes transmit power and information rate while reducing error probability.
Optimal Control in Both Steady State and Transient Process with Unknown Disturbances
The scheme of online optimization as a feedback controller is widely used to steer the states of a physical system to the optimal solution of a predefined optimization problem. Such methods focus on regulating the physical states to the optimal solution in the steady state, without considering the performance during the transient process. In this paper, we simultaneously consider the performance in both the steady state and the transient process of a linear time-invariant system with unknown disturbances. The performance of the transient process is illustrated by the concept of overtaking optimality. An overtaking optimal controller with known disturbances is derived to achieve the transient overtaking optimality while guaranteeing steady-state performance. Then, we propose a disturbance independent near-optimal controller, which can achieve optimal steady-state performance and approach the overtaking optimal performance in the transient process. The system performance gap between the overtaking optimal controller and the proposed controller proves to be inversely proportional to the control gains. A case study on a power system with four buses is used to validate the effectiveness of the two controllers.
A Policy Iteration Algorithm for N-player General-Sum Linear Quadratic Dynamic Games
We present a policy iteration algorithm for infinite-horizon N-player general-sum deterministic linear quadratic dynamic games and compare it to policy gradient methods. We demonstrate that the proposed policy iteration algorithm is distinct from the Gauss-Newton policy gradient method in the N-player game setting, in contrast to the single-player setting where, under a suitable choice of step size, they are equivalent. We illustrate in numerical experiments that the convergence rate of the proposed policy iteration algorithm significantly surpasses that of the Gauss-Newton policy gradient method and other policy gradient variations. Furthermore, our numerical results indicate that, compared to policy gradient methods, the convergence performance of the proposed policy iteration algorithm is less sensitive to the initial policy and changes in the number of players.
Optimization Proxies using Limited Labeled Data and Training Time -- A Semi-Supervised Bayesian Neural Network Approach
Constrained optimization problems arise in various engineering system operations such as inventory management and electric power grids. However, the requirement to repeatedly solve such optimization problems with uncertain parameters poses a significant computational challenge. This work introduces a learning scheme using Bayesian Neural Networks (BNNs) to solve constrained optimization problems under limited labeled data and restricted model training times. We propose a semi-supervised BNN for this practical but complex regime, wherein training commences in a sandwiched fashion, alternating between a supervised learning step (using labeled data) for minimizing cost, and an unsupervised learning step (using unlabeled data) for enforcing constraint feasibility. Both supervised and unsupervised steps use a Bayesian approach, where Stochastic Variational Inference is employed for approximate Bayesian inference. We show that the proposed semi-supervised learning method outperforms conventional BNN and deep neural network (DNN) architectures on important non-convex constrained optimization problems from energy network operations, achieving up to a tenfold reduction in expected maximum equality gap and halving the optimality and inequality (feasibility) gaps, without requiring any correction or projection step. By leveraging the BNN's ability to provide posterior samples at minimal computational cost, we demonstrate that a Selection via Posterior (SvP) scheme can further reduce equality gaps by more than 10%. We also provide tight and practically meaningful probabilistic confidence bounds that can be constructed using a low number of labeled testing data and readily adapted to other applications.
LEGO: QEC Decoding System Architecture for Dynamic Circuits
Quantum error correction (QEC) is a critical component of fault-tolerant quantum computing (FTQC); the QEC decoder is an important part of Classical Computing for Quantum, or C4Q. Recent years have seen fast development in real-time QEC decoders. Existing efforts to build real-time decoders have yet to achieve a critical milestone: decoding dynamic logical circuits with error-corrected readout and feed forward. Achieving this requires significant engineering effort to adapt and reconfigure the decoders during runtime, depending on the branching of the logical circuit. We present a QEC decoder architecture called LEGO, with the ambitious goal of supporting dynamic logical operations. LEGO employs a novel abstraction called the decoding block to describe the decoding problem of a dynamic logical circuit. Moreover, decoding blocks can be combined with three other ideas to improve the efficiency, accuracy and latency of the decoder. First, they provide data and task parallelism when combined with fusion-based decoding. Second, they can exploit the pipeline parallelism inside multi-stage decoders. Finally, they serve as basic units of work for computational resource management. Using decoding blocks, LEGO can be easily reconfigured to support all QEC settings and to accommodate innovations in three interdependent fields: code, logical operations and qubit hardware. In contrast, existing decoders are highly specialized to a specific QEC setting, which leads to redundant research and engineering efforts, slows down innovation, and further fragments the nascent quantum computing industry.
Geometric Collaborative Filtering with Convergence
Latent variable collaborative filtering methods have been a standard approach to modelling user-click interactions due to their simplicity and effectiveness. However, there is limited work on analyzing the mathematical properties of these methods, in particular on preventing overfitting towards the identity, and such methods typically utilize loss functions that overlook the geometry between items. In this work, we introduce a notion of generalization gap in collaborative filtering and analyze this with respect to latent collaborative filtering models. We present a geometric upper bound that gives rise to loss functions, and a way to meaningfully utilize the geometry of item-metadata to improve recommendations. We show how these losses can be minimized and give the recipe for a new latent collaborative filtering algorithm, which we refer to as GeoCF, due to the geometric nature of our results. We then show experimentally that our proposed GeoCF algorithm can outperform all other existing methods on the Movielens20M and Netflix datasets, as well as two large-scale internal datasets. In summary, our work proposes a theoretically sound method which paves the way to a better understanding of generalization in collaborative filtering at large.
comment: 13 pages, 1 figure, 3 tables
Online Control-Informed Learning
This paper proposes an Online Control-Informed Learning (OCIL) framework, which synthesizes well-established control theories to solve a broad class of learning and control tasks in real time. This novel integration effectively handles practical issues in machine learning such as noisy measurement data, online learning, and data efficiency. By considering any robot as a tunable optimal control system, we propose an online parameter estimator based on the extended Kalman filter (EKF) to incrementally tune the system in real time, enabling it to complete designated learning or control tasks. The proposed method also improves robustness in learning by effectively managing noise in the data. Theoretical analysis is provided to demonstrate the convergence and regret of OCIL. Three learning modes of OCIL, i.e., Online Imitation Learning, Online System Identification, and Policy Tuning On-the-fly, are investigated via experiments, which validate their effectiveness.
A Machine Learning-Based Reference Governor for Nonlinear Systems With Application to Automotive Fuel Cells
The prediction-based nonlinear reference governor (PRG) is an add-on algorithm to enforce constraints on pre-stabilized nonlinear systems by modifying, whenever necessary, the reference signal. The implementation of PRG carries a heavy computational burden, as it may require multiple numerical simulations of the plant model at each sample time. To this end, this paper proposes an alternative approach based on machine learning, where we first use a regression neural network (NN) to approximate the input-output map of the PRG from a set of training data. During the real-time operation, at each sample time, we use the trained NN to compute a nominal reference command, which may not be constraint admissible due to training errors and limited data. We adopt a novel sensitivity-based approach to minimally adjust the nominal reference while ensuring constraint enforcement. We thus refer to the resulting control strategy as the modified neural network reference governor (MNN-RG), which is significantly more computationally efficient than the PRG. The computational and theoretical properties of MNN-RG are presented. Finally, the effectiveness and limitations of the proposed method are studied by applying it as a load governor for constraint management in automotive fuel cell systems through simulation-based case studies.
Safe Reference Tracking and Collision Avoidance for Taxiing Aircraft Using an MPC-CBF Framework
In this paper, we develop a framework for the automatic taxiing of aircraft between hangar and take-off given a graph-based model of an airport. We implement a high-level path-planning algorithm that models taxiway intersections as nodes in an undirected graph, algorithmically constructs a directed graph according to the physical limitations of the aircraft, and finds the shortest valid taxi path through the directed graph using Dijkstra's algorithm. We then use this shortest path to construct a reference trajectory for the aircraft to follow that considers the turning capabilities of a given aircraft. Using high-order control barrier functions (HOCBFs), we construct safety conditions for multi-obstacle avoidance and safe reference tracking for simple 2D unicycle dynamics with acceleration control inputs. We then use these safety conditions to design an MPC-CBF framework that tracks the reference trajectory while adhering to the safety constraints. We compare the performance of our MPC-CBF controller with a PID-CBF control method via simulations.
comment: This work is under review to be presented at the 2025 American Control Conference
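The high-level planning step described in the abstract above, finding the shortest valid taxi path through a directed graph with Dijkstra's algorithm, can be sketched as follows. This is a generic textbook implementation under stated assumptions, not the authors' code: the node names, edge lengths and the `EXAMPLE_AIRPORT` graph are hypothetical, and the paper's procedure for deriving the directed graph from aircraft turning limits is not reproduced.

```python
import heapq

# Hypothetical directed taxiway graph: node -> {successor: segment length in metres}.
EXAMPLE_AIRPORT = {
    "hangar": {"A": 100.0, "B": 50.0},
    "A": {"runway": 200.0},
    "B": {"A": 30.0, "runway": 300.0},
    "runway": {},
}


def shortest_taxi_path(graph, source, target):
    """Dijkstra's algorithm; returns (path, length) or (None, inf) if unreachable."""
    dist = {source: 0.0}   # best known distance to each discovered node
    prev = {}              # predecessor on the best known path
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry for an already settled node
        visited.add(node)
        if node == target:
            break
        for nbr, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(nbr, float("inf")):
                dist[nbr] = candidate
                prev[nbr] = node
                heapq.heappush(heap, (candidate, nbr))
    if target not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


print(shortest_taxi_path(EXAMPLE_AIRPORT, "hangar", "runway"))
```

Because the edges are directed, a manoeuvre the aircraft cannot perform is represented simply by omitting the corresponding edge, so the search itself needs no aircraft-specific logic.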
Collaborative Safety-Critical Formation Control with Obstacle Avoidance
This work explores a collaborative method for ensuring safety in multi-agent formation control problems. We formulate a control barrier function (CBF) based safety filter control law for a generic distributed formation controller and extend our previously developed collaborative safety framework to an obstacle avoidance problem for agents with acceleration control inputs. We then incorporate multi-obstacle collision avoidance into the collaborative safety framework. This framework includes a method for computing the maximum capability of agents to satisfy their individual safety requirements. We analyze the convergence rate of our collaborative safety algorithm, and prove the linear-time convergence of cooperating agents to a jointly feasible safe action for all agents under the special case of a tree-structured communication network with a single obstacle for each agent. We illustrate the analytical results via simulation on a mass-spring kinematics-based formation controller and demonstrate the finite-time convergence of the collaborative safety algorithm in the simple proven case, the more general case of a fully-connected system with multiple static obstacles, and with dynamic obstacles.
comment: This work is under review for publication in Automatica. arXiv admin note: text overlap with arXiv:2311.11156
Universal Global State Estimation for Inertial Navigation Systems
This paper addresses the problem of accurate pose estimation (position, velocity, and orientation) for a rigid body. By utilizing generic exteroceptive measurements in combination with an Inertial Measurement Unit (IMU), we reformulate the vehicle's dynamics and outputs to fit within a linear time-varying (LTV) framework. This transformation enables the application of a linear continuous-time Kalman filter, thereby avoiding the complexities of nonlinear estimators and local Kalman-type filtering methods (e.g., EKF). We perform a complete uniform observability analysis for key benchmark problems (e.g., GPS-INS and Landmark-INS) and derive sufficient conditions for ensuring global uniform exponential stability. Simulations are conducted for two practical applications: stereo-aided inertial navigation systems (INS) with both constant and time-varying gains, as well as GPS-aided INS. The proposed approach notably simplifies observer design for INS.
comment: 8 pages
Sim-to-Real Multirotor Controller Single-shot Learning
This paper demonstrates the sim-to-real capabilities of retrospective cost optimization-based adaptive control for multirotor stabilization and trajectory-tracking problems. First, a continuous-time version of the widely used discrete-time retrospective cost adaptive control algorithm is developed. Next, a computationally inexpensive 12-degree-of-freedom model of a multirotor is used to learn the control system in a simulation environment with a single trajectory. Finally, the performance of the learned controller is verified on a complex and realistic multirotor model in simulation and on a physical quadcopter, using a waypoint command and a helical trajectory command.
Enhanced Digital Twin for Human-Centric and Integrated Lighting Asset Management in Public Libraries: From Corrective to Predictive Maintenance
Lighting asset management in public libraries has traditionally been reactive, focusing on corrective maintenance that addresses issues only when failures occur. Although standards now encourage preventive measures, such as incorporating a maintenance factor, the broader goal of human-centric, sustainable lighting systems requires a shift toward predictive maintenance strategies. This study introduces an enhanced digital twin model designed for the proactive management of lighting assets in public libraries. By integrating descriptive, diagnostic, predictive, and prescriptive analytics, the model enables a comprehensive, multilevel view of asset health. The proposed framework supports both preventive and predictive maintenance strategies, allowing for early detection of issues and the timely resolution of potential failures. In addition to the specific application for lighting systems, the design is adaptable for other building assets, providing a scalable solution for integrated asset management in various public spaces.
Probabilistic forecasting of power system imbalance using neural network-based ensembles
Keeping the balance between electricity generation and consumption is becoming increasingly challenging and costly, mainly due to the rising share of renewables, electric vehicles and heat pumps and electrification of industrial processes. Accurate imbalance forecasts, along with reliable uncertainty estimations, enable transmission system operators (TSOs) to dispatch appropriate reserve volumes, reducing balancing costs. Further, market parties can use these probabilistic forecasts to design strategies that exploit asset flexibility to help balance the grid, generating revenue with known risks. Despite its importance, literature regarding system imbalance (SI) forecasting is limited. Further, existing methods do not focus on situations with high imbalance magnitude, which are crucial to forecast accurately for both TSOs and market parties. Hence, we propose an ensemble of C-VSNs, which are our adaptation of variable selection networks (VSNs). Each minute, our model predicts the imbalance of the current and upcoming two quarter-hours, along with uncertainty estimations on these forecasts. We evaluate our approach by forecasting the imbalance of Belgium, where high imbalance magnitude is defined as $|$SI$| > 500\,$MW (occurs 1.3% of the time in Belgium). For high imbalance magnitude situations, our model outperforms the state-of-the-art by 23.4% (in terms of continuous ranked probability score (CRPS), which evaluates probabilistic forecasts), while also attaining a 6.5% improvement in overall CRPS. Similar improvements are achieved in terms of root-mean-squared error. Additionally, we developed a fine-tuning methodology to effectively include new inputs with limited history in our model. This work was performed in collaboration with Elia (the Belgian TSO) to further improve their imbalance forecasts, demonstrating the relevance of our work.
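The abstract above reports improvements in terms of the continuous ranked probability score (CRPS). As a rough illustration of what that metric measures, the sketch below uses the standard sample-based estimator, CRPS ≈ E|X − y| − ½·E|X − X′|, for an ensemble of forecast samples X and an observation y. The function name and the numbers are assumptions for illustration; this is not the paper's model or evaluation code.

```python
def crps_ensemble(samples, observation):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    n = len(samples)
    # Average distance between ensemble members and the observed value.
    term_obs = sum(abs(x - observation) for x in samples) / n
    # Half the average pairwise distance between ensemble members.
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return term_obs - term_spread


# Hypothetical imbalance forecasts in MW for one quarter-hour.
forecast_mw = [0.0, 2.0]
score = crps_ensemble(forecast_mw, observation=1.0)
print(score)
```

For a single-member ensemble the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of the mean absolute error.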
Parallelized Robust Distributed Model Predictive Control in the Presence of Coupled State Constraints
In this paper, we present a robust distributed model predictive control (DMPC) scheme for dynamically decoupled nonlinear systems which are subject to state constraints, coupled state constraints and input constraints. In the proposed control scheme, all subsystems solve their local optimization problem in parallel and neighbor-to-neighbor communication suffices. The approach relies on consistency constraints which define a neighborhood around each subsystem's reference trajectory where the state of the subsystem is guaranteed to stay in. Reference trajectories and consistency constraints are known to neighboring subsystems. Contrary to other relevant approaches, the reference trajectories are improved consecutively. The presented approach allows the formulation of convex optimization problems for systems with linear dynamics even in the presence of non-convex state constraints. Additionally, we employ tubes in order to ensure the controller's robustness against bounded uncertainties. In the end, we briefly comment on an iterative extension of the DMPC scheme. The effectiveness of the proposed DMPC scheme and its iterative extension are demonstrated with simulations.
comment: 16 pages, 5 figures, preprint to be published in Automatica
Data-Enabled Policy Optimization for Direct Adaptive Learning of the LQR
Direct data-driven design methods for the linear quadratic regulator (LQR) mainly use offline or episodic data batches, and their online adaptation has been acknowledged as an open problem. In this paper, we propose a direct adaptive method to learn the LQR from online closed-loop data. First, we propose a new policy parameterization based on the sample covariance to formulate a direct data-driven LQR problem, which is shown to be equivalent to the certainty-equivalence LQR with optimal non-asymptotic guarantees. Second, we design a novel data-enabled policy optimization (DeePO) method to directly update the policy, where the gradient is explicitly computed using only a batch of persistently exciting (PE) data. Third, we establish its global convergence via a projected gradient dominance property. Importantly, we efficiently use DeePO to adaptively learn the LQR by performing only one-step projected gradient descent per sample of the closed-loop system, which also leads to an explicit recursive update of the policy. Under PE inputs and for bounded noise, we show that the average regret of the LQR cost is upper-bounded by two terms signifying a sublinear decrease in time $\mathcal{O}(1/\sqrt{T})$ plus a bias scaling inversely with signal-to-noise ratio (SNR), which are independent of the noise statistics. Finally, we perform simulations to validate the theoretical results and demonstrate the computational and sample efficiency of our method.
comment: Submitted to IEEE TAC
Distributed Data-driven Unknown-input Observers for State Estimation
Unknown inputs related to, e.g., sensor aging, modeling errors, or device bias, represent a major concern in wireless sensor networks, as they degrade the state estimation performance. To improve the performance, unknown-input observers (UIOs) have been proposed. Most of the results available to design UIOs are based on explicit system models, which can be difficult or impossible to obtain in real-world applications. Data-driven techniques, on the other hand, have become a viable alternative for the design and analysis of unknown systems using only data. In this context, a novel data-driven distributed unknown-input observer (D-DUIO) for unknown continuous-time linear time-invariant (LTI) systems is developed, which requires only data collected offline, without any prior knowledge of the system matrices. In the paper, a model-based approach to the design of a DUIO is presented first. A sufficient condition for the existence of such a DUIO is recalled, and a new one is proposed that lends itself to a data-driven adaptation. Moving to a data-driven approach, it is shown that, under suitable assumptions on the input/output/state data collected from the continuous-time system, it is possible both to establish the existence of a D-DUIO and to derive its matrices in terms of the matrices of pre-collected data. Finally, the efficacy of the D-DUIO is illustrated by means of numerical examples.
Quantifying Aleatoric and Epistemic Dynamics Uncertainty via Local Conformal Calibration
Whether learned, simulated, or analytical, approximations of a robot's dynamics can be inaccurate when encountering novel environments. Many approaches have been proposed to quantify the aleatoric uncertainty of such methods, i.e., uncertainty resulting from stochasticity. However, these estimates alone are not enough to properly estimate the uncertainty of a model in a novel environment, where the actual dynamics can change. Such changes can induce epistemic uncertainty, i.e., uncertainty due to a lack of information or data. Accounting for both epistemic and aleatoric dynamics uncertainty in a theoretically grounded way remains an open problem. We introduce Local Uncertainty Conformal Calibration (LUCCa), a conformal prediction-based approach that calibrates the aleatoric uncertainty estimates provided by dynamics models to generate probabilistically valid prediction regions of the system's state. We account for both epistemic and aleatoric uncertainty non-asymptotically, without strong assumptions about the form of the true dynamics or how it changes. The calibration is performed locally in the state-action space, leading to uncertainty estimates that are useful for planning. We validate our method by constructing probabilistically safe plans for a double integrator under significant changes in dynamics.
comment: Accepted to the 16th International Workshop on the Algorithmic Foundations of Robotics (WAFR) 2024
SustainDC -- Benchmarking for Sustainable Data Center Control NeurIPS 2024
Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for the effects of each other. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements. Our results highlight significant opportunities for improvement of data center operations using MARL algorithms. Given the increasing use of DC due to AI, SustainDC provides a crucial platform for the development and benchmarking of advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
comment: Under review at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
Systems and Control (CS)
Custom Non-Linear Model Predictive Control for Obstacle Avoidance in Indoor and Outdoor Environments
Navigating complex environments requires Unmanned Aerial Vehicles (UAVs) and autonomous systems to perform trajectory tracking and obstacle avoidance in real time. While many control strategies have effectively utilized linear approximations, addressing the non-linear dynamics of UAVs, especially in obstacle-dense environments, remains a key challenge that requires further research. This paper introduces a Non-linear Model Predictive Control (NMPC) framework for the DJI Matrice 100, addressing these challenges by using a dynamic model and B-spline interpolation for smooth reference trajectories, ensuring minimal deviation while respecting safety constraints. The framework supports various trajectory types and employs a penalty-based cost function for control accuracy in tight maneuvers. The framework utilizes CasADi for efficient real-time optimization, enabling the UAV to maintain robust operation even under tight computational constraints. Simulation and real-world indoor and outdoor experiments demonstrated the NMPC's ability to adapt to disturbances, resulting in smooth, collision-free navigation.
comment: This manuscript has 7 pages and 8 figures, detailing NMPC for UAV obstacle avoidance using DJI UAVs. It features simulations, experimental results, and uses CasADi for optimization with ROS integration. Code and media at https://github.com/larasupernovae/nmpc_flash_multi_obstacle
Numerical optimal control for delay differential equations: A simultaneous approach based on linearization of the delayed state
Time delays are ubiquitous in industry, and they must be accounted for when designing control strategies. However, numerical optimal control (NOC) of delay differential equations (DDEs) is challenging because it requires specialized discretization methods and the time delays may depend on the manipulated inputs or state variables. Therefore, in this work, we propose to linearize the delayed states around the current time. This results in a set of implicit differential equations, and we compare the steady states and the corresponding stability criteria of the DDEs and the approximate system. Furthermore, we propose a simultaneous approach for NOC of DDEs based on the linearization, and we discretize the approximate system using Euler's implicit method. Finally, we present a numerical example involving a molten salt nuclear fission reactor.
comment: 6 pages, 4 figures, submitted to a conference
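The central idea in the abstract above, linearizing the delayed state around the current time, can be written out under simplifying assumptions. The worked form below assumes a single constant delay $\tau$, a first-order Taylor expansion, and generic symbols $x$, $u$, $f$, and step size $\Delta t$; none of these are confirmed by the abstract, which also allows input- or state-dependent delays.

```latex
% Assumed DDE with a single constant delay \tau:
\dot{x}(t) = f\bigl(x(t),\, x(t-\tau),\, u(t)\bigr)

% First-order Taylor expansion of the delayed state around the current time t:
x(t-\tau) \approx x(t) - \tau\,\dot{x}(t)

% Substitution gives an implicit differential equation in \dot{x}(t):
\dot{x}(t) = f\bigl(x(t),\, x(t) - \tau\,\dot{x}(t),\, u(t)\bigr)

% Euler's implicit method with step \Delta t, where
% \dot{x}_{k+1} \approx (x_{k+1} - x_k)/\Delta t:
\frac{x_{k+1} - x_k}{\Delta t}
  = f\Bigl(x_{k+1},\; x_{k+1} - \tau\,\frac{x_{k+1} - x_k}{\Delta t},\; u_{k+1}\Bigr)
```

In a simultaneous approach, the last relation would enter the optimization problem as an equality constraint at every time step, with all $x_k$ and $u_k$ treated as decision variables.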
IC3M: In-Car Multimodal Multi-object Monitoring for Abnormal Status of Both Driver and Passengers
Recently, in-car monitoring has emerged as a promising technology for detecting early-stage abnormal status of the driver and providing timely alerts to prevent traffic accidents. Although training models with multimodal data enhances the reliability of abnormal status detection, the scarcity of labeled data and the imbalance of class distribution impede the extraction of critical abnormal state features, significantly deteriorating training performance. Furthermore, missing modalities due to environment and hardware limitations exacerbate the challenge of abnormal status identification. More importantly, monitoring abnormal health conditions of passengers, particularly in elderly care, is of paramount importance but remains underexplored. To address these challenges, we introduce IC3M, an efficient camera-rotation-based multimodal framework for monitoring both driver and passengers in a car. IC3M comprises two key modules: an adaptive threshold pseudo-labeling strategy and a missing-modality reconstruction module. The former customizes pseudo-labeling thresholds for different classes based on the class distribution, generating class-balanced pseudo labels to guide model training effectively, while the latter leverages cross-modality relationships learned from limited labels to accurately recover missing modalities by distribution transfer from the available modalities. Extensive experimental results demonstrate that IC3M outperforms state-of-the-art benchmarks in accuracy, precision, and recall while exhibiting superior robustness under limited labeled data and severely missing modalities.
comment: 16 pages, 17 figures
Toward Neuronal Implementations of Delayed Optimal Control
Animal sensorimotor behavior is frequently modeled using optimal controllers. However, it is unclear how the neuronal circuits within the animal's nervous system implement optimal controller-like behavior. In this work, we study the question of implementing a delayed linear quadratic regulator with linear dynamical "neurons" on a muscle model. We show that for any second-order controller, there are three minimal neural circuit configurations that implement the same controller. Furthermore, the firing rate characteristics of each circuit can vary drastically, even as the overall controller behavior is preserved. Along the way, we introduce concepts that bridge controller realizations to neural implementations that are compatible with known neuronal delay structures.
comment: Submitted to IEEE American Control Conference
Automated Music Therapy for Anxiety and Depression Management in Older People (AMITY)
The onset of old age brings physiological and mental changes, with anxiety and depression being common mental disorders that can trigger other health issues and reduce lifespan. However, due to a global shortage of mental health professionals, combined with a growing population and limited awareness, these disorders often go undiagnosed. Music therapy offers a reliable method to address psychological, emotional, and cognitive needs. This paper presents an approach that monitors anxiety and depression symptoms in real time using low-complexity body sensors, followed by automated personalised music therapy, reducing the dependence on therapists and improving mental health care accessibility.
comment: 10 pages, 5 figures
SwarmCVT: Centroidal Voronoi Tessellation-Based Path Planning for Very-Large-Scale Robotics
Swarm robotics, or very large-scale robotics (VLSR), has many meaningful applications for complicated tasks. However, the complexity of motion control and energy costs stack up quickly as the number of robots increases. In addressing this problem, our previous studies have formulated various methods employing macroscopic and microscopic approaches. These methods enable microscopic robots to adhere to a reference Gaussian mixture model (GMM) distribution observed at the macroscopic scale. As a result, optimizing the macroscopic level will result in an optimal overall result. However, all these methods require systematic and global generation of Gaussian components (GCs) within obstacle-free areas to construct the GMM trajectories. This work utilizes centroidal Voronoi tessellation to generate GCs methodically. Consequently, it demonstrates performance improvement while also ensuring consistency and reliability.
comment: Submitted to American Control Conference (ACC) 2025
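For readers unfamiliar with centroidal Voronoi tessellation (CVT), the sketch below approximates one with Lloyd's algorithm on uniformly sampled points in the unit square. Lloyd's algorithm is a standard way to compute a CVT and is used here only as an illustration; the abstract does not state which construction the authors use, and obstacle handling and the mapping from cells to Gaussian components are omitted. The function name `lloyd_cvt` and all parameters are assumptions.

```python
import numpy as np


def lloyd_cvt(generators, samples, iterations=20):
    """Approximate a CVT: move each generator to the mean of the samples it owns."""
    generators = np.array(generators, dtype=float)
    for _ in range(iterations):
        # Squared distance from every sample to every generator.
        d2 = ((samples[:, None, :] - generators[None, :, :]) ** 2).sum(axis=2)
        owner = d2.argmin(axis=1)  # index of the nearest generator per sample
        for k in range(len(generators)):
            cell = samples[owner == k]
            if len(cell) > 0:  # leave a generator in place if its cell is empty
                generators[k] = cell.mean(axis=0)
    return generators


rng = np.random.default_rng(0)
samples = rng.random((2000, 2))  # uniform samples of the free space (assumed)
generators = lloyd_cvt(rng.random((4, 2)), samples)
print(generators)
```

Each converged generator sits near the centroid of its own Voronoi cell, so a Gaussian component could plausibly be centred on each generator; that mapping is an assumption and is not shown here.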
Behavior Trees in Functional Safety Supervisors for Autonomous Vehicles
The rapid advancements in autonomous vehicle software present both opportunities and challenges, especially in enhancing road safety. The primary objective of autonomous vehicles is to reduce accident rates through improved safety measures. However, the integration of new algorithms into the autonomous vehicle, such as Artificial Intelligence methods, raises concerns about compliance with established safety regulations. This paper introduces a novel software architecture based on behavior trees, aligned with established standards and designed to supervise vehicle functional safety in real time. It specifically addresses the integration of algorithms into industrial road vehicles, adhering to ISO 26262. The proposed supervision methodology involves the detection of hazards and compliance with functional and technical safety requirements when a hazard arises. This methodology, implemented in this study in a Renault Mégane (currently at SAE level 3 of automation), not only guarantees compliance with safety standards, but also paves the way for safer and more reliable autonomous driving technologies.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Load Balancing-based Topology Adaptation for Integrated Access and Backhaul Networks
Integrated access and backhaul (IAB) technology is a flexible solution for network densification. IAB nodes can also be deployed in moving nodes such as buses and trains, i.e., mobile IAB (mIAB). As mIAB nodes can move around the coverage area, the connection between an mIAB node and its parent macro base station (BS), the IAB donor, sometimes has to change in order to keep an acceptable backhaul link, a process known as topology adaptation (TA). The change from one IAB donor to another may strongly impact the system load distribution, possibly causing unsatisfactory backhaul service due to the lack of radio resources. Based on this, TA should consider both backhaul link quality and traffic load. In this work, we propose a load balancing algorithm based on TA for IAB networks, and compare it with an approach in which TA is triggered based on reference signal received power (RSRP) only. The results show that our proposed algorithm improves the throughput of the passengers' worst connections in uplink (UL) and, more modestly, also in downlink (DL), without significantly impairing the pedestrian quality of service (QoS).
comment: Paper submitted to Journal of Communication and Information Systems (JCIS)
Cellular Network Densification: a System-level Analysis with IAB, NCR and RIS
As the number of user equipments increases in fifth generation (5G) and beyond, it is desired to densify the cellular network with auxiliary nodes assisting the base stations. Examples of these nodes are integrated access and backhaul (IAB) nodes, network-controlled repeaters (NCRs) and reconfigurable intelligent surfaces (RISs). In this context, this work presents a system level overview of these three nodes. Moreover, this work evaluates through simulations the impact of network planning aiming at enhancing the performance of a network used to cover an outdoor sport event. We show that, in the considered scenario, in general, IAB nodes provide an improved signal to interference-plus-noise ratio and throughput, compared to NCRs and RISs. However, there are situations where NCR outperforms IAB due to higher level of interference caused by the latter. Finally, we show that the deployment of these nodes in unmanned aerial vehicles (UAVs) also achieves performance gains due to their aerial mobility. However, UAV constraints related to aerial deployment may prevent these nodes from reaching results as good as the ones achieved by their stationary deployment.
comment: Paper submitted to IEEE Systems Journal
Cross-Domain Comparative Analysis of Digital Twins and Universalised Solutions
Digitalisation is one of the main drivers of most economic sectors nowadays, and the digital twin, as a reification of digitalisation for complex systems, has attracted much attention from both academia and industry. There have been studies focusing on digital twins in a specific sector, but few offer insightful comparisons of digital twins from different domains. Considering that digital twinning is a cross-domain transformation, it is beneficial to establish the principles of universality and variation that can explain similarities and differences among digital twins. This paper first delivers a comparative analysis of digital twins in five domains through a six-dimensional characterisation framework. Then, starting from the correlations among domain-specific digital twin (DT) developments, a cross-domain Digital Twin Platform-as-a-Service (DT-PaaS) is proposed to universalise the common processes, tools and applications while remaining inclusive of the variations of every digital twin instance. As a centralised data, modeling and service platform, it is expected to break the barriers between domains by enabling cross-domain digital twin data sharing, interoperability and development synergy, and to help tackle complex global challenges such as climate change, net zero, and pandemics.
Equivalence between Geometric Frequency and Lagrange Derivative
The paper shows the equivalence between the geometric frequency of an electric quantity, namely, voltage and current, and the Lagrange derivative of a streamline of a fluid. The geometric frequency is a concept recently proposed by the author and is a generalization of the instantaneous frequency, a quantity that is particularly important for the analysis and the control of electric power systems. On the other hand, the Lagrange derivative is mostly utilized in fluid dynamics and helps decompose the time derivative into various components. The paper shows how these components relate to the elements of the geometric frequency. The paper also shows, through a variety of numerical examples, how the decomposition of the Lagrange derivative helps identify the distortion of the waveform of a measured electric quantity and how this information can be utilized to classify system operating conditions.
Semantic Communication and Control Co-Design for Multi-Objective Correlated Dynamics
This letter introduces a machine-learning approach to learning the semantic dynamics of correlated systems with different control rules and dynamics. By leveraging the Koopman operator in an autoencoder (AE) framework, the system's state evolution is linearized in the latent space using a dynamic semantic Koopman (DSK) model, capturing the baseline semantic dynamics. Signal temporal logic (STL) is incorporated through a logical semantic Koopman (LSK) model to encode system-specific control rules. These models form the proposed logical Koopman AE framework that reduces communication costs while improving state prediction accuracy and control performance, showing a 91.65% reduction in communication samples and significant performance gains in simulation.
Optimal $H_{\infty}$ control based on stable manifold of discounted Hamilton-Jacobi-Isaacs equation
The optimal \(H_{\infty}\) control problem over an infinite time horizon, which incorporates a performance function with a discount factor \(e^{-\alpha t}\) (\(\alpha > 0\)), is important in various fields. Solving this optimal \(H_{\infty}\) control problem is equivalent to addressing a discounted Hamilton-Jacobi-Isaacs (HJI) partial differential equation. In this paper, we first provide a precise estimate for the discount factor \(\alpha\) that ensures the existence of a nonnegative stabilizing solution to the HJI equation. This stabilizing solution corresponds to the stable manifold of the characteristic system of the HJI equation, which is a contact Hamiltonian system due to the presence of the discount factor. Secondly, we demonstrate that approximating the optimal controller in a natural manner results in a closed-loop system with a finite \(L_2\)-gain that is nearly less than the gain of the original system. Thirdly, based on the theoretical results obtained, we propose a deep learning algorithm to approximate the optimal controller using the stable manifold of the contact Hamiltonian system associated with the HJI equation. Finally, we apply our method to the \(H_{\infty}\) control of the Allen-Cahn equation to illustrate its effectiveness.
Physics-Constrained Taylor Neural Networks for Learning and Control of Dynamical Systems
Data-driven approaches are increasingly popular for identifying dynamical systems due to improved accuracy and availability of sensor data. However, relying solely on data for identification does not guarantee that the identified systems will maintain their physical properties or that the predicted models will generalize well. In this paper, we propose a novel method for system identification by integrating a neural network as the first-order derivative of a Taylor series expansion instead of learning a dynamical function directly. This approach, called Monotonic Taylor Neural Networks (MTNN), aims to ensure monotonic properties of dynamical systems by constraining the conditions for the output of the neural networks model to be either always non-positive or non-negative. These conditions are constructed in two ways: by designing a new neural network architecture or by regularizing the loss function for training. The proposed method demonstrates better performance compared to methods without constraints on the monotonic properties of the systems when tested with experimental data from two real-world systems, including HVAC and TCLab. Furthermore, MTNN shows good performance in an actual control application when using a model predictive controller for a nonlinear MIMO system, illustrating the practical applications of this method.
C-MORL: Multi-Objective Reinforcement Learning through Efficient Discovery of Pareto Front
Multi-objective reinforcement learning (MORL) excels at handling rapidly changing preferences in tasks that involve multiple criteria, even for unseen preferences. However, previously dominant MORL methods typically generate a fixed policy set or a preference-conditioned policy through multiple training iterations exclusively for sampled preference vectors, and cannot ensure the efficient discovery of the Pareto front. Furthermore, integrating preferences into the input of policy or value functions presents scalability challenges, in particular as the dimensions of the state and preference spaces grow, which can complicate the learning process and hinder the algorithm's performance on more complex tasks. To address these issues, we propose a two-stage Pareto front discovery algorithm called Constrained MORL (C-MORL), which serves as a seamless bridge between constrained policy optimization and MORL. Concretely, a set of policies is trained in parallel in the initialization stage, with each optimized towards its individual preference over the multiple objectives. Then, to fill the remaining vacancies in the Pareto front, constrained optimization steps are employed to maximize one objective while constraining the other objectives to exceed a predefined threshold. Empirically, compared to recent advancements in MORL methods, our algorithm achieves more consistent and superior performance in terms of hypervolume, expected utility, and sparsity on both discrete and continuous control tasks, especially with numerous objectives (up to nine objectives in our experiments).
comment: 27 pages, 8 figures. In submission to a conference
SEAL: SEmantic-Augmented Imitation Learning via Language Model
Hierarchical Imitation Learning (HIL) is a promising approach for tackling long-horizon decision-making tasks. However, it remains challenging because of the lack of detailed supervisory labels for sub-goal learning and its reliance on hundreds to thousands of expert demonstrations. In this work, we introduce SEAL, a novel framework that leverages the powerful semantic and world knowledge of Large Language Models (LLMs) both to specify the sub-goal space and to pre-label states with semantically meaningful sub-goal representations, without prior knowledge of task hierarchies. SEAL employs a dual-encoder structure, combining supervised LLM-guided sub-goal learning with unsupervised Vector Quantization (VQ) for more robust sub-goal representations. Additionally, SEAL incorporates a transition-augmented low-level planner for improved adaptation to sub-goal transitions. Our experiments demonstrate that SEAL outperforms state-of-the-art HIL methods and LLM-based planning approaches, particularly in settings with small expert datasets and complex long-horizon tasks.
comment: 18 pages, 5 figures, in submission
Simulation Results of Center-Manifold-Based Identification of Polynomial Nonlinear Systems with Uncontrollable Linearization
A system identification method based on the center manifold was recently proposed to identify polynomial nonlinear systems with uncontrollable linearization. This note presents a numerical example to show the effectiveness of this method.
Guaranteed-Safe MPPI Through Composite Control Barrier Functions for Efficient Sampling in Multi-Constrained Robotic Systems
We present a new guaranteed-safe model predictive path integral (GS-MPPI) control algorithm that enhances sample efficiency in nonlinear systems with multiple safety constraints. The approach uses a composite control barrier function (CBF) along with MPPI to ensure all sampled trajectories are provably safe. We first construct a single CBF constraint from multiple safety constraints with potentially differing relative degrees, using it to create a safe closed-form control law. This safe control is then integrated into the system dynamics, allowing MPPI to optimize over exclusively safe trajectories. The method not only improves computational efficiency but also addresses the myopic behavior often associated with CBFs by incorporating long-term performance considerations. We demonstrate the algorithm's effectiveness through simulations of a nonholonomic ground robot subject to position and speed constraints, showcasing safety and performance.
comment: Preprint submitted to American Control Conference (ACC) 2025
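The GS-MPPI abstract above describes folding a closed-form safe control law into the dynamics so that every sampled rollout is safe. The sketch below shows that idea on a deliberately simple, assumed system: a one-dimensional single integrator with one position limit, using a standard CBF condition. It is not the paper's composite CBF or its robot model, and the names `safe_control` and `rollout` and the values of `x_max`, `alpha`, and `dt` are hypothetical.

```python
def safe_control(x, u_nom, x_max=1.0, alpha=2.0):
    """Closed-form CBF filter for x_dot = u with barrier h(x) = x_max - x."""
    # CBF condition: h_dot = -u >= -alpha * h(x), i.e. u <= alpha * (x_max - x).
    # The control closest to u_nom that satisfies it is a simple clamp.
    return min(u_nom, alpha * (x_max - x))


def rollout(x0, u_nom, steps=1000, dt=0.01):
    """Euler rollout of the filtered dynamics for a constant nominal input."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        x += dt * safe_control(x, u_nom)
        xs.append(x)
    return xs


if __name__ == "__main__":
    trajectory = rollout(x0=0.0, u_nom=1.0)
    print(max(trajectory))  # stays at or below x_max for this step size
```

In a sampling-based planner, the sampled input perturbations would play the role of `u_nom`, so each candidate trajectory is filtered before its cost is evaluated. With Euler integration the bound holds here because `alpha * dt` is at most 1.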
A Miniature Potentiostat for Impedance Spectroscopy and Cyclic Voltammetry in Wearable Sensor Integration
A potentiostat is an analytical device and a crucial component in electrochemical instruments used for studying chemical reaction mechanisms, with potential applications in early diagnosis of disease or critical health conditions. Conventional potentiostats are typically benchtop devices designed for laboratory use, whereas a wearable potentiostat can be interfaced with biochemical sensors for disease diagnostics at home. This work presents a low-power potentiostat designed to connect with a sensor array consisting of eight to ten working electrodes. The potentiostat is capable of running Electrochemical Impedance Spectroscopy and Cyclic Voltammetry. The system is powered by lithium-ion batteries and uses Bluetooth for data transmission to the user. A single ARM M4 microcontroller, integrated with a Bluetooth low-energy radio module, controls the entire system. The accuracy, reliability, and power efficiency of the potentiostat were evaluated and compared against existing commercial benchtop potentiostats. Additionally, we have outlined future steps to enhance circuit miniaturization and power efficiency, aiming to develop fully integrated wearable sensing devices comparable in size to a wristwatch.
Resource Allocation Based on Optimal Transport Theory in ISAC-Enabled Multi-UAV Networks
This paper investigates the resource allocation optimization for cooperative communication with non-cooperative localization in integrated sensing and communications (ISAC)-enabled multi-unmanned aerial vehicle (UAV) cooperative networks. Our goal is to maximize the weighted sum of the system's average sum rate and the localization quality of service (QoS) by jointly optimizing cell association, communication power allocation, and sensing power allocation. Since the formulated problem is a mixed-integer nonconvex problem, we propose the alternating iteration algorithm based on optimal transport theory (AIBOT) to solve the optimization problem more effectively. Simulation results demonstrate that the AIBOT can improve the system sum rate by nearly 12% and reduce the localization Cramér-Rao bound (CRB) by almost 29% compared to benchmark algorithms.
Lossy Cooperative UAV Relaying Networks: Outage Probability Analysis and Location Optimization
In this paper, the performance of a lossy cooperative unmanned aerial vehicle (UAV) relay communication system is analyzed. In this system, the UAV relay adopts a lossy forward (LF) strategy and the receiver has certain distortion requirements for the received information. For the system described above, we first derive the achievable rate distortion region of the system. Then, on the basis of the region analysis, the system outage probability when the channel suffers Nakagami-$m$ fading is analyzed. Finally, we design an optimal relay position identification algorithm based on the Soft Actor-Critic (SAC) algorithm, which determines the optimal UAV position to minimize the outage probability. The simulation results show that the proposed algorithm can optimize the UAV position and reduce the system outage probability effectively.
Safety Verification of Stochastic Systems: A Set-Erosion Approach
We study the safety verification problem for discrete-time stochastic systems. We propose an approach for safety verification termed set-erosion strategy that verifies the safety of a stochastic system on a safe set through the safety of its associated deterministic system on an eroded subset. The amount of erosion is captured by the probabilistic bound on the distance between stochastic trajectories and their associated deterministic counterpart. Building on our recent work [1], we establish a sharp probabilistic bound on this distance. Combining this bound with the set-erosion strategy, we establish a general framework for the safety verification of stochastic systems. Our method is flexible and can work effectively with any deterministic safety verification techniques. We exemplify our method by incorporating barrier functions designed for deterministic safety verification, obtaining barrier certificates much tighter than existing results. Numerical experiments are conducted to demonstrate the efficacy and superiority of our method.
Safe Navigation in Unmapped Environments for Robotic Systems with Input Constraints
This paper presents an approach for navigation and control in unmapped environments under input and state constraints using a composite control barrier function (CBF). We consider the scenario where real-time perception feedback (e.g., LiDAR) is used online to construct a local CBF that models local state constraints (e.g., local safety constraints such as obstacles) in the a priori unmapped environment. The approach employs a soft-maximum function to synthesize a single time-varying CBF from the N most recently obtained local CBFs. Next, the input constraints are transformed into controller-state constraints through the use of control dynamics. Then, we use a soft-minimum function to compose the input constraints with the time-varying CBF that models the a priori unmapped environment. This composition yields a single relaxed CBF, which is used in a constrained optimization to obtain an optimal control that satisfies the state and input constraints. The approach is validated through simulations of a nonholonomic ground robot that is equipped with LiDAR and navigates an unmapped environment. The robot successfully navigates the environment while avoiding the a priori unmapped obstacles and satisfying both speed and input constraints.
comment: Preprint submitted to 2025 American Control Conference (ACC). arXiv admin note: substantial text overlap with arXiv:2409.01458
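The abstract above composes barrier functions with soft-maximum and soft-minimum functions. A minimal sketch of one common log-sum-exp form of these smooth approximations follows. The exact definitions and normalization used in the paper are not given in the abstract, so the formulas here, along with the names `soft_min`, `soft_max`, and `kappa`, are assumptions.

```python
import math


def soft_min(values, kappa):
    """Smooth under-approximation of min(values); tighter as kappa grows."""
    m = min(values)
    # Shift by the minimum for numerical stability before exponentiating.
    s = sum(math.exp(-kappa * (v - m)) for v in values)
    return m - math.log(s) / kappa


def soft_max(values, kappa):
    """Smooth under-approximation of max(values), using an averaged log-sum-exp."""
    m = max(values)
    # Averaging (dividing by N) keeps the result at or below the true maximum.
    s = sum(math.exp(kappa * (v - m)) for v in values) / len(values)
    return m + math.log(s) / kappa


if __name__ == "__main__":
    h = [1.0, 2.0, 3.0]  # made-up barrier function values at one state
    print(soft_min(h, kappa=10.0), soft_max(h, kappa=10.0))
```

Both functions in this form stay within `log(N) / kappa` below the exact minimum or maximum of `N` values, so a larger `kappa` trades smoothness for tightness.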
Information-Driven Search and Track of Novel Space Objects
Space surveillance depends on efficiently directing sensor resources to maintain custody of known catalog objects. However, it remains unclear how to best utilize these resources to rapidly search for and track newly detected space objects. Provided a novel measurement, a search set can be instantiated through admissible region constraints to inform follow-up observations. Lacking well-constrained bounds, this set rapidly spreads in the along-track direction, growing much larger than a follow-up sensor's finite field of view. Moreover, the number of novel objects may be uncertain, and follow-up observations are most commonly corrupted by false positives from known catalog objects and missed detections. In this work, we address these challenges through the introduction of a joint sensor control and multi-target tracking approach. The search set associated with a novel measurement is represented by a Cardinalized Probability Hypothesis Density (CPHD), which jointly tracks the state uncertainty associated with a set of objects and a probability mass function for the true target number. In follow-up sensor scans, the information contained in an empty measurement set, and in returns from both novel objects and known catalog objects, is succinctly captured through this paradigm. To maximize the utility of a follow-up sensor, we introduce an information-driven sensor control approach for steering the instrument. Our methods are tested on two relevant test cases and we provide a comparative analysis with current naive tasking strategies.
comment: Submitted to the Journal of Astronautical Sciences
Learning Optimal Control and Dynamical Structure of Global Trajectory Search Problems with Diffusion Models
Spacecraft trajectory design is a global search problem, where previous work has revealed specific solution structures that can be captured with data-driven methods. This paper explores two global search problems in the circular restricted three-body problem: hybrid cost function of minimum fuel/time-of-flight and transfers to energy-dependent invariant manifolds. These problems display a fundamental structure either in the optimal control profile or the use of dynamical structures. We build on our prior generative machine learning framework to apply diffusion models to learn the conditional probability distribution of the search problem and analyze the model's capability to capture these structures.
comment: This paper was presented at the AAS/AIAA Astrodynamics Specialist Conference
Analyzing Fitts' Law using Offline and Online Optimal Control with Motor Noise
The cause of the speed-accuracy tradeoff (typically quantified via Fitts' Law) is a debated topic of interest in motor neuroscience, and is commonly studied using tools from control theory. Two prominent theories involve the presence of signal dependent motor noise and planning variability -- these factors are generally incorporated separately. In this work, we study how well the simultaneous presence of both factors explains the speed-accuracy tradeoff. A human arm reaching model is developed with bio-realistic signal dependent motor noise, and a Gaussian noise model is used to deterministically approximate the motor noise. Both offline trajectory optimization and online model predictive control are used to simulate the planning and execution of several different reaching tasks with varying target sizes and movement durations. These reaching trajectories are then compared to experimental human reaching data, revealing that both models produce behavior consistent with humans, and the speed-accuracy tradeoff is present in both online and offline control. These results suggest the speed-accuracy tradeoff is likely caused by a combination of these two factors, and also that it plays a role in both offline and online computation.
comment: Submitted to IEEE American Control Conference
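The abstract above quantifies the speed-accuracy tradeoff via Fitts' Law. As a small worked example, the sketch below evaluates the classic formulation, in which movement time grows linearly with the index of difficulty log2(2D/W) for target distance D and width W. The coefficient values `a` and `b` are hypothetical placeholders; in practice they are fitted to reaching data.

```python
import math


def index_of_difficulty(distance, width):
    """Fitts' index of difficulty in bits, ID = log2(2D / W)."""
    return math.log2(2.0 * distance / width)


def movement_time(distance, width, a=0.1, b=0.2):
    """Predicted movement time MT = a + b * ID (a in seconds, b in seconds per bit)."""
    return a + b * index_of_difficulty(distance, width)


if __name__ == "__main__":
    # Halving the target width adds one bit of difficulty, hence b more seconds.
    print(movement_time(distance=8.0, width=2.0))
    print(movement_time(distance=8.0, width=1.0))
```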
Human Balancing on a Log: A Switched Multi-Layer Controller
We study the task of balancing a human on a log that is fixed in place. Balancing on a log is substantially more challenging than balancing on a flat surface -- to achieve stability, we use a switched multi-layer controller. The controller consists of an upper-layer LQR planner (akin to the central nervous system) that coordinates ankle and hip torques, and lower-layer PID trackers (akin to local motor units) that follow this plan subject to nonlinear dynamics. Additionally, the controller switches between three operational modes depending on the current state of the human. The efficacy of the controller is verified in simulation, where our controller is able to stabilize the human for a variety of initial conditions. We also show that this controller is compatible with muscle-based actuation and imperfect sensing, making it a promising candidate for modeling motor control under challenging conditions in a more bio-realistic way.
comment: Submitted to IEEE American Control Conference
Dissipative Avoidance Feedback for Reactive Navigation Under Second-Order Dynamics
This paper introduces DAF (Dissipative Avoidance Feedback), a novel approach for autonomous robot navigation in unknown, obstacle-filled environments with second-order dynamics. Unlike traditional APF (Artificial Potential Field) methods, which rely on repulsive forces based solely on position, DAF employs a dissipative feedback mechanism that adjusts the robot's motion in response to both its position and velocity, ensuring smoother, more natural obstacle avoidance. The proposed continuously differentiable controller solves the motion-to-goal problem while guaranteeing collision-free navigation by considering the robot's state and local obstacle distance information. We show that the controller guarantees safe navigation in generic $n$-dimensional environments and that all undesired $\omega$-limit points are unstable under certain controlled curvature conditions. Designed for real-time implementation, DAF requires only locally measured data from limited-range sensors (e.g., LiDAR, depth cameras), making it particularly effective for robots navigating unknown workspaces.
comment: 7 pages, 7 figures
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
Reach-Avoid-Stay (RAS) optimal control enables systems such as robots and air taxis to reach their targets, avoid obstacles, and stay near the target. However, current methods for RAS often struggle with handling complex, dynamic environments and scaling to high-dimensional systems. While reinforcement learning (RL)-based reachability analysis addresses these challenges, it has yet to tackle the RAS problem. In this paper, we propose a two-step deep deterministic policy gradient (DDPG) method that extends RL-based reachability methods to solve RAS problems. First, we train a function that characterizes the maximal robust control invariant set within the target set, where the system can safely stay, along with its corresponding policy. Second, we train a function that defines the set of states capable of safely reaching the robust control invariant set, along with its corresponding policy. We prove that this method results in the maximal robust RAS set in the absence of training errors and demonstrate that it enables RAS in complex environments, scales to high-dimensional systems, and achieves higher success rates for the RAS task compared to previous methods, validated through one simulation and two high-dimensional experiments.
Approximation Schemes for POMDPs with Continuous Spaces and Their Near Optimality
We study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Belief MDP reduction, which has been the standard approach to study POMDPs, requires rigorous approximation methods for practical applications, due to the state space being lifted to the space of probability measures. Generalizing recent work, in this paper we present rigorous approximation methods via discretizing the observation space and constructing a fully observed finite MDP model using a finite length history of the discrete observations and control actions. We show that the resulting policy is near-optimal under some regularity assumptions on the channel, and under certain controlled filter stability requirements for the hidden state process. Furthermore, by quantizing the measurements, we are able to utilize refined filter stability conditions. We also provide a Q-learning algorithm that uses a finite memory of discretized information variables, and prove its convergence to the optimality equation of the finite fully observed MDP constructed using the approximation method.
Gait Optimization for Legged Systems Through Mixed Distribution Cross-Entropy Optimization
Legged robotic systems can play an important role in real-world applications due to their superior load-bearing capabilities, enhanced autonomy, and effective navigation on uneven terrain. They offer an optimal trade-off between mobility and payload capacity, excelling in diverse environments while maintaining efficiency in transporting heavy loads. However, planning and optimizing gaits and gait sequences for these robots presents significant challenges due to the complexity of their dynamic motion and the numerous optimization variables involved. Traditional trajectory optimization methods address these challenges by formulating the problem as an optimization task, aiming to minimize cost functions, and to automatically discover contact sequences. Despite their structured approach, optimization-based methods face substantial difficulties, particularly because such formulations result in highly nonlinear and difficult to solve problems. To address these limitations, we propose CrEGOpt, a bi-level optimization method that combines traditional trajectory optimization with a black-box optimization scheme. CrEGOpt at the higher level employs the Mixed Distribution Cross-Entropy Method to optimize both the gait sequence and the phase durations, thus simplifying the lower level trajectory optimization problem. This approach allows for fast solutions of complex gait optimization problems. Extensive evaluation in simulated environments demonstrates that CrEGOpt can find solutions for biped, quadruped, and hexapod robots in under 10 seconds. This novel bi-level optimization scheme offers a promising direction for future research in automatic contact scheduling.
comment: 8 pages, 7 figures, Accepted at Humanoids 2024
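The CrEGOpt abstract above uses a cross-entropy method over a mix of discrete choices (the gait sequence) and continuous parameters (the phase durations). The following generic sketch shows that sampling pattern on a toy problem with one categorical variable and one Gaussian variable. It is not the paper's algorithm: the function `mixed_cem`, the toy cost, and all hyperparameters are assumptions chosen only for illustration.

```python
import random
import statistics


def mixed_cem(cost, n_choices, iters=30, pop=64, elite_frac=0.25, seed=0):
    """Cross-entropy search over one categorical and one continuous variable."""
    rng = random.Random(seed)
    probs = [1.0 / n_choices] * n_choices  # categorical distribution over choices
    mu, sigma = 0.0, 2.0                   # Gaussian over the continuous parameter
    n_elite = max(2, int(pop * elite_frac))
    for _ in range(iters):
        samples = []
        for _ in range(pop):
            k = rng.choices(range(n_choices), weights=probs)[0]
            x = rng.gauss(mu, sigma)
            samples.append((cost(k, x), k, x))
        samples.sort(key=lambda s: s[0])
        elites = samples[:n_elite]
        # Refit the categorical part from elite counts, with a small floor so
        # that no choice ever gets probability exactly zero.
        counts = [1e-3] * n_choices
        for _, k, _ in elites:
            counts[k] += 1.0
        total = sum(counts)
        probs = [c / total for c in counts]
        # Refit the Gaussian part from the elite continuous samples.
        xs = [x for _, _, x in elites]
        mu = statistics.fmean(xs)
        sigma = max(statistics.pstdev(xs), 1e-6)
    best_k = max(range(n_choices), key=lambda k: probs[k])
    return best_k, mu


def toy_cost(k, x):
    """Made-up cost: each discrete choice has its own best x and a fixed penalty."""
    targets = [3.0, -1.0, 0.5]
    penalties = [1.0, 0.0, 2.0]
    return (x - targets[k]) ** 2 + penalties[k]


if __name__ == "__main__":
    print(mixed_cem(toy_cost, n_choices=3))
```

In a bi-level scheme of the kind the abstract describes, the cost of each sample would come from solving the lower-level trajectory optimization for the sampled sequence and durations, rather than from a closed-form toy function.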
Absolute centrality in a signed Friedkin-Johnsen based model: a graphical characterisation of influence
This paper studies the evolution of opinions governed by a Friedkin-Johnsen (FJ) based model in arbitrary network structures with signed interactions. The agents contributing to the opinion formation are characterised as being influential. Initially, the agents are classified as opinion leaders and followers based on network connectivity and the nature of interactions. However, the addition of stubbornness leads to interesting behaviours wherein a non-influential agent can now become influential and vice versa. Thereafter, a signal flow graph (SFG) based method is proposed to quantify the influence of an influential agent's opinions. Additionally, it helps illustrate the role played by network topology in shaping the final opinions of the agents. Based on this analysis, the absolute centrality measure is proposed to determine the overall influence of all the agents in the network. Unlike most of the existing measures, it is applicable to any network structure and considers the effect of stubbornness and antagonism. Examples are presented throughout the paper to illustrate and validate these results.
comment: 13 pages
A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization
Two-stage adaptive robust optimization (ARO) is a powerful approach for planning under uncertainty, balancing first-stage decisions with recourse decisions made after uncertainty is realized. To account for uncertainty, modelers typically define a simple uncertainty set over which potential outcomes are considered. However, classical methods for defining these sets unintentionally capture a wide range of unrealistic outcomes, resulting in overly-conservative and costly planning in anticipation of unlikely contingencies. In this work, we introduce AGRO, a solution algorithm that performs adversarial generation for two-stage adaptive robust optimization using a variational autoencoder. AGRO generates high-dimensional contingencies that are simultaneously adversarial and realistic, improving the robustness of first-stage decisions at a lower planning cost than standard methods. To ensure generated contingencies lie in high-density regions of the uncertainty distribution, AGRO defines a tight uncertainty set as the image of "latent" uncertainty sets under the VAE decoding transformation. Projected gradient ascent is then used to maximize recourse costs over the latent uncertainty sets by leveraging differentiable optimization methods. We demonstrate the cost-efficiency of AGRO by applying it to both a synthetic production-distribution problem and a real-world power system expansion setting. We show that AGRO outperforms the standard column-and-constraint algorithm by up to 1.8% in production-distribution planning and up to 11.6% in power system expansion.
Residual-based Attention Physics-informed Neural Networks for Spatio-Temporal Ageing Assessment of Transformers Operated in Renewable Power Plants
Transformers are crucial for reliable and efficient power system operations, particularly in supporting the integration of renewable energy. Effective monitoring of transformer health is critical to maintain grid stability and performance. Thermal insulation ageing is a key transformer failure mode, which is generally tracked by monitoring the hotspot temperature (HST). However, HST measurement is complex, costly, and often estimated from indirect measurements. Existing HST models focus on space-agnostic thermal models, providing worst-case HST estimates. This article introduces a spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. The computational accuracy of the PINN model is improved through the implementation of the Residual-Based Attention (PINN-RBA) scheme that accelerates the PINN model convergence. The PINN-RBA model is benchmarked against self-adaptive attention schemes and classical vanilla PINN configurations. For the first time, PINN based oil temperature predictions are used to estimate spatio-temporal transformer winding temperature values, validated through PDE numerical solution and fiber optic sensor measurements. Furthermore, the spatio-temporal transformer ageing model is inferred, which supports transformer health management decision-making. Results are validated with a distribution transformer operating on a floating photovoltaic power plant.
comment: 23 pages, 18 figures
Solution of the Probabilistic Lambert Problem: Connections with Optimal Mass Transport, Schrödinger Bridge and Reaction-Diffusion PDEs
The Lambert problem originated in orbital mechanics. It concerns determining the initial velocity for a boundary value problem involving the dynamical constraint due to gravitational potential with additional time horizon and endpoint position constraints. Its solution has application in transferring a spacecraft from a given initial to a given terminal position within prescribed flight time via velocity control. We consider a probabilistic variant of the Lambert problem where the knowledge of the endpoint constraints in position vectors is replaced by the knowledge of their respective joint probability density functions. We show that the Lambert problem with endpoint joint probability density constraints is a generalized optimal mass transport (OMT) problem, thereby connecting this classical astrodynamics problem with a burgeoning area of research in modern stochastic control and stochastic machine learning. This newfound connection allows us to rigorously establish the existence and uniqueness of the solution for the probabilistic Lambert problem. The same connection also helps to numerically solve the probabilistic Lambert problem via diffusion regularization, i.e., by leveraging further connection of the OMT with the Schrödinger bridge problem (SBP). This also shows that the probabilistic Lambert problem with additive dynamic process noise is a generalized SBP, and can be solved numerically using the so-called Schrödinger factors, as we do in this work. Our analysis leads to solving a system of reaction-diffusion PDEs where the gravitational potential appears as the reaction rate.
Feedback Linearizable Discretizations of Second Order Mechanical Systems using Retraction Maps
Mechanical systems are most often described by a set of continuous-time, nonlinear, second-order differential equations (SODEs) of a particular structure governed by the covariant derivative. The digital implementation of controllers for such systems requires a discrete model of the system and hence requires numerical discretization schemes. Feedback linearizability of such sampled systems, however, depends on the discretization scheme employed. In this article, we utilize retraction maps and their lifts to construct feedback linearizable discretizations for SODEs which can be applied to many mechanical systems.
Identification For Control Based on Neural Networks: Approximately Linearizable Models
This work presents a control-oriented identification scheme for efficient control design and stability analysis of nonlinear systems. Neural networks are used to identify a discrete-time nonlinear state-space model to approximate time-domain input-output behavior of a nonlinear system. The network is constructed such that the identified model is approximately linearizable by feedback, ensuring that the control law trivially follows from the learning stage. After the identification and quasi-linearization procedures, linear control theory can be used to design robust controllers and study stability of the closed-loop system. The effectiveness and interest of the methodology are illustrated throughout the paper on popular benchmarks for system identification.
comment: 15 pages, 3 figures, 6 tables, accepted as a poster in SysDO 2024, Stuttgart, Germany
An Artificial Neural Network based approach for Harmonic Component Prediction in a Distribution Line
With the increasing use of nonlinear devices in both generation and consumption of power, it is essential that we develop accurate and quick control for active filters to suppress harmonics. Time delays between input and output are catastrophic for such filters which rely on real-time operation. Artificial Neural Networks (ANNs) are capable of modeling complex nonlinear systems through adjustments in their learned parameters. Once properly trained, they can produce highly accurate predictions at an instantaneous time frame. Leveraging these qualities, various complex control systems may be replaced or aided by neural networks to provide quick and precise responses. This paper proposes an ANN-based approach for the prediction of individual harmonic components using minimal inputs. By extracting and analyzing the nature of harmonic component magnitudes obtained from the survey of a particular area through real-time measurements, a sequential pattern in their occurrence is observed. Various neural network architectures are trained using the collected data and their performances are evaluated. The best-performing model, whose losses are minimal, is then used to observe the harmonic cancellation for multiple unseen cases through a simplified simulation in hardware-in-the-loop. These neural network structures, which produce instantaneous and accurate outputs, are effective in harmonic filtering.
Closed-Loop Sensitivity Identification for Cross-Directional Systems
At Diamond Light Source, the UK's national synchrotron facility, electron beam disturbances are attenuated by the fast orbit feedback (FOFB), which controls a cross-directional (CD) system with hundreds of inputs and outputs. Due to the inability to measure the disturbances in real-time, the closed-loop sensitivity of the FOFB can only be evaluated indirectly, making it difficult to compare FOFB algorithms and detect faults. Existing methods rely on comparing open-loop with closed-loop measurements, but they are prone to instabilities and actuator saturation because of the system's strong directionality. Here, we introduce a reference signal to estimate the complementary sensitivity in closed loop. By decoupling the system into sets of single-input, single-output (SISO) systems, the reference signal is designed mode-by-mode, accommodating the system's strong directionality. Additionally, a lower bound on the reference amplitude is derived to limit the estimation error in the presence of disturbances and measurement noise. This method enables the use of SISO system identification techniques, making it suitable for large-scale systems. It not only facilitates performance estimation of ill-conditioned CD systems in closed-loop but also provides a signal for fault detection. The potential applications of this approach extend to other CD systems, such as papermaking, steel rolling, or battery manufacturing processes.
Stable Reduced-Rank VAR Identification
The vector autoregression (VAR) has been widely used in system identification, econometrics, natural science, and many other areas. However, when the state dimension becomes large, the parameter dimension explodes, so rank-reduced modelling is attractive and well developed. A fundamental requirement in almost all applications is stability of the fitted model, yet this has not been addressed in the rank-reduced case. Here, we develop, for the first time, a closed-form formula for an estimator of a rank-reduced transition matrix which is guaranteed to be stable. We show that our estimator is consistent and asymptotically statistically efficient and illustrate it in comparative simulations.
comment: 17 pages, 6 figures
Data-driven distributionally robust MPC for systems with multiplicative noise: A semi-infinite semi-definite programming approach
This article introduces a novel distributionally robust model predictive control (DRMPC) algorithm for a specific class of controlled dynamical systems where the disturbance multiplies the state and control variables. These classes of systems arise in mathematical finance, where the paradigm of distributionally robust optimization (DRO) fits perfectly, and this serves as the primary motivation for this work. We recast the optimal control problem (OCP) as a semi-definite program with an infinite number of constraints, making the ensuing optimization problem a semi-infinite semi-definite program (SI-SDP). To numerically solve the SI-SDP, we advance an approach for solving convex semi-infinite programs (SIPs) to SI-SDPs and, subsequently, solve the DRMPC problem. A numerical example is provided to show the effectiveness of the algorithm.
comment: To appear in the proceedings of Mathematical Theory of Networks and Systems (MTNS) 2024
Learning Chaotic Dynamics with Embedded Dissipativity
Chaotic dynamics, commonly seen in weather systems and fluid turbulence, are characterized by their sensitivity to initial conditions, which makes accurate prediction challenging. Despite their sensitivity to initial perturbations, many chaotic systems exhibit dissipative behaviors and ergodicity. Therefore, various approaches have recently been proposed to develop data-driven models preserving invariant statistics over long horizons. Although these methods have shown empirical success in reducing instances of unbounded trajectory generation, many of the models are still prone to generating unbounded trajectories, leading to invalid statistics evaluation. In this paper, we propose a novel neural network architecture that simultaneously learns a dissipative dynamics emulator that is guaranteed to generate bounded trajectories and an energy-like function that governs the dissipative behavior. More specifically, by leveraging control-theoretic ideas, we derive algebraic conditions based on the learned energy-like function that ensure asymptotic convergence to an invariant level set. Using these algebraic conditions, our proposed model enforces dissipativity through a ReLU projection layer, which provides formal trajectory boundedness guarantees. Furthermore, the invariant level set provides an outer estimate for the strange attractor, which is known to be very difficult to characterize due to its complex geometry. We demonstrate the capability of our model in producing bounded long-horizon trajectory forecasts and characterizing the attractor for chaotic dynamical systems including Lorenz 96 and a truncated Kuramoto-Sivashinsky equation.
Understanding the Impact of Coalitions between EV Charging Stations
The rapid growth of electric vehicles (EVs) is driving the expansion of charging infrastructure globally. As charging stations become ubiquitous, their substantial electricity consumption can influence grid operation and electricity pricing. Naturally, \textit{some} groups of charging stations, which could be jointly operated by a company, may coordinate to decide their charging profile. While coordination among all charging stations is ideal, it is unclear if coordination of some charging stations is better than no coordination. In this paper, we analyze this intermediate regime between no and full coordination of charging stations. We model EV charging as a non-cooperative aggregative game, where each station's cost is determined by both monetary payments tied to reactive electricity prices on the grid and its sensitivity to deviations from a desired charging profile. We consider a solution concept that we call $\mathcal{C}$-Nash equilibrium, which is tied to a coalition $\mathcal{C}$ of charging stations coordinating to reduce their costs. We provide sufficient conditions, in terms of the demand and sensitivity of charging stations, to determine when independent (aka uncoordinated) operation of charging stations could result in lower overall costs to charging stations, coalition and charging stations outside the coalition. Somewhat counter to common intuition, we show numerical instances where allowing charging stations to operate independently is better than coordinating a subset of stations as a coalition. Jointly, these results provide operators of charging stations insights into how to coordinate their charging behavior, and open several research directions.
comment: 20 pages, 5 figures
BVE + EKF: A viewpoint estimator for the estimation of the object's position in the 3D task space using Extended Kalman Filters
RGB-D sensors face multiple challenges when operating in open-field environments because of their sensitivity to external perturbations such as radiation or rain. Multiple works approach the challenge of perceiving the 3D position of objects using monocular cameras. However, most of these works focus mainly on deep learning-based solutions, which are complex, data-driven, and difficult to predict. We therefore approach the problem of predicting an object's 3D position using a Gaussian viewpoint estimator named best viewpoint estimator (BVE) powered by an extended Kalman filter (EKF). The algorithm proved efficient on the tasks and reached a maximum average Euclidean error of about 32 mm. The experiments were deployed and evaluated in MATLAB using artificial Gaussian noise. Future work aims to implement the system in a robotic system.
comment: Accepted to ICINCO - 21st International Conference on Informatics in Control, Automation and Robotics
ERIC: Estimating Rainfall with Commodity Doorbell Camera for Precision Residential Irrigation
Current state-of-the-art residential irrigation systems, such as WaterMyYard, rely on rainfall data from nearby weather stations to adjust irrigation amounts. However, the accuracy of rainfall data is compromised by the limited spatial resolution of rain gauges and the significant variability of hyperlocal rainfall, leading to substantial water waste. To improve irrigation efficiency, we developed a cost-effective irrigation system, dubbed ERIC, which employs machine learning models to estimate rainfall from commodity doorbell camera footage and optimizes irrigation schedules without human intervention. Specifically, we: a) designed novel visual and audio features with lightweight neural network models to infer rainfall from the camera at the edge, preserving user privacy; b) built a complete end-to-end irrigation system on Raspberry Pi 4, costing only \$75. We deployed the system across five locations (collecting over 750 hours of video) with varying backgrounds and light conditions. Comprehensive evaluation validates that ERIC achieves state-of-the-art rainfall estimation performance ($\sim$ 5mm/day), saving 9,112 gallons/month of water, translating to \$28.56/month in utility savings. Data and code are available at https://github.com/LENSS/ERIC-BuildSys2024.git
comment: BuildSys 2024
Second-Order Algorithms for Finding Local Nash Equilibria in Zero-Sum Games
Zero-sum games arise in a wide variety of problems, including robust optimization and adversarial learning. However, algorithms deployed for finding a local Nash equilibrium in these games often converge to non-Nash stationary points. This highlights a key challenge: for any algorithm, the stability properties of its underlying dynamical system can cause non-Nash points to be potential attractors. To overcome this challenge, algorithms must account for subtleties involving the curvatures of players' costs. To this end, we leverage dynamical system theory and develop a second-order algorithm for finding a local Nash equilibrium in the smooth, possibly nonconvex-nonconcave, zero-sum game setting. First, we prove that this novel method guarantees convergence to only local Nash equilibria with a local linear convergence rate. We then interpret a version of this method as a modified Gauss-Newton algorithm with local superlinear convergence to the neighborhood of a point that satisfies first-order local Nash equilibrium conditions. In comparison, current related state-of-the-art methods do not offer convergence rate guarantees. Furthermore, we show that this approach naturally generalizes to settings with convex and potentially coupled constraints while retaining earlier guarantees of convergence to only local (generalized) Nash equilibria.
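The curvature conditions mentioned in the abstract above can be illustrated on a scalar zero-sum game in which x minimizes f(x, y) and y maximizes it. The following is a minimal sketch, not the paper's algorithm: it only checks the standard sufficient conditions for a strict local Nash equilibrium (zero gradient, positive curvature in x, negative curvature in y) using finite differences. The function names and tolerances are illustrative assumptions.

```python
def partial(f, x, y, wrt, h=1e-6):
    """Central finite-difference first derivative of f(x, y) w.r.t. 'x' or 'y'."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)


def curvature(f, x, y, wrt, h=1e-4):
    """Central finite-difference second derivative of f(x, y) w.r.t. 'x' or 'y'."""
    if wrt == "x":
        return (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / (h * h)
    return (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / (h * h)


def is_strict_local_nash(f, x, y, tol=1e-6):
    """Check sufficient conditions for a strict local Nash point of min_x max_y f.

    Hypothetical helper: x is the minimizing player, y the maximizing player.
    """
    stationary = (abs(partial(f, x, y, "x")) < tol
                  and abs(partial(f, x, y, "y")) < tol)
    x_is_local_min = curvature(f, x, y, "x") > tol    # f is convex in x here
    y_is_local_max = curvature(f, x, y, "y") < -tol   # f is concave in y here
    return stationary and x_is_local_min and y_is_local_max


# The origin is stationary for both games, but only the first one is a local Nash point.
print(is_strict_local_nash(lambda x, y: x**2 - y**2 + x * y, 0.0, 0.0))
print(is_strict_local_nash(lambda x, y: -x**2 - y**2, 0.0, 0.0))
```

In the second game the origin is a stationary point at which the minimizing player sits on a local maximum, which is the kind of non-Nash stationary point the abstract warns about.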
Hybrid Feedback for Three-dimensional Convex Obstacle Avoidance (Extended version)
We propose a hybrid feedback control scheme for the autonomous robot navigation problem in three-dimensional environments with arbitrarily-shaped convex obstacles. The proposed hybrid control strategy, which consists in switching between the move-to-target mode and the obstacle-avoidance mode, guarantees global asymptotic stability of the target location in the obstacle-free workspace. We also provide a procedure for the implementation of the proposed hybrid controller in a priori unknown environments and validate its effectiveness through simulation results.
comment: 13 pages, 6 figures
Synthesis of General Decoupling Networks Using Transmission Lines
In this paper, we introduce a synthesis technique for transmission line based decoupling networks, which find application in coupled systems such as multiple-antenna systems and compact antenna arrays. Employing the generalized $\pi$-network and the transmission line analysis technique, we reduce the decoupling network design into simple matrix calculations. The synthesized decoupling network is essentially a generalized $\pi$-network with transmission lines at all branches. Standard electrical lengths of $3\lambda/8$ and $5\lambda/8$ are chosen to simplify the physical implementation, leaving the characteristic impedances of the transmission line branches as the main design parameters. The advantage of this proposed decoupling network is that it can be implemented using transmission lines, ensuring better control over loss, performance consistency and higher power handling capability when compared with lumped components, and can be easily scaled for operation at different frequencies. A two-port microstrip antenna system at 1.2 GHz and a three-port monopole antenna system at 1 GHz are investigated to demonstrate the validity of the proposed synthesis method, and perfect decoupling ($S_{21}<-50$dB) is achieved at both design frequencies.
comment: 5 pages
Systems and Control (EESS)
Custom Non-Linear Model Predictive Control for Obstacle Avoidance in Indoor and Outdoor Environments
Navigating complex environments requires Unmanned Aerial Vehicles (UAVs) and autonomous systems to perform trajectory tracking and obstacle avoidance in real-time. While many control strategies have effectively utilized linear approximations, addressing the non-linear dynamics of UAVs, especially in obstacle-dense environments, remains a key challenge that requires further research. This paper introduces a Non-linear Model Predictive Control (NMPC) framework for the DJI Matrice 100, addressing these challenges by using a dynamic model and B-spline interpolation for smooth reference trajectories, ensuring minimal deviation while respecting safety constraints. The framework supports various trajectory types and employs a penalty-based cost function for control accuracy in tight maneuvers. The framework utilizes CasADi for efficient real-time optimization, enabling the UAV to maintain robust operation even under tight computational constraints. Simulation and real-world indoor and outdoor experiments demonstrated the NMPC's ability to adapt to disturbances, resulting in smooth, collision-free navigation.
comment: This manuscript has 7 pages and 8 figures, detailing NMPC for UAV obstacle avoidance using DJI UAVs. It features simulations, experimental results, and uses CasADi for optimization with ROS integration. Code and media at https://github.com/larasupernovae/nmpc_flash_multi_obstacle
Numerical optimal control for delay differential equations: A simultaneous approach based on linearization of the delayed state
Time delays are ubiquitous in industry, and they must be accounted for when designing control strategies. However, numerical optimal control (NOC) of delay differential equations (DDEs) is challenging because it requires specialized discretization methods and the time delays may depend on the manipulated inputs or state variables. Therefore, in this work, we propose to linearize the delayed states around the current time. This results in a set of implicit differential equations, and we compare the steady states and the corresponding stability criteria of the DDEs and the approximate system. Furthermore, we propose a simultaneous approach for NOC of DDEs based on the linearization, and we discretize the approximate system using Euler's implicit method. Finally, we present a numerical example involving a molten salt nuclear fission reactor.
comment: 6 pages, 4 figures, submitted to a conference
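The linearization idea in the abstract above can be sketched on a scalar test problem. This is an illustrative example under stated assumptions, not the paper's formulation: the test equation x'(t) = -a x(t - tau) and all function names are chosen here for illustration. The first-order Taylor expansion x(t - tau) ≈ x(t) - tau x'(t) turns the delay equation into the implicit differential equation (1 - a tau) x'(t) = -a x(t), which is then discretized with the implicit Euler method. The expansion is only reasonable for small tau, and the sketch requires a tau < 1.

```python
import math


def simulate_linearized_dde(a, tau, x0, h, n_steps):
    """Implicit Euler for (1 - a*tau) x' = -a x, the linearized form of x'(t) = -a x(t - tau).

    Implicit Euler step: (1 - a*tau) * (x_next - x) / h = -a * x_next,
    which for this scalar linear problem can be solved for x_next in closed form.
    """
    mass = 1.0 - a * tau
    if mass <= 0.0:
        raise ValueError("this sketch assumes a * tau < 1")
    xs = [x0]
    for _ in range(n_steps):
        xs.append(mass * xs[-1] / (mass + a * h))
    return xs


def exact_linearized(a, tau, x0, t):
    """Exact solution of the approximate system (not of the original delay equation)."""
    return x0 * math.exp(-a * t / (1.0 - a * tau))


xs = simulate_linearized_dde(a=1.0, tau=0.1, x0=1.0, h=0.01, n_steps=100)
print(xs[-1], exact_linearized(1.0, 0.1, 1.0, 1.0))
```

For nonlinear dynamics the implicit Euler step no longer has a closed form and would be imposed as an equality constraint in a simultaneous optimal control formulation.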
IC3M: In-Car Multimodal Multi-object Monitoring for Abnormal Status of Both Driver and Passengers
Recently, in-car monitoring has emerged as a promising technology for detecting early-stage abnormal status of the driver and providing timely alerts to prevent traffic accidents. Although training models with multimodal data enhances the reliability of abnormal status detection, the scarcity of labeled data and the imbalance of class distribution impede the extraction of critical abnormal state features, significantly deteriorating training performance. Furthermore, missing modalities due to environment and hardware limitations further exacerbate the challenge of abnormal status identification. More importantly, monitoring abnormal health conditions of passengers, particularly in elderly care, is of paramount importance but remains underexplored. To address these challenges, we introduce our IC3M, an efficient camera-rotation-based multimodal framework for monitoring both driver and passengers in a car. Our IC3M comprises two key modules: an adaptive threshold pseudo-labeling strategy and a missing modality reconstruction. The former customizes pseudo-labeling thresholds for different classes based on the class distribution, generating class-balanced pseudo labels to guide model training effectively, while the latter leverages crossmodality relationships learned from limited labels to accurately recover missing modalities by distribution transferring from available modalities. Extensive experimental results demonstrate that IC3M outperforms state-of-the-art benchmarks in accuracy, precision, and recall while exhibiting superior robustness under limited labeled data and severe missing modality.
comment: 16 pages, 17 figures
Toward Neuronal Implementations of Delayed Optimal Control
Animal sensorimotor behavior is frequently modeled using optimal controllers. However, it is unclear how the neuronal circuits within the animal's nervous system implement optimal controller-like behavior. In this work, we study the question of implementing a delayed linear quadratic regulator with linear dynamical "neurons" on a muscle model. We show that for any second-order controller, there are three minimal neural circuit configurations that implement the same controller. Furthermore, the firing rate characteristics of each circuit can vary drastically, even as the overall controller behavior is preserved. Along the way, we introduce concepts that bridge controller realizations to neural implementations that are compatible with known neuronal delay structures.
comment: Submitted to IEEE American Control Conference
Automated Music Therapy for Anxiety and Depression Management in Older People (AMITY)
The onset of old age brings physiological and mental changes, with anxiety and depression being common mental disorders that can trigger other health issues and reduce lifespan. However, due to a global shortage of mental health professionals, combined with a growing population and limited awareness, these disorders often go undiagnosed. Music therapy offers a reliable method to address psychological, emotional, and cognitive needs. This paper presents an approach that monitors anxiety and depression symptoms in real time using low-complexity body sensors, followed by automated personalised music therapy, reducing the dependence on therapists and improving mental health care accessibility.
comment: 10 pages, 5 figures
SwarmCVT: Centroidal Voronoi Tessellation-Based Path Planning for Very-Large-Scale Robotics
Swarm robotics, or very large-scale robotics (VLSR), has many meaningful applications for complicated tasks. However, the complexity of motion control and energy costs stack up quickly as the number of robots increases. In addressing this problem, our previous studies have formulated various methods employing macroscopic and microscopic approaches. These methods enable microscopic robots to adhere to a reference Gaussian mixture model (GMM) distribution observed at the macroscopic scale. As a result, optimizing the macroscopic level will result in an optimal overall result. However, all these methods require systematic and global generation of Gaussian components (GCs) within obstacle-free areas to construct the GMM trajectories. This work utilizes centroidal Voronoi tessellation to generate GCs methodically. Consequently, it demonstrates performance improvement while also ensuring consistency and reliability.
comment: Submitted to American Control Conference (ACC) 2025
Behavior Trees in Functional Safety Supervisors for Autonomous Vehicles
The rapid advancements in autonomous vehicle software present both opportunities and challenges, especially in enhancing road safety. The primary objective of autonomous vehicles is to reduce accident rates through improved safety measures. However, the integration of new algorithms into the autonomous vehicle, such as Artificial Intelligence methods, raises concerns about compliance with established safety regulations. This paper introduces a novel software architecture based on behavior trees, aligned with established standards and designed to supervise vehicle functional safety in real time. It specifically addresses the integration of algorithms into industrial road vehicles, adhering to ISO 26262. The proposed supervision methodology involves the detection of hazards and compliance with functional and technical safety requirements when a hazard arises. This methodology, implemented in this study in a Renault Mégane (currently at SAE level 3 of automation), not only guarantees compliance with safety standards, but also paves the way for safer and more reliable autonomous driving technologies.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Load Balancing-based Topology Adaptation for Integrated Access and Backhaul Networks
Integrated access and backhaul (IAB) technology is a flexible solution for network densification. IAB nodes can also be deployed in moving nodes such as buses and trains, i.e., mobile IAB (mIAB). As mIAB nodes can move around the coverage area, the connection between mIAB nodes and their parent macro base stations (BSs), the IAB donors, sometimes needs to change in order to keep an acceptable backhaul link, a procedure known as topology adaptation (TA). The change from one IAB donor to another may strongly impact the system load distribution, possibly causing unsatisfactory backhaul service due to the lack of radio resources. Based on this, TA should consider both backhaul link quality and traffic load. In this work, we propose a load balancing algorithm based on TA for IAB networks, and compare it with an approach in which TA is triggered based on reference signal received power (RSRP) only. The results show that our proposed algorithm improves the throughput of the passengers' worst connections in uplink (UL) and, more modestly, also in downlink (DL), without significantly impairing the pedestrians' quality of service (QoS).
comment: Paper submitted to Journal of Communication and Information Systems (JCIS)
Cellular Network Densification: a System-level Analysis with IAB, NCR and RIS
As the number of user equipments increases in fifth generation (5G) and beyond, it is desired to densify the cellular network with auxiliary nodes assisting the base stations. Examples of these nodes are integrated access and backhaul (IAB) nodes, network-controlled repeaters (NCRs) and reconfigurable intelligent surfaces (RISs). In this context, this work presents a system level overview of these three nodes. Moreover, this work evaluates through simulations the impact of network planning aiming at enhancing the performance of a network used to cover an outdoor sport event. We show that, in the considered scenario, in general, IAB nodes provide an improved signal to interference-plus-noise ratio and throughput, compared to NCRs and RISs. However, there are situations where NCR outperforms IAB due to higher level of interference caused by the latter. Finally, we show that the deployment of these nodes in unmanned aerial vehicles (UAVs) also achieves performance gains due to their aerial mobility. However, UAV constraints related to aerial deployment may prevent these nodes from reaching results as good as the ones achieved by their stationary deployment.
comment: Paper submitted to IEEE Systems Journal
Cross-Domain Comparative Analysis of Digital Twins and Universalised Solutions
Digitalisation is one of the main drivers of most economic sectors nowadays, and the digital twin, as a reification of digitalisation for complex systems, has attracted much attention from both academia and industry. There have been studies focusing on digital twins in a specific sector, but few offer insightful comparisons of digital twins across domains. Considering that digital twinning is a cross-domain transformation, it is beneficial to establish the principles of universality and variation that can explain similarities and differences among digital twins. This paper first delivers a comparative analysis of digital twins in five domains through a six-dimensional characterisation framework. Then, starting from the correlations among domain-specific DT developments, a cross-domain Digital Twin Platform-as-a-Service (DT-PaaS) is proposed to universalise the common processes, tools and applications while remaining inclusive of the variations of every digital twin instance. As a centralised data, modeling and service platform, it is expected to break the barriers between domains by enabling cross-domain digital twin data sharing, interoperability and development synergy, and to help tackle complex global challenges such as the climate challenge, net zero and pandemics.
Equivalence between Geometric Frequency and Lagrange Derivative
The paper shows the equivalence between the geometric frequency of an electric quantity, namely, voltage and current, and the Lagrange derivative of a stream-line of a fluid. The geometric frequency is a concept recently proposed by the author and is a generalization of the instantaneous frequency, a quantity that is particularly important for the analysis and the control of electric power systems. On the other hand, the Lagrange derivative is mostly utilized in fluid dynamics and helps decomposing the time derivative into various components. The paper shows how these components relate to the elements of the geometric frequency. The paper also shows, through a variety of numerical examples, how the decomposition of the Lagrange derivative helps identifying the distortion of the waveform of a measured electric quantity and how this information can be utilized to classify system operating conditions.
Semantic Communication and Control Co-Design for Multi-Objective Correlated Dynamics
This letter introduces a machine-learning approach to learning the semantic dynamics of correlated systems with different control rules and dynamics. By leveraging the Koopman operator in an autoencoder (AE) framework, the system's state evolution is linearized in the latent space using a dynamic semantic Koopman (DSK) model, capturing the baseline semantic dynamics. Signal temporal logic (STL) is incorporated through a logical semantic Koopman (LSK) model to encode system-specific control rules. These models form the proposed logical Koopman AE framework that reduces communication costs while improving state prediction accuracy and control performance, showing a 91.65% reduction in communication samples and significant performance gains in simulation.
Optimal $H_{\infty}$ control based on stable manifold of discounted Hamilton-Jacobi-Isaacs equation
The optimal \(H_{\infty}\) control problem over an infinite time horizon, which incorporates a performance function with a discount factor \(e^{-\alpha t}\) (\(\alpha > 0\)), is important in various fields. Solving this optimal \(H_{\infty}\) control problem is equivalent to addressing a discounted Hamilton-Jacobi-Isaacs (HJI) partial differential equation. In this paper, we first provide a precise estimate for the discount factor \(\alpha\) that ensures the existence of a nonnegative stabilizing solution to the HJI equation. This stabilizing solution corresponds to the stable manifold of the characteristic system of the HJI equation, which is a contact Hamiltonian system due to the presence of the discount factor. Secondly, we demonstrate that approximating the optimal controller in a natural manner results in a closed-loop system with a finite \(L_2\)-gain that is nearly less than the gain of the original system. Thirdly, based on the theoretical results obtained, we propose a deep learning algorithm to approximate the optimal controller using the stable manifold of the contact Hamiltonian system associated with the HJI equation. Finally, we apply our method to the \(H_{\infty}\) control of the Allen-Cahn equation to illustrate its effectiveness.
Physics-Constrained Taylor Neural Networks for Learning and Control of Dynamical Systems
Data-driven approaches are increasingly popular for identifying dynamical systems due to improved accuracy and availability of sensor data. However, relying solely on data for identification does not guarantee that the identified systems will maintain their physical properties or that the predicted models will generalize well. In this paper, we propose a novel method for system identification by integrating a neural network as the first-order derivative of a Taylor series expansion instead of learning a dynamical function directly. This approach, called Monotonic Taylor Neural Networks (MTNN), aims to ensure monotonic properties of dynamical systems by constraining the conditions for the output of the neural networks model to be either always non-positive or non-negative. These conditions are constructed in two ways: by designing a new neural network architecture or by regularizing the loss function for training. The proposed method demonstrates better performance compared to methods without constraints on the monotonic properties of the systems when tested with experimental data from two real-world systems, including HVAC and TCLab. Furthermore, MTNN shows good performance in an actual control application when using a model predictive controller for a nonlinear MIMO system, illustrating the practical applications of this method.
C-MORL: Multi-Objective Reinforcement Learning through Efficient Discovery of Pareto Front
Multi-objective reinforcement learning (MORL) excels at handling rapidly changing preferences in tasks that involve multiple criteria, even for unseen preferences. However, previously dominant MORL methods typically generate a fixed policy set or preference-conditioned policy through multiple training iterations exclusively for sampled preference vectors, and cannot ensure the efficient discovery of the Pareto front. Furthermore, integrating preferences into the input of policy or value functions presents scalability challenges, in particular as the dimensions of the state and preference spaces grow, which can complicate the learning process and hinder the algorithm's performance on more complex tasks. To address these issues, we propose a two-stage Pareto front discovery algorithm called Constrained MORL (C-MORL), which serves as a seamless bridge between constrained policy optimization and MORL. Concretely, a set of policies is trained in parallel in the initialization stage, with each optimized towards its individual preference over the multiple objectives. Then, to fill the remaining vacancies in the Pareto front, constrained optimization steps are employed to maximize one objective while constraining the other objectives to exceed a predefined threshold. Empirically, compared to recent advancements in MORL methods, our algorithm achieves more consistent and superior performance in terms of hypervolume, expected utility, and sparsity on both discrete and continuous control tasks, especially with numerous objectives (up to nine objectives in our experiments).
comment: 27 pages, 8 figures. In Submission to a conference
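The two notions the C-MORL abstract relies on, Pareto dominance and "maximize one objective subject to thresholds on the others", can be sketched on a finite set of candidate return vectors. This is an illustrative sketch only: it selects among fixed candidates and does not perform any policy optimization, and the function names and sample data are hypothetical. All objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly better in one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def constrained_best(points, target, thresholds):
    """Pick the candidate with the largest `target` objective among those whose
    other objectives meet the thresholds (a dict mapping objective index to bound)."""
    feasible = [p for p in points
                if all(p[i] >= bound for i, bound in thresholds.items())]
    return max(feasible, key=lambda p: p[target]) if feasible else None


# Hypothetical two-objective returns of five candidate policies.
returns = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
print(pareto_front(returns))                # non-dominated candidates
print(constrained_best(returns, 0, {1: 4}))  # best objective 0 with objective 1 >= 4
```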
SEAL: SEmantic-Augmented Imitation Learning via Language Model
Hierarchical Imitation Learning (HIL) is a promising approach for tackling long-horizon decision-making tasks. However, it is challenging due to the lack of detailed supervisory labels for sub-goal learning and its reliance on hundreds to thousands of expert demonstrations. In this work, we introduce SEAL, a novel framework that leverages the powerful semantic and world knowledge of Large Language Models (LLMs) both for specifying the sub-goal space and for pre-labeling states to semantically meaningful sub-goal representations without prior knowledge of task hierarchies. SEAL employs a dual-encoder structure, combining supervised LLM-guided sub-goal learning with unsupervised Vector Quantization (VQ) for more robust sub-goal representations. Additionally, SEAL incorporates a transition-augmented low-level planner for improved adaptation to sub-goal transitions. Our experiments demonstrate that SEAL outperforms state-of-the-art HIL methods and LLM-based planning approaches, particularly in settings with small expert datasets and complex long-horizon tasks.
comment: 18 pages, 5 figures, in submission
Simulation Results of Center-Manifold-Based Identification of Polynomial Nonlinear Systems with Uncontrollable Linearization
Recently, a system identification method based on the center manifold was proposed to identify polynomial nonlinear systems with uncontrollable linearization. This note presents a numerical example to show the effectiveness of this method.
Guaranteed-Safe MPPI Through Composite Control Barrier Functions for Efficient Sampling in Multi-Constrained Robotic Systems
We present a new guaranteed-safe model predictive path integral (GS-MPPI) control algorithm that enhances sample efficiency in nonlinear systems with multiple safety constraints. The approach uses a composite control barrier function (CBF) along with MPPI to ensure all sampled trajectories are provably safe. We first construct a single CBF constraint from multiple safety constraints with potentially differing relative degrees, using it to create a safe closed-form control law. This safe control is then integrated into the system dynamics, allowing MPPI to optimize over exclusively safe trajectories. The method not only improves computational efficiency but also addresses the myopic behavior often associated with CBFs by incorporating long-term performance considerations. We demonstrate the algorithm's effectiveness through simulations of a nonholonomic ground robot subject to position and speed constraints, showcasing safety and performance.
comment: Preprint submitted to American Control Conference (ACC) 2025
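For readers unfamiliar with MPPI, the sampling-and-weighting step that the abstract above builds on can be sketched as follows. This shows only the standard exponential cost weighting for a scalar control, under the assumption that the trajectory costs have already been computed; it does not include the composite CBF or the safe control law of GS-MPPI. The function and parameter names are illustrative.

```python
import math


def mppi_update(nominal_control, perturbations, costs, temperature):
    """One scalar MPPI step: cost-weighted average of sampled control perturbations.

    Each sample k applied nominal_control + perturbations[k] and incurred costs[k].
    Lower-cost samples receive exponentially larger weights.
    """
    min_cost = min(costs)  # subtracting the minimum avoids underflow in exp
    unnormalized = [math.exp(-(c - min_cost) / temperature) for c in costs]
    total = sum(unnormalized)
    weights = [w / total for w in unnormalized]
    update = sum(w * e for w, e in zip(weights, perturbations))
    return nominal_control + update, weights


# Hypothetical samples: the first perturbation led to a much cheaper rollout.
new_control, weights = mppi_update(
    nominal_control=0.0,
    perturbations=[0.3, -0.3, 0.1],
    costs=[1.0, 50.0, 4.0],
    temperature=1.0,
)
print(new_control, weights)
```

A smaller temperature concentrates the weight on the cheapest samples, while a larger one approaches a plain average of the perturbations.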
A Miniature Potentiostat for Impedance Spectroscopy and Cyclic Voltammetry in Wearable Sensor Integration
A potentiostat is an analytical device and a crucial component in electrochemical instruments used for studying chemical reaction mechanisms, with potential applications in early diagnosis of disease or critical health conditions. Conventional potentiostats are typically benchtop devices designed for laboratory use, whereas a wearable potentiostat can be interfaced with biochemical sensors for disease diagnostics at home. This work presents a low-power potentiostat designed to connect with a sensor array consisting of eight to ten working electrodes. The potentiostat is capable of running Electrochemical Impedance Spectroscopy and Cyclic Voltammetry. The system is powered by lithium-ion batteries and uses Bluetooth for data transmission to the user. A single ARM M4 microcontroller, integrated with a Bluetooth low-energy radio module, controls the entire system. The accuracy, reliability, and power efficiency of the potentiostat were evaluated and compared against existing commercial benchtop potentiostats. Additionally, we have outlined future steps to enhance circuit miniaturization and power efficiency, aiming to develop fully integrated wearable sensing devices comparable in size to a wristwatch.
Resource Allocation Based on Optimal Transport Theory in ISAC-Enabled Multi-UAV Networks
This paper investigates the resource allocation optimization for cooperative communication with non-cooperative localization in integrated sensing and communications (ISAC)-enabled multi-unmanned aerial vehicle (UAV) cooperative networks. Our goal is to maximize the weighted sum of the system's average sum rate and the localization quality of service (QoS) by jointly optimizing cell association, communication power allocation, and sensing power allocation. Since the formulated problem is a mixed-integer nonconvex problem, we propose the alternating iteration algorithm based on optimal transport theory (AIBOT) to solve the optimization problem more effectively. Simulation results demonstrate that the AIBOT can improve the system sum rate by nearly 12% and reduce the localization Cramér-Rao bound (CRB) by almost 29% compared to benchmark algorithms.
Lossy Cooperative UAV Relaying Networks: Outage Probability Analysis and Location Optimization
In this paper, the performance of a lossy cooperative unmanned aerial vehicle (UAV) relay communication system is analyzed. In this system, the UAV relay adopts a lossy forward (LF) strategy and the receiver has certain distortion requirements for the received information. For this system, we first derive the achievable rate-distortion region. Then, on the basis of the region analysis, the system outage probability when the channel suffers Nakagami-$m$ fading is analyzed. Finally, we design an optimal relay position identification algorithm based on the Soft Actor-Critic (SAC) algorithm, which determines the optimal UAV position to minimize the outage probability. The simulation results show that the proposed algorithm can optimize the UAV position and reduce the system outage probability effectively.
Safety Verification of Stochastic Systems: A Set-Erosion Approach
We study the safety verification problem for discrete-time stochastic systems. We propose an approach for safety verification termed set-erosion strategy that verifies the safety of a stochastic system on a safe set through the safety of its associated deterministic system on an eroded subset. The amount of erosion is captured by the probabilistic bound on the distance between stochastic trajectories and their associated deterministic counterpart. Building on our recent work [1], we establish a sharp probabilistic bound on this distance. Combining this bound with the set-erosion strategy, we establish a general framework for the safety verification of stochastic systems. Our method is flexible and can work effectively with any deterministic safety verification techniques. We exemplify our method by incorporating barrier functions designed for deterministic safety verification, obtaining barrier certificates much tighter than existing results. Numerical experiments are conducted to demonstrate the efficacy and superiority of our method.
Safe Navigation in Unmapped Environments for Robotic Systems with Input Constraints
This paper presents an approach for navigation and control in unmapped environments under input and state constraints using a composite control barrier function (CBF). We consider the scenario where real-time perception feedback (e.g., LiDAR) is used online to construct a local CBF that models local state constraints (e.g., local safety constraints such as obstacles) in the a priori unmapped environment. The approach employs a soft-maximum function to synthesize a single time-varying CBF from the N most recently obtained local CBFs. Next, the input constraints are transformed into controller-state constraints through the use of control dynamics. Then, we use a soft-minimum function to compose the input constraints with the time-varying CBF that models the a priori unmapped environment. This composition yields a single relaxed CBF, which is used in a constrained optimization to obtain an optimal control that satisfies the state and input constraints. The approach is validated through simulations of a nonholonomic ground robot that is equipped with LiDAR and navigates an unmapped environment. The robot successfully navigates the environment while avoiding the a priori unmapped obstacles and satisfying both speed and input constraints.
comment: Preprint submitted to 2025 American Control Conference (ACC). arXiv admin note: substantial text overlap with arXiv:2409.01458
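The soft-maximum and soft-minimum compositions mentioned in this abstract can be illustrated with a small sketch. It assumes the common log-sum-exp forms (with the mean inside the logarithm for the soft maximum, so that it never exceeds the true maximum); the abstract does not give the exact expressions, and the function names, the sharpness parameter `kappa`, and the sample values are hypothetical.

```python
import math

def soft_min(values, kappa=10.0):
    """Smooth lower bound on min(values); approaches min as kappa grows."""
    m = min(values)
    # Shift by the minimum for numerical stability before exponentiating.
    return m - math.log(sum(math.exp(-kappa * (v - m)) for v in values)) / kappa

def soft_max(values, kappa=10.0):
    """Smooth lower bound on max(values) (assumed form: mean inside the log)."""
    m = max(values)
    n = len(values)
    return m + math.log(sum(math.exp(kappa * (v - m)) for v in values) / n) / kappa

def composite_cbf(local_cbf_values, input_margins, kappa=10.0):
    """Union of local safe sets (soft max), intersected with input limits (soft min)."""
    h_env = soft_max(local_cbf_values, kappa)
    return soft_min([h_env] + list(input_margins), kappa)

# Hypothetical values: three local CBFs evaluated at the current state,
# and two input-constraint margins (positive means the constraint holds).
h = composite_cbf([-0.2, 0.5, 0.1], [0.3, 0.4])
```

Because both approximations err on the conservative side, a positive composite value implies that at least one local CBF and every input margin are positive.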
Information-Driven Search and Track of Novel Space Objects
Space surveillance depends on efficiently directing sensor resources to maintain custody of known catalog objects. However, it remains unclear how to best utilize these resources to rapidly search for and track newly detected space objects. Provided a novel measurement, a search set can be instantiated through admissible region constraints to inform follow-up observations. Lacking well-constrained bounds, this set rapidly spreads in the along-track direction, growing much larger than a follow-up sensor's finite field of view. Moreover, the number of novel objects may be uncertain, and follow-up observations are most commonly corrupted by false positives from known catalog objects and missed detections. In this work, we address these challenges through the introduction of a joint sensor control and multi-target tracking approach. The search set associated with a novel measurement is represented by a Cardinalized Probability Hypothesis Density (CPHD), which jointly tracks the state uncertainty associated with a set of objects and a probability mass function for the true target number. In follow-up sensor scans, the information contained in an empty measurement set, and in returns from both novel objects and known catalog objects, is succinctly captured through this paradigm. To maximize the utility of a follow-up sensor, we introduce an information-driven sensor control approach for steering the instrument. Our methods are tested on two relevant test cases and we provide a comparative analysis with current naive tasking strategies.
comment: Submitted to the Journal of Astronautical Sciences
Learning Optimal Control and Dynamical Structure of Global Trajectory Search Problems with Diffusion Models
Spacecraft trajectory design is a global search problem, where previous work has revealed specific solution structures that can be captured with data-driven methods. This paper explores two global search problems in the circular restricted three-body problem: hybrid cost function of minimum fuel/time-of-flight and transfers to energy-dependent invariant manifolds. These problems display a fundamental structure either in the optimal control profile or the use of dynamical structures. We build on our prior generative machine learning framework to apply diffusion models to learn the conditional probability distribution of the search problem and analyze the model's capability to capture these structures.
comment: This paper was presented at the AAS/AIAA Astrodynamics Specialist Conference
Analyzing Fitts' Law using Offline and Online Optimal Control with Motor Noise
The cause of the speed-accuracy tradeoff (typically quantified via Fitts' Law) is a debated topic of interest in motor neuroscience, and is commonly studied using tools from control theory. Two prominent theories involve the presence of signal dependent motor noise and planning variability -- these factors are generally incorporated separately. In this work, we study how well the simultaneous presence of both factors explains the speed-accuracy tradeoff. A human arm reaching model is developed with bio-realistic signal dependent motor noise, and a Gaussian noise model is used to deterministically approximate the motor noise. Both offline trajectory optimization and online model predictive control are used to simulate the planning and execution of several different reaching tasks with varying target sizes and movement durations. These reaching trajectories are then compared to experimental human reaching data, revealing that both models produce behavior consistent with humans, and the speed-accuracy tradeoff is present in both online and offline control. These results suggest the speed-accuracy tradeoff is likely caused by a combination of these two factors, and also that it plays a role in both offline and online computation.
comment: Submitted to IEEE American Control Conference
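For reference, the speed-accuracy tradeoff that this abstract studies is usually stated through Fitts' original formulation, shown below. Here $MT$ is the movement time, $D$ the distance to the target, $W$ the target width, and $a$, $b$ empirically fitted constants. The abstract does not say which variant the authors use, so this form is an assumption.

```latex
MT = a + b \log_2\!\left(\frac{2D}{W}\right)
```

Under this form, halving the target width at a fixed distance adds $b$ to the predicted movement time.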
Human Balancing on a Log: A Switched Multi-Layer Controller
We study the task of balancing a human on a log that is fixed in place. Balancing on a log is substantially more challenging than balancing on a flat surface -- to achieve stability, we use a switched multi-layer controller. The controller consists of an upper-layer LQR planner (akin to the central nervous system) that coordinates ankle and hip torques, and lower-layer PID trackers (akin to local motor units) that follow this plan subject to nonlinear dynamics. Additionally, the controller switches between three operational modes depending on the current state of the human. The efficacy of the controller is verified in simulation, where our controller is able to stabilize the human for a variety of initial conditions. We also show that this controller is compatible with muscle-based actuation and imperfect sensing, making it a promising candidate for modeling motor control under challenging conditions in a more bio-realistic way.
comment: Submitted to IEEE American Control Conference
Dissipative Avoidance Feedback for Reactive Navigation Under Second-Order Dynamics
This paper introduces DAF (Dissipative Avoidance Feedback), a novel approach for autonomous robot navigation in unknown, obstacle-filled environments with second-order dynamics. Unlike traditional APF (Artificial Potential Field) methods, which rely on repulsive forces based solely on position, DAF employs a dissipative feedback mechanism that adjusts the robot's motion in response to both its position and velocity, ensuring smoother, more natural obstacle avoidance. The proposed continuously differentiable controller solves the motion-to-goal problem while guaranteeing collision-free navigation by considering the robot's state and local obstacle distance information. We show that the controller guarantees safe navigation in generic $n$-dimensional environments and that all undesired $\omega$-limit points are unstable under certain \textit{controlled} curvature conditions. Designed for real-time implementation, DAF requires only locally measured data from limited-range sensors (e.g., LiDAR, depth cameras), making it particularly effective for robots navigating unknown workspaces.
comment: 7 pages, 7 figures
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
Reach-Avoid-Stay (RAS) optimal control enables systems such as robots and air taxis to reach their targets, avoid obstacles, and stay near the target. However, current methods for RAS often struggle with handling complex, dynamic environments and scaling to high-dimensional systems. While reinforcement learning (RL)-based reachability analysis addresses these challenges, it has yet to tackle the RAS problem. In this paper, we propose a two-step deep deterministic policy gradient (DDPG) method that extends the RL-based reachability method to solve RAS problems. First, we train a function that characterizes the maximal robust control invariant set within the target set, where the system can safely stay, along with its corresponding policy. Second, we train a function that defines the set of states capable of safely reaching the robust control invariant set, along with its corresponding policy. We prove that this method results in the maximal robust RAS set in the absence of training errors and demonstrate that it enables RAS in complex environments, scales to high-dimensional systems, and achieves higher success rates for the RAS task compared to previous methods, validated through one simulation and two high-dimensional experiments.
Approximation Schemes for POMDPs with Continuous Spaces and Their Near Optimality
We study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Belief MDP reduction, which has been the standard approach to study POMDPs, requires rigorous approximation methods for practical applications, due to the state space being lifted to the space of probability measures. Generalizing recent work, in this paper we present rigorous approximation methods via discretizing the observation space and constructing a fully observed finite MDP model using a finite length history of the discrete observations and control actions. We show that the resulting policy is near-optimal under some regularity assumptions on the channel, and under certain controlled filter stability requirements for the hidden state process. Furthermore, by quantizing the measurements, we are able to utilize refined filter stability conditions. We also provide a Q-learning algorithm that uses a finite memory of discretized information variables, and prove its convergence to the optimality equation of the finite fully observed MDP constructed using the approximation method.
Gait Optimization for Legged Systems Through Mixed Distribution Cross-Entropy Optimization
Legged robotic systems can play an important role in real-world applications due to their superior load-bearing capabilities, enhanced autonomy, and effective navigation on uneven terrain. They offer an optimal trade-off between mobility and payload capacity, excelling in diverse environments while maintaining efficiency in transporting heavy loads. However, planning and optimizing gaits and gait sequences for these robots presents significant challenges due to the complexity of their dynamic motion and the numerous optimization variables involved. Traditional trajectory optimization methods address these challenges by formulating the problem as an optimization task, aiming to minimize cost functions, and to automatically discover contact sequences. Despite their structured approach, optimization-based methods face substantial difficulties, particularly because such formulations result in highly nonlinear and difficult to solve problems. To address these limitations, we propose CrEGOpt, a bi-level optimization method that combines traditional trajectory optimization with a black-box optimization scheme. CrEGOpt at the higher level employs the Mixed Distribution Cross-Entropy Method to optimize both the gait sequence and the phase durations, thus simplifying the lower level trajectory optimization problem. This approach allows for fast solutions of complex gait optimization problems. Extensive evaluation in simulated environments demonstrates that CrEGOpt can find solutions for biped, quadruped, and hexapod robots in under 10 seconds. This novel bi-level optimization scheme offers a promising direction for future research in automatic contact scheduling.
comment: 8 pages, 7 figures, Accepted at Humanoids 2024
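The higher-level search described in this abstract can be sketched as a cross-entropy method over a mixed discrete-continuous distribution. The sketch assumes a categorical distribution over a discrete choice (standing in for the gait sequence) and a Gaussian over one continuous variable (standing in for a phase duration). The toy cost replaces the lower-level trajectory optimization, and all names and constants are hypothetical rather than taken from CrEGOpt.

```python
import random

def mixed_cem(cost, n_choices, iters=30, pop=100, n_elite=10, seed=0):
    """Cross-entropy method over (categorical choice, Gaussian scalar)."""
    rng = random.Random(seed)
    probs = [1.0 / n_choices] * n_choices   # categorical parameters
    mean, std = 1.0, 0.5                    # Gaussian parameters
    for _ in range(iters):
        samples = []
        for _ in range(pop):
            k = rng.choices(range(n_choices), weights=probs)[0]
            d = rng.gauss(mean, std)
            samples.append((cost(k, d), k, d))
        elite = sorted(samples)[:n_elite]   # lowest-cost samples
        # Refit the categorical part from elite frequencies, with smoothing.
        freq = [sum(1 for e in elite if e[1] == c) / n_elite for c in range(n_choices)]
        probs = [0.7 * f + 0.3 * p for f, p in zip(freq, probs)]
        # Refit the Gaussian part from the elite durations.
        durations = [e[2] for e in elite]
        mean = sum(durations) / n_elite
        var = sum((d - mean) ** 2 for d in durations) / n_elite
        std = max(var ** 0.5, 1e-3)         # floor keeps sampling non-degenerate
    best = max(range(n_choices), key=lambda c: probs[c])
    return best, mean

# Toy stand-in for the lower-level solve: each discrete option has a
# preferred duration and a fixed penalty (hypothetical numbers).
TARGET = [0.3, 0.5, 0.8]
OFFSET = [1.0, 0.0, 2.0]

def toy_cost(k, d):
    return OFFSET[k] + (d - TARGET[k]) ** 2

best_gait, best_duration = mixed_cem(toy_cost, n_choices=3)
```

In a bi-level setup, `toy_cost` would be replaced by a call that runs the trajectory optimizer for the sampled gait and durations and returns its optimal cost.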
Absolute centrality in a signed Friedkin-Johnsen based model: a graphical characterisation of influence
This paper studies the evolution of opinions governed by a Friedkin-Johnsen (FJ) based model in arbitrary network structures with signed interactions. The agents contributing to the opinion formation are characterised as being influential. Initially, the agents are classified as opinion leaders and followers based on network connectivity and the nature of interactions. However, the addition of stubbornness leads to interesting behaviours wherein a non-influential agent can now become influential and vice versa. Thereafter, a signal flow graph (SFG) based method is proposed to quantify the influence of an influential agent's opinions. Additionally, it helps illustrate the role played by network topology in shaping the final opinions of the agents. Based on this analysis, the absolute centrality measure is proposed to determine the overall influence of all the agents in the network. Unlike most of the existing measures, it is applicable to any network structure and considers the effect of stubbornness and antagonism. Examples are presented throughout the paper to illustrate and validate these results.
comment: 13 pages
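As a rough illustration of the opinion dynamics in this abstract, the sketch below iterates the standard Friedkin-Johnsen update x(k+1) = Λ W x(k) + (I − Λ) x(0), with negative entries in W standing for antagonistic interactions. The exact signed variant used in the paper is not given in the abstract, so the update rule, the weights, and the susceptibility values are assumptions.

```python
def fj_step(x, x0, W, lam):
    """One FJ update: lam[i] weights neighbours, (1 - lam[i]) is stubbornness."""
    n = len(x)
    return [
        lam[i] * sum(W[i][j] * x[j] for j in range(n)) + (1.0 - lam[i]) * x0[i]
        for i in range(n)
    ]

def fj_steady_state(x0, W, lam, iters=500):
    """Iterate the update from the initial opinions x0."""
    x = list(x0)
    for _ in range(iters):
        x = fj_step(x, x0, W, lam)
    return x

# Hypothetical 3-agent signed network; absolute row sums equal 1.
W = [[0.0, 0.5, -0.5],
     [0.6, 0.0, 0.4],
     [-0.3, 0.7, 0.0]]
lam = [0.8, 0.5, 0.9]      # susceptibility to neighbours
x0 = [1.0, -1.0, 0.5]      # initial opinions
x_final = fj_steady_state(x0, W, lam)
```

With every susceptibility below 1 and absolute row sums of 1, the iteration is a contraction in the infinity norm, so it converges to a unique fixed point.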
A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization
Two-stage adaptive robust optimization (ARO) is a powerful approach for planning under uncertainty, balancing first-stage decisions with recourse decisions made after uncertainty is realized. To account for uncertainty, modelers typically define a simple uncertainty set over which potential outcomes are considered. However, classical methods for defining these sets unintentionally capture a wide range of unrealistic outcomes, resulting in overly-conservative and costly planning in anticipation of unlikely contingencies. In this work, we introduce AGRO, a solution algorithm that performs adversarial generation for two-stage adaptive robust optimization using a variational autoencoder. AGRO generates high-dimensional contingencies that are simultaneously adversarial and realistic, improving the robustness of first-stage decisions at a lower planning cost than standard methods. To ensure generated contingencies lie in high-density regions of the uncertainty distribution, AGRO defines a tight uncertainty set as the image of "latent" uncertainty sets under the VAE decoding transformation. Projected gradient ascent is then used to maximize recourse costs over the latent uncertainty sets by leveraging differentiable optimization methods. We demonstrate the cost-efficiency of AGRO by applying it to both a synthetic production-distribution problem and a real-world power system expansion setting. We show that AGRO outperforms the standard column-and-constraint algorithm by up to 1.8% in production-distribution planning and up to 11.6% in power system expansion.
Residual-based Attention Physics-informed Neural Networks for Spatio-Temporal Ageing Assessment of Transformers Operated in Renewable Power Plants
Transformers are crucial for reliable and efficient power system operations, particularly in supporting the integration of renewable energy. Effective monitoring of transformer health is critical to maintain grid stability and performance. Thermal insulation ageing is a key transformer failure mode, which is generally tracked by monitoring the hotspot temperature (HST). However, HST measurement is complex, costly, and often estimated from indirect measurements. Existing HST models focus on space-agnostic thermal models, providing worst-case HST estimates. This article introduces a spatio-temporal model for transformer winding temperature and ageing estimation, which leverages physics-based partial differential equations (PDEs) with data-driven Neural Networks (NN) in a Physics Informed Neural Networks (PINNs) configuration to improve prediction accuracy and acquire spatio-temporal resolution. The computational accuracy of the PINN model is improved through the implementation of the Residual-Based Attention (PINN-RBA) scheme that accelerates the PINN model convergence. The PINN-RBA model is benchmarked against self-adaptive attention schemes and classical vanilla PINN configurations. For the first time, PINN based oil temperature predictions are used to estimate spatio-temporal transformer winding temperature values, validated through PDE numerical solution and fiber optic sensor measurements. Furthermore, the spatio-temporal transformer ageing model is inferred, which supports transformer health management decision-making. Results are validated with a distribution transformer operating on a floating photovoltaic power plant.
comment: 23 pages, 18 figures
Solution of the Probabilistic Lambert Problem: Connections with Optimal Mass Transport, Schrödinger Bridge and Reaction-Diffusion PDEs
The Lambert problem originated in orbital mechanics. It concerns determining the initial velocity for a boundary value problem involving the dynamical constraint due to gravitational potential with additional time horizon and endpoint position constraints. Its solution has application in transferring a spacecraft from a given initial to a given terminal position within prescribed flight time via velocity control. We consider a probabilistic variant of the Lambert problem where the knowledge of the endpoint constraints in position vectors is replaced by the knowledge of their respective joint probability density functions. We show that the Lambert problem with endpoint joint probability density constraints is a generalized optimal mass transport (OMT) problem, thereby connecting this classical astrodynamics problem with a burgeoning area of research in modern stochastic control and stochastic machine learning. This newfound connection allows us to rigorously establish the existence and uniqueness of the solution for the probabilistic Lambert problem. The same connection also helps to numerically solve the probabilistic Lambert problem via diffusion regularization, i.e., by leveraging further connection of the OMT with the Schrödinger bridge problem (SBP). This also shows that the probabilistic Lambert problem with additive dynamic process noise is a generalized SBP, and can be solved numerically using the so-called Schrödinger factors, as we do in this work. Our analysis leads to solving a system of reaction-diffusion PDEs where the gravitational potential appears as the reaction rate.
Feedback Linearizable Discretizations of Second Order Mechanical Systems using Retraction Maps
Mechanical systems are most often described by a set of continuous-time, nonlinear, second-order differential equations (SODEs) of a particular structure governed by the covariant derivative. The digital implementation of controllers for such systems requires a discrete model of the system and hence requires numerical discretization schemes. Feedback linearizability of such sampled systems, however, depends on the discretization scheme employed. In this article, we utilize retraction maps and their lifts to construct feedback linearizable discretizations for SODEs which can be applied to many mechanical systems.
Identification For Control Based on Neural Networks: Approximately Linearizable Models
This work presents a control-oriented identification scheme for efficient control design and stability analysis of nonlinear systems. Neural networks are used to identify a discrete-time nonlinear state-space model to approximate time-domain input-output behavior of a nonlinear system. The network is constructed such that the identified model is approximately linearizable by feedback, ensuring that the control law trivially follows from the learning stage. After the identification and quasi-linearization procedures, linear control theory can be used to design robust controllers and study stability of the closed-loop system. The effectiveness and interest of the methodology are illustrated throughout the paper on popular benchmarks for system identification.
comment: 15 pages, 3 figures, 6 tables, accepted as a poster in SysDO 2024, Stuttgart, Germany
An Artificial Neural Network based approach for Harmonic Component Prediction in a Distribution Line
With the increasing use of nonlinear devices in both generation and consumption of power, it is essential that we develop accurate and quick control for active filters to suppress harmonics. Time delays between input and output are catastrophic for such filters which rely on real-time operation. Artificial Neural Networks (ANNs) are capable of modeling complex nonlinear systems through adjustments in their learned parameters. Once properly trained, they can produce highly accurate predictions at an instantaneous time frame. Leveraging these qualities, various complex control systems may be replaced or aided by neural networks to provide quick and precise responses. This paper proposes an ANN-based approach for the prediction of individual harmonic components using minimal inputs. By extracting and analyzing the nature of harmonic component magnitudes obtained from the survey of a particular area through real-time measurements, a sequential pattern in their occurrence is observed. Various neural network architectures are trained using the collected data and their performances are evaluated. The best-performing model, whose losses are minimal, is then used to observe the harmonic cancellation for multiple unseen cases through a simplified simulation in hardware-in-the-loop. These neural network structures, which produce instantaneous and accurate outputs, are effective in harmonic filtering.
Closed-Loop Sensitivity Identification for Cross-Directional Systems
At Diamond Light Source, the UK's national synchrotron facility, electron beam disturbances are attenuated by the fast orbit feedback (FOFB), which controls a cross-directional (CD) system with hundreds of inputs and outputs. Due to the inability to measure the disturbances in real-time, the closed-loop sensitivity of the FOFB can only be evaluated indirectly, making it difficult to compare FOFB algorithms and detect faults. Existing methods rely on comparing open-loop with closed-loop measurements, but they are prone to instabilities and actuator saturation because of the system's strong directionality. Here, we introduce a reference signal to estimate the complementary sensitivity in closed loop. By decoupling the system into sets of single-input, single-output (SISO) systems, the reference signal is designed mode-by-mode, accommodating the system's strong directionality. Additionally, a lower bound on the reference amplitude is derived to limit the estimation error in the presence of disturbances and measurement noise. This method enables the use of SISO system identification techniques, making it suitable for large-scale systems. It not only facilitates performance estimation of ill-conditioned CD systems in closed-loop but also provides a signal for fault detection. The potential applications of this approach extend to other CD systems, such as papermaking, steel rolling, or battery manufacturing processes.
Stable Reduced-Rank VAR Identification
The vector autoregression (VAR) has been widely used in system identification, econometrics, natural science, and many other areas. However, when the state dimension becomes large, the parameter dimension explodes, so rank-reduced modelling is attractive and is well developed. A fundamental requirement in almost all applications is stability of the fitted model, yet this has not been addressed in the rank-reduced case. Here, we develop, for the first time, a closed-form formula for an estimator of a rank-reduced transition matrix which is guaranteed to be stable. We show that our estimator is consistent and asymptotically statistically efficient and illustrate it in comparative simulations.
comment: 17 pages, 6 figures
Data-driven distributionally robust MPC for systems with multiplicative noise: A semi-infinite semi-definite programming approach
This article introduces a novel distributionally robust model predictive control (DRMPC) algorithm for a specific class of controlled dynamical systems where the disturbance multiplies the state and control variables. These classes of systems arise in mathematical finance, where the paradigm of distributionally robust optimization (DRO) fits perfectly, and this serves as the primary motivation for this work. We recast the optimal control problem (OCP) as a semi-definite program with an infinite number of constraints, making the ensuing optimization problem a \emph{semi-infinite semi-definite program} (SI-SDP). To numerically solve the SI-SDP, we advance an approach for solving convex semi-infinite programs (SIPs) to SI-SDPs and, subsequently, solve the DRMPC problem. A numerical example is provided to show the effectiveness of the algorithm.
comment: To appear in the proceedings of Mathematical Theory of Networks and Systems (MTNS) 2024
Learning Chaotic Dynamics with Embedded Dissipativity
Chaotic dynamics, commonly seen in weather systems and fluid turbulence, are characterized by their sensitivity to initial conditions, which makes accurate prediction challenging. Despite their sensitivity to initial perturbations, many chaotic systems exhibit dissipative behaviors and ergodicity. Therefore, various approaches have recently been proposed to develop data-driven models preserving invariant statistics over long horizons. Although these methods have shown empirical success in reducing instances of unbounded trajectory generation, many of the models are still prone to generating unbounded trajectories, leading to invalid statistics evaluation. In this paper, we propose a novel neural network architecture that simultaneously learns a dissipative dynamics emulator that is guaranteed to generate bounded trajectories and an energy-like function that governs the dissipative behavior. More specifically, by leveraging control-theoretic ideas, we derive algebraic conditions based on the learned energy-like function that ensure asymptotic convergence to an invariant level set. Using these algebraic conditions, our proposed model enforces dissipativity through a ReLU projection layer, which provides formal trajectory boundedness guarantees. Furthermore, the invariant level set provides an outer estimate for the strange attractor, which is known to be very difficult to characterize due to its complex geometry. We demonstrate the capability of our model in producing bounded long-horizon trajectory forecasts and characterizing the attractor for chaotic dynamical systems including Lorenz 96 and a truncated Kuramoto-Sivashinsky equation.
Understanding the Impact of Coalitions between EV Charging Stations
The rapid growth of electric vehicles (EVs) is driving the expansion of charging infrastructure globally. As charging stations become ubiquitous, their substantial electricity consumption can influence grid operation and electricity pricing. Naturally, \textit{some} groups of charging stations, which could be jointly operated by a company, may coordinate to decide their charging profile. While coordination among all charging stations is ideal, it is unclear if coordination of some charging stations is better than no coordination. In this paper, we analyze this intermediate regime between no and full coordination of charging stations. We model EV charging as a non-cooperative aggregative game, where each station's cost is determined by both monetary payments tied to reactive electricity prices on the grid and its sensitivity to deviations from a desired charging profile. We consider a solution concept that we call $\mathcal{C}$-Nash equilibrium, which is tied to a coalition $\mathcal{C}$ of charging stations coordinating to reduce their costs. We provide sufficient conditions, in terms of the demand and sensitivity of charging stations, to determine when independent (aka uncoordinated) operation of charging stations could result in lower overall costs to charging stations, coalition and charging stations outside the coalition. Somewhat counter to common intuition, we show numerical instances where allowing charging stations to operate independently is better than coordinating a subset of stations as a coalition. Jointly, these results provide operators of charging stations insights into how to coordinate their charging behavior, and open several research directions.
comment: 20 pages, 5 figures
BVE + EKF: A viewpoint estimator for the estimation of the object's position in the 3D task space using Extended Kalman Filters
RGB-D sensors face multiple challenges operating in open-field environments because of their sensitivity to external perturbations such as radiation or rain. Multiple works are approaching the challenge of perceiving the 3D position of objects using monocular cameras. However, most of these works focus mainly on deep learning-based solutions, which are complex, data-driven, and difficult to predict. We therefore approach the problem of predicting objects' 3D position using a Gaussian viewpoint estimator named best viewpoint estimator (BVE) powered by an extended Kalman filter (EKF). The algorithm proved efficient on the tasks and reached a maximum average Euclidean error of about 32 mm. The experiments were deployed and evaluated in MATLAB using artificial Gaussian noise. Future work aims to implement the system on a robot.
comment: Accepted to ICINCO - 21st International Conference on Informatics in Control, Automation and Robotics
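To make the extended Kalman filter step in this abstract concrete, here is a minimal scalar sketch. It is not the BVE model. It assumes a static target at unknown depth `x` and a hypothetical nonlinear measurement `z = c / x` (for example, an apparent size that shrinks with distance). The noise levels `Q` and `R` are illustrative values, not taken from the paper.

```python
def ekf_step(x, P, z, c=1.0, Q=1e-3, R=1e-4):
    """One predict/update cycle of a scalar EKF with measurement h(x) = c / x."""
    # Predict: the target is assumed static, so only the uncertainty grows.
    x_pred = x
    P_pred = P + Q
    # Linearize the measurement model around the prediction.
    H = -c / x_pred ** 2
    # Update: innovation covariance, Kalman gain, state and covariance.
    S = H * P_pred * H + R
    K = P_pred * H / S
    x_new = x_pred + K * (z - c / x_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

true_depth = 2.0           # hypothetical ground truth
x_est, P_est = 1.0, 1.0    # deliberately poor initial guess
for _ in range(50):
    z = 1.0 / true_depth   # noise-free measurement, for clarity
    x_est, P_est = ekf_step(x_est, P_est, z)
```

A full 3D estimator would use vector states and matrix Jacobians, but the predict, linearize, and update structure is the same.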
ERIC: Estimating Rainfall with Commodity Doorbell Camera for Precision Residential Irrigation
Current state-of-the-art residential irrigation systems, such as WaterMyYard, rely on rainfall data from nearby weather stations to adjust irrigation amounts. However, the accuracy of rainfall data is compromised by the limited spatial resolution of rain gauges and the significant variability of hyperlocal rainfall, leading to substantial water waste. To improve irrigation efficiency, we developed a cost-effective irrigation system, dubbed ERIC, which employs machine learning models to estimate rainfall from commodity doorbell camera footage and optimizes irrigation schedules without human intervention. Specifically, we: a) designed novel visual and audio features with lightweight neural network models to infer rainfall from the camera at the edge, preserving user privacy; b) built a complete end-to-end irrigation system on Raspberry Pi 4, costing only \$75. We deployed the system across five locations (collecting over 750 hours of video) with varying backgrounds and light conditions. Comprehensive evaluation validates that ERIC achieves state-of-the-art rainfall estimation performance ($\sim$ 5mm/day), saving 9,112 gallons/month of water, translating to \$28.56/month in utility savings. Data and code are available at https://github.com/LENSS/ERIC-BuildSys2024.git
comment: BuildSys 2024
Second-Order Algorithms for Finding Local Nash Equilibria in Zero-Sum Games
Zero-sum games arise in a wide variety of problems, including robust optimization and adversarial learning. However, algorithms deployed for finding a local Nash equilibrium in these games often converge to non-Nash stationary points. This highlights a key challenge: for any algorithm, the stability properties of its underlying dynamical system can cause non-Nash points to be potential attractors. To overcome this challenge, algorithms must account for subtleties involving the curvatures of players' costs. To this end, we leverage dynamical system theory and develop a second-order algorithm for finding a local Nash equilibrium in the smooth, possibly nonconvex-nonconcave, zero-sum game setting. First, we prove that this novel method guarantees convergence to only local Nash equilibria with a local linear convergence rate. We then interpret a version of this method as a modified Gauss-Newton algorithm with local superlinear convergence to the neighborhood of a point that satisfies first-order local Nash equilibrium conditions. In comparison, current related state-of-the-art methods do not offer convergence rate guarantees. Furthermore, we show that this approach naturally generalizes to settings with convex and potentially coupled constraints while retaining earlier guarantees of convergence to only local (generalized) Nash equilibria.
Hybrid Feedback for Three-dimensional Convex Obstacle Avoidance (Extended version)
We propose a hybrid feedback control scheme for the autonomous robot navigation problem in three-dimensional environments with arbitrarily-shaped convex obstacles. The proposed hybrid control strategy, which consists in switching between the move-to-target mode and the obstacle-avoidance mode, guarantees global asymptotic stability of the target location in the obstacle-free workspace. We also provide a procedure for the implementation of the proposed hybrid controller in a priori unknown environments and validate its effectiveness through simulation results.
comment: 13 pages, 6 figures
Synthesis of General Decoupling Networks Using Transmission Lines
In this paper, we introduce a synthesis technique for transmission line based decoupling networks, which find application in coupled systems such as multiple-antenna systems and compact antenna arrays. Employing the generalized $\pi$-network and the transmission line analysis technique, we reduce the decoupling network design to simple matrix calculations. The synthesized decoupling network is essentially a generalized $\pi$-network with transmission lines at all branches. Standard electrical lengths of $3\lambda/8$ and $5\lambda/8$ are chosen to simplify the physical implementation, leaving the characteristic impedances of the transmission line branches as the main design parameters. The advantage of this proposed decoupling network is that it can be implemented using transmission lines, ensuring better control on loss, performance consistency and higher power handling capability when compared with lumped components, and can be easily scaled for operation at different frequencies. A two-port microstrip antenna system at 1.2 GHz and a three-port monopole antenna system at 1 GHz are investigated to demonstrate the validity of the proposed synthesis method, and perfect decoupling ($S_{21}<-50$ dB) is achieved at both design frequencies.
comment: 5 pages
Robotics
Grounding Large Language Models In Embodied Environment With Imperfect World Models
Despite widespread success in various applications, large language models (LLMs) often stumble when tackling basic physical reasoning or executing robotics tasks, due to a lack of direct experience with the physical nuances of the real world. To address these issues, we propose a Grounding Large language model with Imperfect world MOdel (GLIMO), which utilizes proxy world models such as simulators to collect and synthesize training data. GLIMO incorporates an LLM agent-based data generator to automatically create high-quality and diverse instruction datasets. The generator includes an iterative self-refining module for temporally consistent experience sampling, a diverse set of question-answering instruction seeds, and a retrieval-augmented generation module for reflecting on prior experiences. Comprehensive experiments show that our approach improves the performance of strong open-source LLMs like LLaMA-3 with a performance boost of 2.04 $\times$, 1.54 $\times$, and 1.82 $\times$ across three different benchmarks, respectively. The performance is able to compete with or surpass their larger counterparts such as GPT-4.
Custom Non-Linear Model Predictive Control for Obstacle Avoidance in Indoor and Outdoor Environments
Navigating complex environments requires Unmanned Aerial Vehicles (UAVs) and autonomous systems to perform trajectory tracking and obstacle avoidance in real-time. While many control strategies have effectively utilized linear approximations, addressing the non-linear dynamics of UAVs, especially in obstacle-dense environments, remains a key challenge that requires further research. This paper introduces a Non-linear Model Predictive Control (NMPC) framework for the DJI Matrice 100, addressing these challenges by using a dynamic model and B-spline interpolation for smooth reference trajectories, ensuring minimal deviation while respecting safety constraints. The framework supports various trajectory types and employs a penalty-based cost function for control accuracy in tight maneuvers. The framework utilizes CasADi for efficient real-time optimization, enabling the UAV to maintain robust operation even under tight computational constraints. Simulation and real-world indoor and outdoor experiments demonstrated the NMPC's ability to adapt to disturbances, resulting in smooth, collision-free navigation.
comment: This manuscript has 7 pages and 8 figures, detailing NMPC for UAV obstacle avoidance using DJI UAVs. It features simulations, experimental results, and uses CasADi for optimization with ROS integration. Code and media at https://github.com/larasupernovae/nmpc_flash_multi_obstacle
DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects
Object navigation in unknown environments is crucial for deploying embodied agents in real-world applications. While we have witnessed huge progress due to large-scale scene datasets, faster simulators, and stronger models, previous studies mainly focus on limited scene types and target objects. In this paper, we study a new task of navigating to diverse target objects in a large number of scene types. To benchmark the problem, we present a large-scale scene dataset, DivScene, which contains 4,614 scenes across 81 different types. With the dataset, we build an end-to-end embodied agent, NatVLM, by fine-tuning a Large Vision Language Model (LVLM) through imitation learning. The LVLM is trained to take previous observations from the environment and generate the next actions. We also introduce CoT explanation traces of the action prediction for better performance when tuning LVLMs. Our extensive experiments find that we can build a performant LVLM-based agent through imitation learning on the shortest paths constructed by a BFS planner without any human supervision. Our agent achieves a success rate that surpasses GPT-4o by over 20%. Meanwhile, we carry out various analyses showing the generalization ability of our agent.
comment: Work in Progress
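The DivScene abstract above mentions imitation learning on shortest paths constructed by a BFS planner. A minimal sketch of breadth-first search on an occupancy grid follows; the grid encoding (0 = free, 1 = blocked), the 4-connected moves, and the function name `bfs_shortest_path` are illustrative assumptions, not the benchmark's actual planner.

```python
from collections import deque

def bfs_shortest_path(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# Hypothetical 3x3 scene with a wall across most of the middle row.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = bfs_shortest_path(grid, (0, 0), (2, 0))
```

Because BFS expands cells in order of distance from the start, the first time the goal is dequeued the recovered path is a shortest one in number of steps.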
Why Sample Space Matters: Keyframe Sampling Optimization for LiDAR-based Place Recognition
Recent advances in robotics are pushing real-world autonomy, enabling robots to perform long-term and large-scale missions. A crucial component for successful missions is the incorporation of loop closures through place recognition, which effectively mitigates accumulated pose estimation drift. Despite computational advancements, optimizing performance for real-time deployment remains challenging, especially in resource-constrained mobile robots and multi-robot systems, since conventional keyframe sampling practices in place recognition often result in retaining redundant information or overlooking relevant data, as they rely on fixed sampling intervals or work directly in the 3D space instead of the feature space. To address these concerns, we introduce the concept of sample space in place recognition and demonstrate how different sampling techniques affect the query process and overall performance. We then present a novel keyframe sampling approach for LiDAR-based place recognition, which focuses on redundancy minimization and information preservation in the hyper-dimensional descriptor space. This approach is applicable to both learning-based and handcrafted descriptors, and through experimental validation across multiple datasets and descriptor frameworks, we demonstrate the effectiveness of our proposed method, showing it can jointly minimize redundancy and preserve essential information in real-time. The proposed approach maintains robust performance across various datasets without requiring parameter tuning, contributing to more efficient and reliable place recognition for a wide range of robotic applications.
comment: 20 pages, 15 figures. Submitted
Extremum Seeking Controlled Wiggling for Tactile Insertion
When humans perform insertion tasks such as inserting a cup into a cupboard, routing a cable, or key insertion, they wiggle the object and observe the process through tactile and proprioceptive feedback. While recent advances in tactile sensors have resulted in tactile-based approaches, there has not been a generalized formulation based on wiggling similar to human behavior. Thus, we propose an extremum-seeking control law that can insert four keys into four types of locks without control parameter tuning despite significant variation in lock type. The resulting model-free formulation wiggles the end effector pose to maximize insertion depth while minimizing strain as measured by a GelSight Mini tactile sensor that grasps a key. The algorithm achieves a 71\% success rate over 120 randomly initialized trials with uncertainty in both translation and orientation. Over 240 deterministically initialized trials, where only one translation or rotation parameter is perturbed, 84\% of trials succeeded. Given tactile feedback at 13 Hz, the mean insertion times for these groups of trials are 262 and 147 seconds, respectively.
comment: 7 pages, 5 figures, 3 tables
SwarmCVT: Centroidal Voronoi Tessellation-Based Path Planning for Very-Large-Scale Robotics
Swarm robotics, or very large-scale robotics (VLSR), has many meaningful applications for complicated tasks. However, the complexity of motion control and energy costs stack up quickly as the number of robots increases. In addressing this problem, our previous studies have formulated various methods employing macroscopic and microscopic approaches. These methods enable microscopic robots to adhere to a reference Gaussian mixture model (GMM) distribution observed at the macroscopic scale. As a result, optimizing the macroscopic level will result in an optimal overall result. However, all these methods require systematic and global generation of Gaussian components (GCs) within obstacle-free areas to construct the GMM trajectories. This work utilizes centroidal Voronoi tessellation to generate GCs methodically. Consequently, it demonstrates performance improvement while also ensuring consistency and reliability.
comment: Submitted to American Control Conference (ACC) 2025
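The SwarmCVT abstract above relies on centroidal Voronoi tessellation to place Gaussian components in obstacle-free areas. One standard way to approximate a CVT is Lloyd's algorithm on a sampled domain, sketched below. The unit-square workspace, the circular obstacle, the number of generators, and the function name `lloyd_cvt` are all assumptions for illustration; the abstract does not state which CVT procedure the authors use.

```python
import numpy as np

def lloyd_cvt(generators, samples, iterations=50):
    """Approximate a CVT by alternating nearest-generator assignment and centroid moves."""
    gens = np.array(generators, dtype=float)
    for _ in range(iterations):
        # Squared distance from every sample to every generator: shape (N, K).
        d2 = ((samples[:, None, :] - gens[None, :, :]) ** 2).sum(axis=2)
        owner = d2.argmin(axis=1)   # index of the Voronoi cell each sample falls in
        for k in range(len(gens)):
            cell = samples[owner == k]
            if len(cell) > 0:       # leave a generator in place if its cell is empty
                gens[k] = cell.mean(axis=0)
    return gens

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(5000, 2))
# Keep only obstacle-free samples (hypothetical circular obstacle at the center).
free = samples[np.linalg.norm(samples - 0.5, axis=1) > 0.2]
centers = lloyd_cvt(rng.uniform(0.0, 1.0, size=(8, 2)), free)
```

In a GMM setting, each returned center could serve as a component mean, with a covariance estimated from the samples in its cell. Note that in this toy version a centroid of a non-convex cell may still fall inside the obstacle, so a real planner would need an extra feasibility check.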
Cross-Embodiment Dexterous Grasping with Reinforcement Learning
Dexterous hands exhibit significant potential for complex real-world grasping tasks. While recent studies have primarily focused on learning policies for specific robotic hands, the development of a universal policy that controls diverse dexterous hands remains largely unexplored. In this work, we study the learning of cross-embodiment dexterous grasping policies using reinforcement learning (RL). Inspired by the capability of human hands to control various dexterous hands through teleoperation, we propose a universal action space based on the human hand's eigengrasps. The policy outputs eigengrasp actions that are then converted into specific joint actions for each robot hand through a retargeting mapping. We simplify the robot hand's proprioception to include only the positions of fingertips and the palm, offering a unified observation space across different robot hands. Our approach demonstrates an 80% success rate in grasping objects from the YCB dataset across four distinct embodiments using a single vision-based policy. Additionally, our policy exhibits zero-shot generalization to two previously unseen embodiments and significant improvement in efficient finetuning. For further details and videos, visit our project page https://sites.google.com/view/crossdex.
Learning Diverse Bimanual Dexterous Manipulation Skills from Human Demonstrations
Bimanual dexterous manipulation is a critical yet underexplored area in robotics. Its high-dimensional action space and inherent task complexity present significant challenges for policy learning, and the limited task diversity in existing benchmarks hinders general-purpose skill development. Existing approaches largely depend on reinforcement learning, often constrained by intricately designed reward functions tailored to a narrow set of tasks. In this work, we present a novel approach for efficiently learning diverse bimanual dexterous skills from abundant human demonstrations. Specifically, we introduce BiDexHD, a framework that unifies task construction from existing bimanual datasets and employs teacher-student policy learning to address all tasks. The teacher learns state-based policies using a general two-stage reward function across tasks with shared behaviors, while the student distills the learned multi-task policies into a vision-based policy. With BiDexHD, scalable learning of numerous bimanual dexterous skills from auto-constructed tasks becomes feasible, offering promising advances toward universal bimanual dexterous manipulation. Our empirical evaluation on the TACO dataset, spanning 141 tasks across six categories, demonstrates a task fulfillment rate of 74.59% on trained tasks and 51.07% on unseen tasks, showcasing the effectiveness and competitive zero-shot generalization capabilities of BiDexHD. For videos and more information, visit our project page https://sites.google.com/view/bidexhd.
Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping
Universal dexterous grasping across diverse objects presents a fundamental yet formidable challenge in robot learning. Existing approaches using reinforcement learning (RL) to develop policies on extensive object datasets face critical limitations, including complex curriculum design for multi-task learning and limited generalization to unseen objects. To overcome these challenges, we introduce ResDex, a novel approach that integrates residual policy learning with a mixture-of-experts (MoE) framework. ResDex is distinguished by its use of geometry-unaware base policies that are efficiently acquired on individual objects and capable of generalizing across a wide range of unseen objects. Our MoE framework incorporates several base policies to facilitate diverse grasping styles suitable for various objects. By learning residual actions alongside weights that combine these base policies, ResDex enables efficient multi-task RL for universal dexterous grasping. ResDex achieves state-of-the-art performance on the DexGraspNet dataset comprising 3,200 objects with an 88.8% success rate. It exhibits no generalization gap with unseen objects and demonstrates superior training efficiency, mastering all tasks within only 12 hours on a single GPU.
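The combination step described in the ResDex abstract above, residual actions learned alongside weights over base policies, can be sketched as a weighted sum plus a correction term. The softmax normalization of the weights and the function name `combine_actions` are assumptions; the abstract only says that learned weights combine the base policies.

```python
import numpy as np

def combine_actions(base_actions, weight_logits, residual):
    """Mix K base-policy actions with normalized weights, then add a residual action."""
    base_actions = np.asarray(base_actions, dtype=float)   # shape (K, action_dim)
    logits = np.asarray(weight_logits, dtype=float)        # shape (K,)
    weights = np.exp(logits - logits.max())                # numerically stable softmax
    weights /= weights.sum()
    return weights @ base_actions + np.asarray(residual, dtype=float)

# Hypothetical example: two base policies, equal weights, small residual correction.
action = combine_actions(
    base_actions=[[1.0, 0.0], [0.0, 1.0]],
    weight_logits=[0.0, 0.0],
    residual=[0.1, -0.1],
)
```

Here the equal logits average the two base actions to [0.5, 0.5], and the residual shifts the result to [0.6, 0.4].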
Behavior Trees in Functional Safety Supervisors for Autonomous Vehicles
The rapid advancements in autonomous vehicle software present both opportunities and challenges, especially in enhancing road safety. The primary objective of autonomous vehicles is to reduce accident rates through improved safety measures. However, the integration of new algorithms into the autonomous vehicle, such as Artificial Intelligence methods, raises concerns about the compliance with established safety regulations. This paper introduces a novel software architecture based on behavior trees, aligned with established standards and designed to supervise vehicle functional safety in real time. It specifically addresses the integration of algorithms into industrial road vehicles, adhering to the ISO 26262. The proposed supervision methodology involves the detection of hazards and compliance with functional and technical safety requirements when a hazard arises. This methodology, implemented in this study in a Renault M\'egane (currently at SAE level 3 of automation), not only guarantees compliance with safety standards, but also paves the way for safer and more reliable autonomous driving technologies.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks
Safe and successful deployment of robots requires not only the ability to generate complex plans but also the capacity to frequently replan and correct execution errors. This paper addresses the challenge of long-horizon trajectory planning under temporally extended objectives in a receding horizon manner. To this end, we propose DOPPLER, a data-driven hierarchical framework that generates and updates plans based on instructions specified by linear temporal logic (LTL). Our method decomposes temporal tasks into a chain of options with hierarchical reinforcement learning from offline non-expert datasets. It leverages diffusion models to generate options with low-level actions. We devise a determinantal-guided posterior sampling technique during batch generation, which improves the speed and diversity of diffusion generated options, leading to more efficient querying. Experiments on robot navigation and manipulation tasks demonstrate that DOPPLER can generate sequences of trajectories that progressively satisfy the specified formulae for obstacle avoidance and sequential visitation. Demonstration videos are available online at: https://philiptheother.github.io/doppler/.
Coastal Underwater Evidence Search System with Surface-Underwater Collaboration
The coastal underwater evidence search system with surface-underwater collaboration is designed to revolutionize the search for artificial objects in coastal underwater environments, overcoming limitations associated with traditional methods such as divers and tethered remotely operated vehicles. Our innovative multi-robot collaborative system consists of three parts: an autonomous surface vehicle as a mission control center, a towed underwater vehicle for wide-area search, and a biomimetic underwater robot inspired by marine organisms for detailed inspections of identified areas. We conduct extensive simulations and real-world experiments in pond environments and coastal fields to demonstrate the system's potential to surpass the limitations of conventional underwater search methods, offering a robust and efficient solution for law enforcement and recovery operations in marine settings.
comment: This paper has been accepted by the 18th International Conference on Control, Automation, Robotics and Vision (ICARCV)
Data Optimisation of Machine Learning Models for Smart Irrigation in Urban Parks
Urban environments face significant challenges due to climate change, including extreme heat, drought, and water scarcity, which impact public health, community well-being, and local economies. Effective management of these issues is crucial, particularly in areas like Sydney Olympic Park, which relies on one of Australia's largest irrigation systems. The Smart Irrigation Management for Parks and Cool Towns (SIMPaCT) project, initiated in 2021, leverages advanced technologies and machine learning models to optimize irrigation and induce physical cooling. This paper introduces two novel methods to enhance the efficiency of the SIMPaCT system's extensive sensor network and applied machine learning models. The first method employs clustering of sensor time series data using K-shape and K-means algorithms to estimate readings from missing sensors, ensuring continuous and reliable data. This approach can detect anomalies, correct data sources, and identify and remove redundant sensors to reduce maintenance costs. The second method involves sequential data collection from different sensor locations using robotic systems, significantly reducing the need for high numbers of stationary sensors. Together, these methods aim to maintain accurate soil moisture predictions while optimizing sensor deployment and reducing maintenance costs, thereby enhancing the efficiency and effectiveness of the smart irrigation system. Our evaluations demonstrate significant improvements in the efficiency and cost-effectiveness of soil moisture monitoring networks. The cluster-based replacement of missing sensors provides up to 5.4% decrease in average error. The sequential sensor data collection as a robotic emulation shows 17.2% and 2.1% decrease in average error for circular and linear paths respectively.
QDGset: A Large Scale Grasping Dataset Generated with Quality-Diversity
Recent advances in AI have led to significant results in robotic learning, but skills like grasping remain partially solved. Many recent works exploit synthetic grasping datasets to learn to grasp unknown objects. However, those datasets were generated using simple grasp sampling methods using priors. Recently, Quality-Diversity (QD) algorithms have been proven to make grasp sampling significantly more efficient. In this work, we extend QDG-6DoF, a QD framework for generating object-centric grasps, to scale up the production of synthetic grasping datasets. We propose a data augmentation method that combines the transformation of object meshes with transfer learning from previous grasping repertoires. The conducted experiments show that this approach reduces the number of required evaluations per discovered robust grasp by up to 20%. We used this approach to generate QDGset, a dataset of 6DoF grasp poses that contains about 3.5 and 4.5 times more grasps and objects, respectively, than the previous state-of-the-art. Our method allows anyone to easily generate data, eventually contributing to a large-scale collaborative dataset of synthetic grasps.
comment: 8 pages, 9 figures. Draft version
Semantic Communication and Control Co-Design for Multi-Objective Correlated Dynamics
This letter introduces a machine-learning approach to learning the semantic dynamics of correlated systems with different control rules and dynamics. By leveraging the Koopman operator in an autoencoder (AE) framework, the system's state evolution is linearized in the latent space using a dynamic semantic Koopman (DSK) model, capturing the baseline semantic dynamics. Signal temporal logic (STL) is incorporated through a logical semantic Koopman (LSK) model to encode system-specific control rules. These models form the proposed logical Koopman AE framework that reduces communication costs while improving state prediction accuracy and control performance, showing a 91.65% reduction in communication samples and significant performance gains in simulation.
End-to-end Driving in High-Interaction Traffic Scenarios with Reinforcement Learning
Dynamic and interactive traffic scenarios pose significant challenges for autonomous driving systems. Reinforcement learning (RL) offers a promising approach by enabling the exploration of driving policies beyond the constraints of pre-collected datasets and predefined conditions, particularly in complex environments. However, a critical challenge lies in effectively extracting spatial and temporal features from sequences of high-dimensional, multi-modal observations while minimizing the accumulation of errors over time. Additionally, efficiently guiding large-scale RL models to converge on optimal driving policies without frequent failures during the training process remains tricky. We propose an end-to-end model-based RL algorithm named Ramble to address these issues. Ramble processes multi-view RGB images and LiDAR point clouds into low-dimensional latent features to capture the context of traffic scenarios at each time step. A transformer-based architecture is then employed to model temporal dependencies and predict future states. By learning a dynamics model of the environment, Ramble can foresee upcoming traffic events and make more informed, strategic decisions. Our implementation demonstrates that prior experience in feature extraction and decision-making plays a pivotal role in accelerating the convergence of RL models toward optimal driving policies. Ramble achieves state-of-the-art performance regarding route completion rate and driving score on the CARLA Leaderboard 2.0, showcasing its effectiveness in managing complex and dynamic traffic situations.
comment: 10 pages, 3 figures, experiment under progress, only to demonstrate the originality of the method
Capturing complex hand movements and object interactions using machine learning-powered stretchable smart textile gloves
Accurate real-time tracking of dexterous hand movements and interactions has numerous applications in human-computer interaction, metaverse, robotics, and tele-health. Capturing realistic hand movements is challenging because of the large number of articulations and degrees of freedom. Here, we report accurate and dynamic tracking of articulated hand and finger movements using stretchable, washable smart gloves with embedded helical sensor yarns and inertial measurement units. The sensor yarns have a high dynamic range, responding to low 0.005 % to high 155 % strains, and show stability during extensive use and washing cycles. We use multi-stage machine learning to report average joint angle estimation root mean square errors of 1.21 and 1.45 degrees for intra- and inter-subjects cross-validation, respectively, matching accuracy of costly motion capture cameras without occlusion or field of view limitations. We report a data augmentation technique that enhances robustness to noise and variations of sensors. We demonstrate accurate tracking of dexterous hand movements during object interactions, opening new avenues of applications including accurate typing on a mock paper keyboard, recognition of complex dynamic and static gestures adapted from American Sign Language and object identification.
Guiding Long-Horizon Task and Motion Planning with Vision Language Models
Vision-Language Models (VLMs) can generate plausible high-level plans when prompted with a goal, the context, an image of the scene, and any planning constraints. However, there is no guarantee that the predicted actions are geometrically and kinematically feasible for a particular robot embodiment. As a result, many prerequisite steps such as opening drawers to access objects are often omitted in their plans. Robot task and motion planners can generate motion trajectories that respect the geometric feasibility of actions and insert physically necessary actions, but do not scale to everyday problems that require common-sense knowledge and involve large state spaces comprised of many variables. We propose VLM-TAMP, a hierarchical planning algorithm that leverages a VLM to generate both semantically-meaningful and horizon-reducing intermediate subgoals that guide a task and motion planner. When a subgoal or action cannot be refined, the VLM is queried again for replanning. We evaluate VLM-TAMP on kitchen tasks where a robot must accomplish cooking goals that require performing 30-50 actions in sequence and interacting with up to 21 objects. VLM-TAMP substantially outperforms baselines that rigidly and independently execute VLM-generated action sequences, both in terms of success rates (50 to 100% versus 0%) and average task completion percentage (72 to 100% versus 15 to 45%). See project site https://zt-yang.github.io/vlm-tamp-robot/ for more information.
Reducing Warning Errors in Driver Support with Personalized Risk Maps
We consider the problem of human-focused driver support. State-of-the-art personalization concepts allow parameters for vehicle control systems or driver models to be estimated. However, few approaches have been proposed that use personalized models and evaluate their effectiveness in the form of general risk warning. In this paper, we therefore propose a warning system that estimates a personalized risk factor for the given driver based on the driver's behavior. The system is then able to adapt the warning signal with personalized Risk Maps. In experiments, we show examples for longitudinal following and intersection scenarios in which the novel warning system can effectively reduce false negative errors and false positive errors compared to a baseline approach which does not use personalized driver considerations. This underlines the potential of personalization for reducing warning errors in risk warning and driver support.
E2H: A Two-Stage Non-Invasive Neural Signal Driven Humanoid Robotic Whole-Body Control Framework
Recent advancements in humanoid robotics, including the integration of hierarchical reinforcement learning-based control and the utilization of LLM planning, have significantly enhanced the ability of robots to perform complex tasks. In contrast to the highly developed humanoid robots, the human factors involved remain relatively unexplored. Directly controlling humanoid robots with the brain has already appeared in many science fiction works, such as Pacific Rim and Gundam. In this work, we present E2H (EEG-to-Humanoid), an innovative framework that pioneers the control of humanoid robots using high-frequency non-invasive neural signals. As the non-invasive signal quality remains too low for decoding precise spatial trajectories, we decompose the E2H framework into an innovative two-stage formation: 1) decoding neural signals (EEG) into semantic motion keywords, 2) utilizing LLM-facilitated motion generation with a precise motion imitation control policy to realize humanoid robotics control. The method of directly driving robots with brainwave commands offers a novel approach to human-machine collaboration, especially in situations where verbal commands are impractical, such as in cases of speech impairments, space exploration, or underwater exploration, unlocking significant potential. E2H offers an exciting glimpse into the future, holding immense potential for human-computer interaction.
Safe Navigation in Unmapped Environments for Robotic Systems with Input Constraints
This paper presents an approach for navigation and control in unmapped environments under input and state constraints using a composite control barrier function (CBF). We consider the scenario where real-time perception feedback (e.g., LiDAR) is used online to construct a local CBF that models local state constraints (e.g., local safety constraints such as obstacles) in the a priori unmapped environment. The approach employs a soft-maximum function to synthesize a single time-varying CBF from the N most recently obtained local CBFs. Next, the input constraints are transformed into controller-state constraints through the use of control dynamics. Then, we use a soft-minimum function to compose the input constraints with the time-varying CBF that models the a priori unmapped environment. This composition yields a single relaxed CBF, which is used in a constrained optimization to obtain an optimal control that satisfies the state and input constraints. The approach is validated through simulations of a nonholonomic ground robot that is equipped with LiDAR and navigates an unmapped environment. The robot successfully navigates the environment while avoiding the a priori unmapped obstacles and satisfying both speed and input constraints.
comment: Preprint submitted to 2025 American Control Conference (ACC). arXiv admin note: substantial text overlap with arXiv:2409.01458
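The soft-maximum and soft-minimum compositions described in the abstract above can be illustrated with a minimal sketch. It assumes the common log-sum-exp forms of these functions (the paper's exact definitions may differ); the function names, the sharpness parameter `kappa`, and the sample values are hypothetical.

```python
import math

def soft_min(values, kappa=10.0):
    """Smooth under-approximation of min(values) via log-sum-exp (assumed form)."""
    m = min(values)
    # Shift by m for numerical stability; result lies in [m - log(N)/kappa, m].
    return m - math.log(sum(math.exp(-kappa * (v - m)) for v in values)) / kappa

def soft_max(values, kappa=10.0):
    """Smooth under-approximation of max(values); the log(N) offset is an assumption."""
    m = max(values)
    n = len(values)
    # Subtracting log(N)/kappa keeps the result at or below the true maximum.
    total = sum(math.exp(kappa * (v - m)) for v in values)
    return m + (math.log(total) - math.log(n)) / kappa

# Hypothetical values of N = 3 local CBFs at the current state (positive = safe).
local_cbfs = [0.5, -0.2, 1.0]
environment_cbf = soft_max(local_cbfs, kappa=20.0)   # union of locally safe sets

# Hypothetical input-constraint function value (positive = constraint satisfied).
input_constraint = 0.3
composite_cbf = soft_min([environment_cbf, input_constraint], kappa=20.0)
```

Both functions are smooth in their arguments, which is what makes the composite usable as a single barrier function; larger `kappa` tightens the approximation to the true min or max.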
SPINE: Online Semantic Planning for Missions with Incomplete Natural Language Specifications in Unstructured Environments
As robots become increasingly capable, users will want to describe high-level missions and have robots fill in the gaps. In many realistic settings, pre-built maps are difficult to obtain, so execution requires exploration and mapping that are necessary and specific to the mission. Consider an emergency response scenario where a user commands a robot, "triage impacted regions." The robot must infer relevant semantics (victims, etc.) and exploration targets (damaged regions) based on priors or other context, then explore and refine its plan online. These missions are incompletely specified, meaning they imply subtasks and semantics. While many semantic planning methods operate online, they are typically designed for well specified tasks such as object search or exploration. Recently, Large Language Models (LLMs) have demonstrated powerful contextual reasoning over a range of robotic tasks described in natural language. However, existing LLM planners typically do not consider online planning or complex missions; rather, relevant subtasks are provided by a pre-built map or a user. We address these limitations via SPINE (online Semantic Planner for missions with Incomplete Natural language specifications in unstructured Environments). SPINE uses an LLM to reason about subtasks implied by the mission then realizes these subtasks in a receding horizon framework. Tasks are automatically validated for safety and refined online with new observations. We evaluate SPINE in simulation and real-world settings. Evaluation missions require multiple steps of semantic reasoning and exploration in cluttered outdoor environments of over 20,000m$^2$ area. We evaluate SPINE against competitive baselines in single-agent and air-ground teaming applications. Please find videos and software on our project page: https://zacravichandran.github.io/SPINE
Single-Shot 6DoF Pose and 3D Size Estimation for Robotic Strawberry Harvesting IROS 2024
In this study, we introduce a deep-learning approach for determining both the 6DoF pose and 3D size of strawberries, aiming to significantly augment robotic harvesting efficiency. Our model was trained on a synthetic strawberry dataset, which is automatically generated within the Ignition Gazebo simulator, with a specific focus on the inherent symmetry exhibited by strawberries. By leveraging domain randomization techniques, the model demonstrated exceptional performance, achieving an 84.77\% average precision (AP) of 3D Intersection over Union (IoU) scores on the simulated dataset. Empirical evaluations, conducted by testing our model on real-world datasets, underscored the model's viability for real-world strawberry harvesting scenarios, even though its training was based on synthetic data. The model also exhibited robust occlusion handling abilities, maintaining accurate detection capabilities even when strawberries were obscured by other strawberries or foliage. Additionally, the model showcased remarkably swift inference speeds, reaching up to 60 frames per second (FPS).
comment: Accepted at IROS 2024
Task-unaware Lifelong Robot Learning with Retrieval-based Weighted Local Adaptation
Real-world environments require robots to continuously acquire new skills while retaining previously learned abilities, all without the need for clearly defined task boundaries. Storing all past data to prevent forgetting is impractical due to storage and privacy concerns. To address this, we propose a method that efficiently restores a robot's proficiency in previously learned tasks over its lifespan. Using an Episodic Memory (EM), our approach enables experience replay during training and retrieval during testing for local fine-tuning, allowing rapid adaptation to previously encountered problems without explicit task identifiers. Additionally, we introduce a selective weighting mechanism that emphasizes the most challenging segments of retrieved demonstrations, focusing local adaptation where it is most needed. This framework offers a scalable solution for lifelong learning in dynamic, task-unaware environments, combining retrieval-based adaptation with selective weighting to enhance robot performance in open-ended scenarios.
Information-Driven Search and Track of Novel Space Objects
Space surveillance depends on efficiently directing sensor resources to maintain custody of known catalog objects. However, it remains unclear how to best utilize these resources to rapidly search for and track newly detected space objects. Provided a novel measurement, a search set can be instantiated through admissible region constraints to inform follow-up observations. In lacking well-constrained bounds, this set rapidly spreads in the along-track direction, growing much larger than a follow-up sensor's finite field of view. Moreover, the number of novel objects may be uncertain, and follow-up observations are most commonly corrupted by false positives from known catalog objects and missed detections. In this work, we address these challenges through the introduction of a joint sensor control and multi-target tracking approach. The search set associated to a novel measurement is represented by a Cardinalized Probability Hypothesis Density (CPHD), which jointly tracks the state uncertainty associated to a set of objects and a probability mass function for the true target number. In follow-up sensor scans, the information contained in an empty measurement set, and returns from both novel objects and known catalog objects is succinctly captured through this paradigm. To maximize the utility of a follow-up sensor, we introduce an information-driven sensor control approach for steering the instrument. Our methods are tested on two relevant test cases and we provide a comparative analysis with current naive tasking strategies.
comment: Submitted to the Journal of Astronautical Sciences
DecTrain: Deciding When to Train a DNN Online
Deep neural networks (DNNs) can deteriorate in accuracy when deployment data differs from training data. While performing online training at all timesteps can improve accuracy, it is computationally expensive. We propose DecTrain, a new algorithm that decides when to train a monocular depth DNN online using self-supervision with low overhead. To make the decision at each timestep, DecTrain compares the cost of training with the predicted accuracy gain. We evaluate DecTrain on out-of-distribution data, and find DecTrain maintains accuracy compared to online training at all timesteps, while training only 44% of the time on average. We also compare the recovery of a low inference cost DNN using DecTrain and a more generalizable high inference cost DNN on various sequences. DecTrain recovers the majority (97%) of the accuracy gain of online training at all timesteps while reducing computation compared to the high inference cost DNN which recovers only 66%. With an even smaller DNN, we achieve 89% recovery while reducing computation by 56%. DecTrain enables low-cost online training for a smaller DNN to have competitive accuracy with a larger, more generalizable DNN at a lower overall computational cost.
comment: 8 pages
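The per-timestep decision described in the DecTrain abstract above (compare the cost of training with the predicted accuracy gain) can be sketched as a simple threshold rule. This is a hypothetical illustration, not DecTrain's actual decision policy: the function names, the `gain_weight` parameter, and the numbers are assumptions.

```python
def should_train(predicted_gain, training_cost, gain_weight=1.0):
    """Train only when the weighted predicted accuracy gain exceeds the cost."""
    return gain_weight * predicted_gain > training_cost

def schedule(predicted_gains, training_cost):
    """Return per-timestep decisions and the fraction of timesteps trained."""
    decisions = [should_train(g, training_cost) for g in predicted_gains]
    return decisions, sum(decisions) / len(decisions)

# Hypothetical predicted gains over four timesteps, with a fixed training cost.
decisions, fraction_trained = schedule([0.1, 0.5, 0.05, 0.9], training_cost=0.2)
```

In this toy run the rule trains on the second and fourth timesteps only, so half of the training steps are skipped.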
Self-Deployable, Adaptive Soft Robots Based on Contracting-Cord Particle Jamming
We developed a new class of soft locomotive robots that can self-assemble into a preprogrammed configuration and vary their stiffness afterward in a highly integrated, compact body using contracting-cord particle jamming (CCPJ). We demonstrate this with a tripod-shaped robot, TripodBot, consisting of three CCPJ-based legs attached to a central body. TripodBot is intrinsically soft and can be stored and transported in a compact configuration. On site, it can self-deploy and crawl in a slip-stick manner through the shape morphing of its legs; a simplified analytical model accurately captures the speed. The robot's adaptability is demonstrated by its ability to navigate tunnels as narrow as 61 percent of its deployed body width and ceilings as low as 31 percent of its freestanding height. Additionally, it can climb slopes up to 15 degrees, carry a load of 5 grams (2.4 times its weight), and bear a load 9429 times its weight.
comment: 15 figures
LiDAR Inertial Odometry And Mapping Using Learned Registration-Relevant Features
SLAM is an important capability for many autonomous systems, and modern LiDAR-based methods offer promising performance. However, for long duration missions, existing works that operate either directly on the full pointclouds or on extracted features face key tradeoffs in accuracy and computational efficiency (e.g., memory consumption). To address these issues, this paper presents DFLIOM with several key innovations. Unlike previous methods that rely on handcrafted heuristics and hand-tuned parameters for feature extraction, we propose a learning-based approach that selects points relevant to LiDAR SLAM pointcloud registration. Furthermore, we extend our prior work DLIOM with the learned feature extractor and observe that our method enables similar or even better localization performance using only about 20\% of the points in the dense point clouds. We demonstrate that DFLIOM performs well on multiple public benchmarks, achieving a 2.4\% decrease in localization error and 57.5\% decrease in memory usage compared to state-of-the-art methods (DLIOM). Although extracting features with the proposed network requires extra time, it is offset by the faster processing time downstream, thus maintaining real-time performance using 20Hz LiDAR on our hardware setup. The effectiveness of our learning-based feature extraction module is further demonstrated through comparison with several handcrafted feature extractors.
comment: 8 pages, 6 figures
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
Reach-Avoid-Stay (RAS) optimal control enables systems such as robots and air taxis to reach their targets, avoid obstacles, and stay near the target. However, current methods for RAS often struggle with handling complex, dynamic environments and scaling to high-dimensional systems. While reinforcement learning (RL)-based reachability analysis addresses these challenges, it has yet to tackle the RAS problem. In this paper, we propose a two-step deep deterministic policy gradient (DDPG) method that extends the RL-based reachability method to solve RAS problems. First, we train a function that characterizes the maximal robust control invariant set within the target set, where the system can safely stay, along with its corresponding policy. Second, we train a function that defines the set of states capable of safely reaching the robust control invariant set, along with its corresponding policy. We prove that this method results in the maximal robust RAS set in the absence of training errors and demonstrate that it enables RAS in complex environments, scales to high-dimensional systems, and achieves higher success rates for the RAS task compared to previous methods, validated through one simulation and two high-dimensional experiments.
Gait Optimization for Legged Systems Through Mixed Distribution Cross-Entropy Optimization
Legged robotic systems can play an important role in real-world applications due to their superior load-bearing capabilities, enhanced autonomy, and effective navigation on uneven terrain. They offer an optimal trade-off between mobility and payload capacity, excelling in diverse environments while maintaining efficiency in transporting heavy loads. However, planning and optimizing gaits and gait sequences for these robots presents significant challenges due to the complexity of their dynamic motion and the numerous optimization variables involved. Traditional trajectory optimization methods address these challenges by formulating the problem as an optimization task, aiming to minimize cost functions, and to automatically discover contact sequences. Despite their structured approach, optimization-based methods face substantial difficulties, particularly because such formulations result in highly nonlinear and difficult to solve problems. To address these limitations, we propose CrEGOpt, a bi-level optimization method that combines traditional trajectory optimization with a black-box optimization scheme. CrEGOpt at the higher level employs the Mixed Distribution Cross-Entropy Method to optimize both the gait sequence and the phase durations, thus simplifying the lower level trajectory optimization problem. This approach allows for fast solutions of complex gait optimization problems. Extensive evaluation in simulated environments demonstrates that CrEGOpt can find solutions for biped, quadruped, and hexapod robots in under 10 seconds. This novel bi-level optimization scheme offers a promising direction for future research in automatic contact scheduling.
comment: 8 pages, 7 figures, Accepted at Humanoids 2024
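The higher-level search described in the CrEGOpt abstract above optimizes a discrete choice (the gait sequence) together with continuous variables (phase durations). A minimal sketch of a cross-entropy method over such a mixed distribution follows, assuming a categorical distribution over a handful of discrete gait options and a single Gaussian over one duration. The toy cost stands in for the lower-level trajectory optimization; all names, constants, and the smoothing scheme are hypothetical.

```python
import random
import statistics

# Hypothetical toy problem: each gait option has a fixed penalty and a preferred duration.
OFFSETS = [1.0, 0.0, 2.0]
TARGETS = [0.3, 0.6, 0.9]

def toy_cost(gait, duration):
    """Stand-in for the cost returned by a lower-level trajectory optimizer."""
    return OFFSETS[gait] + (duration - TARGETS[gait]) ** 2

def mixed_cem(cost, n_gaits, iters=30, pop=100, n_elite=10, alpha=0.7, seed=0):
    """Cross-entropy method with a categorical (discrete) and a Gaussian (continuous) part."""
    rng = random.Random(seed)
    probs = [1.0 / n_gaits] * n_gaits      # categorical distribution over gait options
    mu, sigma = 0.5, 0.5                   # Gaussian over the phase duration
    best = None
    for _ in range(iters):
        samples = []
        for _ in range(pop):
            g = rng.choices(range(n_gaits), weights=probs)[0]
            d = rng.gauss(mu, sigma)
            samples.append((cost(g, d), g, d))
        samples.sort(key=lambda s: s[0])
        elites = samples[:n_elite]
        if best is None or elites[0][0] < best[0]:
            best = elites[0]
        # Refit both distributions to the elite set, with smoothing factor alpha.
        freq = [sum(1 for _, g, _ in elites if g == k) / n_elite for k in range(n_gaits)]
        probs = [(1 - alpha) * p + alpha * f for p, f in zip(probs, freq)]
        durations = [d for _, _, d in elites]
        mu = (1 - alpha) * mu + alpha * statistics.mean(durations)
        sigma = max(1e-3, (1 - alpha) * sigma + alpha * statistics.pstdev(durations))
    return best

best_cost, best_gait, best_duration = mixed_cem(toy_cost, n_gaits=3)
```

Sampling the discrete and continuous parts jointly lets the elite set shift probability mass toward the better gait option while the Gaussian narrows around its preferred duration.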
Real-World Cooking Robot System from Recipes Based on Food State Recognition Using Foundation Models and PDDL
Although there is a growing demand for cooking behaviours as one of the expected tasks for robots, a series of cooking behaviours based on new recipe descriptions by robots in the real world has not yet been realised. In this study, we propose a robot system that integrates real-world executable robot cooking behaviour planning using the Large Language Model (LLM) and classical planning of PDDL descriptions, and food ingredient state recognition learning from a small number of data using the Vision-Language model (VLM). We succeeded in experiments in which PR2, a dual-armed wheeled robot, performed cooking from arranged new recipes in a real-world environment, and confirmed the effectiveness of the proposed system.
comment: Accepted at Advanced Robotics
CAnDOIT: Causal Discovery with Observational and Interventional Data from Time-Series
The study of cause-and-effect is of the utmost importance in many branches of science, but also for many practical applications of intelligent systems. In particular, identifying causal relationships in situations that include hidden factors is a major challenge for methods that rely solely on observational data for building causal models. This paper proposes CAnDOIT, a causal discovery method to reconstruct causal models using both observational and interventional time-series data. The use of interventional data in the causal analysis is crucial for real-world applications, such as robotics, where the scenario is highly complex and observational data alone are often insufficient to uncover the correct causal structure. Validation of the method is performed initially on randomly generated synthetic models and subsequently on a well-known benchmark for causal structure learning in a robotic manipulation environment. The experiments demonstrate that the approach can effectively handle data from interventions and exploit them to enhance the accuracy of the causal analysis. A Python implementation of CAnDOIT has also been developed and is publicly available on GitHub: https://github.com/lcastri/causalflow.
comment: Published in Advanced Intelligent Systems
CMP: Cooperative Motion Prediction with Multi-Agent Communication
The confluence of the advancement of Autonomous Vehicles (AVs) and the maturity of Vehicle-to-Everything (V2X) communication has enabled the capability of cooperative connected and automated vehicles (CAVs). Building on top of cooperative perception, this paper explores the feasibility and effectiveness of cooperative motion prediction. Our method, CMP, takes LiDAR signals as model input to enhance tracking and prediction capabilities. Unlike previous work that focuses separately on either cooperative perception or motion prediction, our framework, to the best of our knowledge, is the first to address the unified problem where CAVs share information in both perception and prediction modules. Incorporated into our design is the unique capability to tolerate realistic V2X bandwidth limitations and transmission delays, while dealing with bulky perception representations. We also propose a prediction aggregation module, which unifies the predictions obtained by different CAVs and generates the final prediction. Through extensive experiments and ablation studies on the OPV2V and V2V4Real datasets, we demonstrate the effectiveness of our method in cooperative perception, tracking, and motion prediction. In particular, CMP reduces the average prediction error by 16.4\% with fewer missing detections compared with the no cooperation setting and by 12.3\% compared with the strongest baseline. Our work marks a significant step forward in the cooperative capabilities of CAVs, showcasing enhanced performance in complex scenarios. The code can be found on the project website: https://cmp-cooperative-prediction.github.io/.
comment: Project website: https://cmp-cooperative-prediction.github.io/
Trajectory Optimization with Global Yaw Parameterization for Field-of-View Constrained Autonomous Flight
Trajectory generation for quadrotors with limited field-of-view sensors has numerous applications such as aerial exploration, coverage, inspection, videography, and target tracking. Most previous works simplify the task of optimizing yaw trajectories by either aligning the heading of the robot with its velocity, or potentially restricting the feasible space of candidate trajectories by using a limited yaw domain to circumvent angular singularities. In this paper, we propose a novel \textit{global} yaw parameterization method for trajectory optimization that allows a 360-degree yaw variation as demanded by the underlying algorithm. This approach effectively bypasses inherent singularities by including supplementary quadratic constraints and transforming the final decision variables into the desired state representation. This method significantly reduces the needed control effort, and improves optimization feasibility. Furthermore, we apply the method to several examples of different applications that require jointly optimizing over both the yaw and position trajectories. Ultimately, we present a comprehensive numerical analysis and evaluation of our proposed method in both simulation and real-world experiments.
PRompt Optimization in Multi-Step Tasks (PROMST): Integrating Human Feedback and Heuristic-based Sampling EMNLP 2024
Prompt optimization aims to find the best prompt to a large language model (LLM) for a given task. LLMs have been successfully used to help find and improve prompt candidates for single-step tasks. However, realistic tasks for agents are multi-step and introduce new challenges: (1) Prompt content is likely to be more extensive and complex, making it more difficult for LLMs to analyze errors, (2) the impact of an individual step is difficult to evaluate, and (3) different people may have varied preferences about task execution. While humans struggle to optimize prompts, they are good at providing feedback about LLM outputs; we therefore introduce a new LLM-driven discrete prompt optimization framework PRompt Optimization in Multi-Step Tasks (PROMST) that incorporates human-designed feedback rules to automatically offer direct suggestions for improvement. We also use an extra learned heuristic model that predicts prompt performance to efficiently sample from prompt candidates. This approach significantly outperforms both human-engineered prompts and several other prompt optimization methods across 11 representative multi-step tasks (an average 10.6\%-29.3\% improvement to current best methods on five LLMs respectively). We believe our work can serve as a benchmark for automatic prompt optimization for LLM-driven multi-step tasks. Datasets and code are available at https://github.com/yongchao98/PROMST. The project page is available at https://yongchao98.github.io/MIT-REALM-PROMST.
comment: 62 pages, 14 figures, Published in EMNLP 2024 Main
$\mathcal{D(R,O)}$ Grasp: A Unified Representation of Robot and Object Interaction for Cross-Embodiment Dexterous Grasping
Dexterous grasping is a fundamental yet challenging skill in robotic manipulation, requiring precise interaction between robotic hands and objects. In this paper, we present $\mathcal{D(R,O)}$ Grasp, a novel framework that models the interaction between the robotic hand in its grasping pose and the object, enabling broad generalization across various robot hands and object geometries. Our model takes the robot hand's description and object point cloud as inputs and efficiently predicts kinematically valid and stable grasps, demonstrating strong adaptability to diverse robot embodiments and object geometries. Extensive experiments conducted in both simulated and real-world environments validate the effectiveness of our approach, with significant improvements in success rate, grasp diversity, and inference speed across multiple robotic hands. Our method achieves an average success rate of 87.53% in simulation in less than one second, tested across three different dexterous robotic hands. In real-world experiments using the LeapHand, the method also demonstrates an average success rate of 89%. $\mathcal{D(R,O)}$ Grasp provides a robust solution for dexterous grasping in complex and varied environments. The code, appendix, and videos are available on our project website at https://nus-lins-lab.github.io/drograspweb/.
Making Space for Time: The Special Galilean Group and Its Application to Some Robotics Problems IROS
The special Galilean group, usually denoted SGal(3), is a 10-dimensional Lie group whose important subgroups include the special orthogonal group, the special Euclidean group, and the group of extended poses. We briefly describe SGal(3) and its Lie algebra and show how the group structure supports a unified representation of uncertainty in space and time. Our aim is to highlight the potential usefulness of this group for several robotics problems.
comment: In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) Workshop From Geometry to General Autonomy of Robotic Systems, Abu Dhabi, United Arab Emirates, October 15, 2024. 3 pages, 1 figure
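The 10 dimensions of SGal(3) mentioned in the abstract above (3 for rotation, 3 for velocity, 3 for position, 1 for time) can be made concrete with a small sketch. It assumes the standard 5x5 matrix representation of a Galilean group element, in which composition is matrix multiplication; the helper names and example values are hypothetical.

```python
def sgal3(R, v, p, t):
    """Build the 5x5 matrix [[R, v, p], [0, 1, t], [0, 0, 1]] from its components."""
    return [
        list(R[0]) + [v[0], p[0]],
        list(R[1]) + [v[1], p[1]],
        list(R[2]) + [v[2], p[2]],
        [0.0, 0.0, 0.0, 1.0, t],
        [0.0, 0.0, 0.0, 0.0, 1.0],
    ]

def compose(A, B):
    """Group composition as plain 5x5 matrix multiplication."""
    return [[sum(A[i][k] * B[k][j] for k in range(5)) for j in range(5)]
            for i in range(5)]

IDENTITY_R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ZERO = [0.0, 0.0, 0.0]

# A pure boost of 1 unit/s along x, followed by a pure time translation of 2 s.
boost = sgal3(IDENTITY_R, [1.0, 0.0, 0.0], ZERO, 0.0)
time_shift = sgal3(IDENTITY_R, ZERO, ZERO, 2.0)
result = compose(boost, time_shift)
# The position column of the product is R1 p2 + v1 t2 + p1, so the boost
# accumulates a displacement of v * t along x over the elapsed time.
```

The coupling term `v1 * t2` in the position column is what lets a single group element carry spatial and temporal quantities together.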
Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation
There is no limit to how much a robot might explore and learn, but all of that knowledge needs to be searchable and actionable. Within language research, retrieval augmented generation (RAG) has become the workhorse of large-scale non-parametric knowledge; however, existing techniques do not directly transfer to the embodied domain, where data is multimodal and highly correlated, and perception requires abstraction. To address these challenges, we introduce Embodied-RAG, a framework that enhances the foundational model of an embodied agent with a non-parametric memory system capable of autonomously constructing hierarchical knowledge for both navigation and language generation. Embodied-RAG handles a full range of spatial and semantic resolutions across diverse environments and query types, whether for a specific object or a holistic description of ambiance. At its core, Embodied-RAG's memory is structured as a semantic forest, storing language descriptions at varying levels of detail. This hierarchical organization allows the system to efficiently generate context-sensitive outputs across different robotic platforms. We demonstrate that Embodied-RAG effectively bridges RAG to the robotics domain, successfully handling over 200 explanation and navigation queries across 19 environments, highlighting its promise as a general-purpose non-parametric system for embodied agents.
comment: Web: https://quanting-xie.github.io/Embodied-RAG-web/
Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training NeurIPS 2024
Learning a generalist embodied agent capable of completing multiple tasks poses challenges, primarily stemming from the scarcity of action-labeled robotic datasets. In contrast, a vast amount of human videos exist, capturing intricate tasks and interactions with the physical world. Promising prospects arise for utilizing actionless human videos for pre-training and transferring the knowledge to facilitate robot policy learning through limited robot demonstrations. However, it remains a challenge due to the domain gap between humans and robots. Moreover, it is difficult to extract useful information representing the dynamic world from human videos, because of its noisy and multimodal data structure. In this paper, we introduce a novel framework to tackle these challenges, which leverages a unified discrete diffusion to combine generative pre-training on human videos and policy fine-tuning on a small number of action-labeled robot videos. We start by compressing both human and robot videos into unified video tokens. In the pre-training stage, we employ a discrete diffusion model with a mask-and-replace diffusion strategy to predict future video tokens in the latent space. In the fine-tuning stage, we harness the imagined future videos to guide low-level action learning with a limited set of robot data. Experiments demonstrate that our method generates high-fidelity future videos for planning and enhances the fine-tuned policies compared to previous state-of-the-art approaches with superior performance. Our project website is available at https://video-diff.github.io/.
comment: Accepted by NeurIPS 2024. 24 pages
ViewActive: Active viewpoint optimization from a single image
When observing objects, humans benefit from their spatial visualization and mental rotation ability to envision potential optimal viewpoints based on the current observation. This capability is crucial for enabling robots to achieve efficient and robust scene perception during operation, as optimal viewpoints provide essential and informative features for accurately representing scenes in 2D images, thereby enhancing downstream tasks. To endow robots with this human-like active viewpoint optimization capability, we propose ViewActive, a modernized machine learning approach drawing inspiration from aspect graphs, which provides viewpoint optimization guidance based solely on the current 2D image input. Specifically, we introduce the 3D Viewpoint Quality Field (VQF), a compact and consistent representation for viewpoint quality distribution similar to an aspect graph, composed of three general-purpose viewpoint quality metrics: self-occlusion ratio, occupancy-aware surface normal entropy, and visual entropy. We utilize pre-trained image encoders to extract robust visual and semantic features, which are then decoded into the 3D VQF, allowing our model to generalize effectively across diverse objects, including unseen categories. The lightweight ViewActive network (72 FPS on a single GPU) significantly enhances the performance of state-of-the-art object recognition pipelines and can be integrated into real-time motion planning for robotic applications. Our code and dataset are available here: https://github.com/jiayi-wu-umd/ViewActive
SonicSense: Object Perception from In-Hand Acoustic Vibration
We introduce SonicSense, a holistic design of hardware and software to enable rich robot object perception through in-hand acoustic vibration sensing. While previous studies have shown promising results with acoustic sensing for object perception, current solutions are constrained to a handful of objects with simple geometries and homogeneous materials, single-finger sensing, and mixing training and testing on the same objects. SonicSense enables container inventory status differentiation, heterogeneous material prediction, 3D shape reconstruction, and object re-identification from a diverse set of 83 real-world objects. Our system employs a simple but effective heuristic exploration policy to interact with the objects as well as end-to-end learning-based algorithms to fuse vibration signals to infer object properties. Our framework underscores the significance of in-hand acoustic vibration sensing in advancing robot tactile perception.
comment: Our project website is at: http://generalroboticslab.com/SonicSense
BadRobot: Manipulating Embodied LLMs in the Physical World
Embodied AI represents systems where AI is integrated into physical entities, enabling them to perceive and interact with their surroundings. Large Language Model (LLM), which exhibits powerful language understanding abilities, has been extensively employed in embodied AI by facilitating sophisticated task planning. However, a critical safety issue remains overlooked: could these embodied LLMs perpetrate harmful behaviors? In response, we introduce BadRobot, a novel attack paradigm aiming to make embodied LLMs violate safety and ethical constraints through typical voice-based user-system interactions. Specifically, three vulnerabilities are exploited to achieve this type of attack: (i) manipulation of LLMs within robotic systems, (ii) misalignment between linguistic outputs and physical actions, and (iii) unintentional hazardous behaviors caused by world knowledge's flaws. Furthermore, we construct a benchmark of various malicious physical action queries to evaluate BadRobot's attack performance. Based on this benchmark, extensive experiments against existing prominent embodied LLM frameworks (e.g., Voxposer, Code as Policies, and ProgPrompt) demonstrate the effectiveness of our BadRobot. Warning: This paper contains harmful AI-generated language and aggressive actions.
comment: 38 pages, 16 figures
Theory and Explicit Design of a Path Planner for an SE(3) Robot
We consider path planning for a rigid spatial robot with 6 degrees of freedom (6 DOFs), moving amidst polyhedral obstacles. A correct, complete and practical path planner for such a robot has never been achieved, although this is widely recognized as a key challenge in robotics. This paper provides a complete "explicit" design, down to explicit geometric primitives that are easily implementable. Our design is within an algorithmic framework for path planners, called Soft Subdivision Search (SSS). The framework is based on the twin foundations of $\epsilon$-exactness and soft predicates, which are critical for rigorous numerical implementations. The practicality of SSS has been previously demonstrated for various robots including 5-DOF spatial robots. In this paper, we solve several significant technical challenges for SE(3) robots: (1) We first ensure the correct theory by proving a general form of the Fundamental Theorem of the SSS theory. We prove this within an axiomatic framework, thus making it easy for future applications of this theory. (2) One component of $SE(3) = R^3 \times SO(3)$ is the non-Euclidean space SO(3). We design a novel topologically correct data structure for SO(3). Using the concept of subdivision charts and atlases for SO(3), we can now carry out subdivision of SO(3). (3) The geometric problem of collision detection takes place in $R^3$, via the footprint map. Unlike sampling-based approaches, we must reason with the notion of footprints of configuration boxes, which is much harder to characterize. Exploiting the theory of soft predicates, we design suitable approximate footprints which, when combined with the highly effective feature-set technique, lead to soft predicates. (4) Finally, we make the underlying geometric computation "explicit", i.e., avoiding a general solver of polynomial systems, in order to allow a direct implementation.
comment: A conference version is to appear at the International Workshop on the Algorithmic Foundations of Robotics (WAFR) 2024. This is a revised full version, 42 pages, including 5 appendices
A Causal Bayesian Network and Probabilistic Programming Based Reasoning Framework for Robot Manipulation Under Uncertainty ICRA 2025
Robot object manipulation in real-world environments is challenging because robot operation must be robust to a range of sensing, estimation, and actuation uncertainties to avoid potentially unsafe and costly mistakes that are a barrier to their adoption. In this paper, we propose a flexible and generalisable physics-informed causal Bayesian network (CBN) based framework for a robot to probabilistically reason about candidate manipulation actions, to enable robot decision-making robust to arbitrary robot system uncertainties -- the first of its kind to use a probabilistic programming language implementation. Using experiments in high-fidelity Gazebo simulation of an exemplar block stacking task, we demonstrate our framework's ability to: (1) predict manipulation outcomes with high accuracy (Pred Acc: 88.6%); and, (2) perform greedy next-best action selection with 94.2% task success rate. We also demonstrate our framework's suitability for real-world robot systems with a domestic robot. Thus, we show that by combining probabilistic causal modelling with physics simulations, we can make robot manipulation more robust to system uncertainties and hence more feasible for real-world applications. Further, our generalised reasoning framework can be used and extended for future robotics and causality research.
comment: 7 pages, 7 figures, submitted to the 2025 IEEE Conference on Robotics and Automation (ICRA 2025)
PointNetPGAP-SLC: A 3D LiDAR-based Place Recognition Approach with Segment-level Consistency Training for Mobile Robots in Horticulture
3D LiDAR-based place recognition remains largely underexplored in horticultural environments, which present unique challenges due to their semi-permeable nature to laser beams. This characteristic often results in highly similar LiDAR scans from adjacent rows, leading to descriptor ambiguity and, consequently, compromised retrieval performance. In this work, we address the challenges of 3D LiDAR place recognition in horticultural environments, particularly focusing on inter-row ambiguity by introducing three key contributions: (i) a novel model, PointNetPGAP, which combines the outputs of two statistically-inspired aggregators into a single descriptor; (ii) a Segment-Level Consistency (SLC) model, used exclusively during training to enhance descriptor robustness; and (iii) the HORTO-3DLM dataset, comprising LiDAR sequences from orchards and strawberry fields. Experimental evaluations conducted on the HORTO-3DLM and KITTI Odometry datasets demonstrate that PointNetPGAP outperforms state-of-the-art models, including OverlapTransformer and PointNetVLAD, particularly when the SLC model is applied. These results underscore the model's superiority, especially in horticultural environments, by significantly improving retrieval performance in segments with higher ambiguity.
comment: This preprint has been accepted for publication in IEEE Robotics and Automation Letters
RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation
We present RiEMann, an end-to-end near Real-time SE(3)-Equivariant Robot Manipulation imitation learning framework from scene point cloud input. Compared to previous methods that rely on descriptor field matching, RiEMann directly predicts the target poses of objects for manipulation without any object segmentation. RiEMann learns a manipulation task from scratch with 5 to 10 demonstrations, generalizes to unseen SE(3) transformations and instances of target objects, resists visual interference of distracting objects, and follows the near real-time pose change of the target object. The scalable action space of RiEMann facilitates the addition of custom equivariant actions such as the direction of turning the faucet, which makes articulated object manipulation possible for RiEMann. In simulation and real-world 6-DOF robot manipulation experiments, we test RiEMann on 5 categories of manipulation tasks with a total of 25 variants and show that RiEMann outperforms baselines in both task success rates and SE(3) geodesic distance errors on predicted poses (reduced by 68.6%), and achieves a 5.4 frames per second (FPS) network inference speed. Code and video results are available at https://riemann-web.github.io/.
Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own
Reinforcement learning (RL) is a promising approach for solving robotic manipulation tasks. However, it is challenging to apply RL algorithms directly in the real world. For one thing, RL is data-intensive and typically requires millions of interactions with environments, which are impractical in real scenarios. For another, heavy engineering effort is needed to design reward functions manually. To address these issues, we leverage foundation models in this paper. We propose Reinforcement Learning with Foundation Priors (RLFP) to utilize guidance and feedback from policy, value, and success-reward foundation models. Within this framework, we introduce the Foundation-guided Actor-Critic (FAC) algorithm, which enables embodied agents to explore more efficiently with automatic reward functions. The benefits of our framework are threefold: (1) sample efficient; (2) minimal and effective reward engineering; (3) agnostic to foundation model forms and robust to noisy priors. Our method achieves remarkable performance in various manipulation tasks on both real robots and in simulation. Across 5 dexterous tasks with real robots, FAC achieves an average success rate of 86% after one hour of real-time learning. Across 8 tasks in the simulated Meta-world, FAC achieves 100% success rates in 7/8 tasks in fewer than 100k frames (about 1 hour of training), outperforming baseline methods with manually designed rewards in 1M frames. We believe the RLFP framework can enable future robots to explore and learn autonomously in the physical world for more tasks.
comment: CoRL 2024 (Oral)
NeRFoot: Robot-Footprint Estimation for Image-Based Visual Servoing ICRA
This paper investigates the utility of Neural Radiance Fields (NeRF) models in extending the regions of operation of a mobile robot, controlled by Image-Based Visual Servoing (IBVS) via static CCTV cameras. Using NeRF as a 3D-representation prior, the robot's footprint may be extrapolated geometrically and used to train a CNN-based network to extract it online from the robot's appearance alone. The extracted footprint provides a tighter bound than a robot-wide bounding box, allowing the robot's controller to prescribe more optimal trajectories and expand its safe operational floor area.
comment: Accepted as extended abstract for ICRA@40
VLM-MPC: Vision Language Foundation Model (VLM)-Guided Model Predictive Controller (MPC) for Autonomous Driving
Motivated by the emergent reasoning capabilities of Vision Language Models (VLMs) and their potential to improve the comprehensibility of autonomous driving systems, this paper introduces a closed-loop autonomous driving controller called VLM-MPC, which combines the Model Predictive Controller (MPC) with VLM to evaluate how model-based control could enhance VLM decision-making. The proposed VLM-MPC is structured into two asynchronous components: The upper layer VLM generates driving parameters (e.g., desired speed, desired headway) for lower-level control based on front camera images, ego vehicle state, traffic environment conditions, and reference memory; The lower-level MPC controls the vehicle in real-time using these parameters, considering engine lag and providing state feedback to the entire system. Experiments based on the nuScenes dataset validated the effectiveness of the proposed VLM-MPC across various environments (e.g., night, rain, and intersections). The results demonstrate that the VLM-MPC consistently maintains Post Encroachment Time (PET) above safe thresholds, in contrast to some scenarios where the VLM-based control posed collision risks. Additionally, the VLM-MPC enhances smoothness compared to the real-world trajectories and VLM-based control. By comparing behaviors under different environmental settings, we highlight the VLM-MPC's capability to understand the environment and make reasoned inferences. Moreover, we validate the contributions of two key components, the reference memory and the environment encoder, to the stability of responses through ablation tests.
Multi-Robot Relative Pose Estimation and IMU Preintegration Using Passive UWB Transceivers
Ultra-wideband (UWB) systems are becoming increasingly popular as a means of inter-robot ranging and communication. A major constraint associated with UWB is that only one pair of UWB transceivers can range at a time to avoid interference, hence hindering the scalability of UWB-based localization. In this paper, a ranging protocol is proposed that allows all robots to passively listen in on neighbouring communicating robots without any hierarchical restrictions on the role of the robots. This is utilized to allow each robot to obtain more range measurements and to broadcast preintegrated inertial measurement unit (IMU) measurements for relative extended pose state estimation directly on SE2(3). Consequently, a simultaneous clock-synchronization and relative-pose estimator (CSRPE) is formulated using an on-manifold extended Kalman filter (EKF) and is evaluated in simulation using Monte-Carlo runs for up to 7 robots. The ranging protocol is implemented in C on custom-made UWB boards fitted to 3 quadcopters, and the proposed filter is evaluated over multiple experimental trials, yielding up to 48% improvement in localization accuracy.
Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling
In the endeavor to make autonomous robots take actions, task planning is a major challenge that requires translating high-level task descriptions into long-horizon action sequences. Despite recent advances in language model agents, they remain prone to planning errors and limited in their ability to plan ahead. To address these limitations in robotic planning, we advocate a self-refining scheme that iteratively refines a draft plan until an equilibrium is reached. Remarkably, this process can be optimized end-to-end from an analytical perspective without the need to curate additional verifiers or reward models, allowing us to train self-refining planners in a simple supervised learning fashion. Meanwhile, a nested equilibrium sequence modeling procedure is devised for efficient closed-loop planning that incorporates useful feedback from the environment (or an internal world model). Our method is evaluated on the VirtualHome-Env benchmark, showing advanced performance with better scaling for inference computation. Code is available at https://github.com/Singularity0104/equilibrium-planner.
BVE + EKF: A viewpoint estimator for the estimation of the object's position in the 3D task space using Extended Kalman Filters
RGB-D sensors face multiple challenges when operating in open-field environments because of their sensitivity to external perturbations such as radiation or rain. Multiple works address the challenge of perceiving the 3D position of objects using monocular cameras. However, most of these works focus mainly on deep learning-based solutions, which are complex, data-driven, and difficult to predict. We therefore approach the problem of predicting objects' 3D position using a Gaussian viewpoint estimator named best viewpoint estimator (BVE), powered by an extended Kalman filter (EKF). The algorithm proved efficient on the tasks and reached a maximum average Euclidean error of about 32 mm. The experiments were deployed and evaluated in MATLAB using artificial Gaussian noise. Future work aims to implement the algorithm in a robotic system.
comment: Accepted to ICINCO - 21st International Conference on Informatics in Control, Automation and Robotics
MonoVisual3DFilter: 3D tomatoes' localisation with monocular cameras using histogram filters
Performing tasks in agriculture, such as fruit monitoring or harvesting, requires perceiving the objects' spatial position. RGB-D cameras are limited in open-field environments due to lighting interference. In this study, we therefore set out to answer the research question: "How can we use and control monocular sensors to perceive objects' position in the 3D task space?" To this end, we used histogram filters (Bayesian discrete filters) to estimate the position of tomatoes in the tomato plant through the algorithm MonoVisual3DFilter. Two kernel filters were studied: the square kernel and the Gaussian kernel. The implemented algorithm was tested in simulation, with and without Gaussian noise and random noise, and in a testbed under laboratory conditions. The algorithm reported a mean absolute error lower than 10 mm in simulation and 20 mm in the testbed under laboratory conditions at an assessment distance of about 0.5 m. So, the results are viable for real environments and should be improved at closer distances.
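To make the histogram-filter idea concrete, here is a minimal one-dimensional sketch of a discrete Bayes update with the two kernel shapes the abstract names. It is not the MonoVisual3DFilter implementation: the bin count, the kernel parameters, the floor value in the square kernel, and the observation sequence are all assumptions chosen for illustration.

```python
import math

def gaussian_kernel(center, n_bins, sigma):
    """Unnormalised Gaussian likelihood over bin indices, centred on `center`."""
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n_bins)]

def square_kernel(center, n_bins, half_width, floor=1e-3):
    """Box likelihood: 1 inside the window, a small assumed floor outside it."""
    return [1.0 if abs(i - center) <= half_width else floor for i in range(n_bins)]

def update(belief, likelihood):
    """Bayes update: multiply the prior by the likelihood, then renormalise."""
    posterior = [b * l for b, l in zip(belief, likelihood)]
    total = sum(posterior)
    return [p / total for p in posterior]

def estimate(belief):
    """Index of the most probable bin."""
    return max(range(len(belief)), key=belief.__getitem__)

n_bins = 50
belief = [1.0 / n_bins] * n_bins        # uniform prior over position bins
for z in (22, 24, 23, 23):              # hypothetical noisy bin observations
    belief = update(belief, gaussian_kernel(z, n_bins, sigma=2.0))
best_bin = estimate(belief)
```

Swapping `gaussian_kernel` for `square_kernel` in the loop gives the box-shaped variant; a real 3D version would keep a grid over the task space instead of a single row of bins.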
Hybrid Feedback for Three-dimensional Convex Obstacle Avoidance (Extended version)
We propose a hybrid feedback control scheme for the autonomous robot navigation problem in three-dimensional environments with arbitrarily-shaped convex obstacles. The proposed hybrid control strategy, which consists in switching between the move-to-target mode and the obstacle-avoidance mode, guarantees global asymptotic stability of the target location in the obstacle-free workspace. We also provide a procedure for the implementation of the proposed hybrid controller in a priori unknown environments and validate its effectiveness through simulation results.
comment: 13 pages, 6 figures
DiffuSolve: Diffusion-based Solver for Non-convex Trajectory Optimization
Optimal trajectory design is computationally expensive for nonlinear and high-dimensional dynamical systems. The challenge arises from the non-convex nature of the optimization problem with multiple local optima, which usually requires a global search. Traditional numerical solvers struggle to find diverse solutions efficiently without appropriate initial guesses. In this paper, we introduce DiffuSolve, a general diffusion model-based solver for non-convex trajectory optimization. An expressive diffusion model is trained on pre-collected locally optimal solutions and efficiently samples initial guesses, which then warm-starts numerical solvers to fine-tune the feasibility and optimality. We also present DiffuSolve+, a novel constrained diffusion model with an additional loss in training that further reduces the problem constraint violations of diffusion samples. Experimental evaluations on three tasks verify the improved robustness, diversity, and a 2$\times$ to 11$\times$ increase in computational efficiency with our proposed method, which generalizes well to trajectory optimization problems of varying challenges.
Synergizing Quality-Diversity with Descriptor-Conditioned Reinforcement Learning
A hallmark of intelligence is the ability to exhibit a wide range of effective behaviors. Inspired by this principle, Quality-Diversity algorithms, such as MAP-Elites, are evolutionary methods designed to generate a set of diverse and high-fitness solutions. However, as a genetic algorithm, MAP-Elites relies on random mutations, which can become inefficient in high-dimensional search spaces, thus limiting its scalability to more complex domains, such as learning to control agents directly from high-dimensional inputs. To address this limitation, advanced methods like PGA-MAP-Elites and DCG-MAP-Elites have been developed, which combine actor-critic techniques from Reinforcement Learning with MAP-Elites, significantly enhancing the performance and efficiency of Quality-Diversity algorithms in complex, high-dimensional tasks. While these methods have successfully leveraged the trained critic to guide more effective mutations, the potential of the trained actor remains underutilized in improving both the quality and diversity of the evolved population. In this work, we introduce DCRL-MAP-Elites, an extension of DCG-MAP-Elites that utilizes the descriptor-conditioned actor as a generative model to produce diverse solutions, which are then injected into the offspring batch at each generation. Additionally, we present an empirical analysis of the fitness and descriptor reproducibility of the solutions discovered by each algorithm. Finally, we present a second empirical analysis shedding light on the synergies between the different variation operators and explaining the performance improvement from PGA-MAP-Elites to DCRL-MAP-Elites.
comment: arXiv admin note: text overlap with arXiv:2303.03832
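For readers unfamiliar with the baseline, the following is a minimal sketch of plain MAP-Elites with random mutation only. It contains none of the actor-critic or descriptor-conditioned components of PGA-, DCG-, or DCRL-MAP-Elites. The toy fitness and descriptor functions, the mutation scale, and the 90/10 split between mutation and random sampling are assumptions made for illustration.

```python
import random

def toy_fitness(solution):
    """Hypothetical fitness: higher is better, maximised at the origin."""
    return -sum(g * g for g in solution)

def toy_descriptor(solution):
    """Hypothetical behaviour descriptor in [0, 1], taken from the first gene."""
    return (solution[0] + 1.0) / 2.0

def cell_of(solution, n_cells):
    """Map a solution to its archive cell via the descriptor."""
    return min(n_cells - 1, int(toy_descriptor(solution) * n_cells))

def map_elites(n_cells, iterations, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, solution)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Variation: Gaussian mutation of a randomly chosen elite.
            _, parent = rng.choice(list(archive.values()))
            child = [g + rng.gauss(0.0, 0.1) for g in parent]
        else:
            # Occasional random solution keeps exploration going.
            child = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        child = [min(1.0, max(-1.0, g)) for g in child]
        cell, fit = cell_of(child, n_cells), toy_fitness(child)
        # Keep the child only if its cell is empty or it beats the current elite.
        if cell not in archive or fit > archive[cell][0]:
            archive[cell] = (fit, child)
    return archive

archive = map_elites(n_cells=10, iterations=2000)
```

The methods in the abstract would add further variation operators at the point where the mutated child is produced; the archive insertion rule stays the same.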
Imitation Learning from Observation through Optimal Transport
Imitation Learning from Observation (ILfO) is a setting in which a learner tries to imitate the behavior of an expert, using only observational data and without the direct guidance of demonstrated actions. In this paper, we re-examine optimal transport for IL, in which a reward is generated based on the Wasserstein distance between the state trajectories of the learner and expert. We show that existing methods can be simplified to generate a reward function without requiring learned models or adversarial learning. Unlike many other state-of-the-art methods, our approach can be integrated with any RL algorithm and is amenable to ILfO. We demonstrate the effectiveness of this simple approach on a variety of continuous control tasks and find that it surpasses the state of the art in the ILfO setting, achieving expert-level performance across a range of evaluation domains even when observing only a single expert trajectory without actions.
comment: Update to newest version, presented at RLC 2024
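As a rough illustration of a Wasserstein-based reward, the sketch below handles the simplest possible case: scalar states, equal-length trajectories, and uniform weights. In that case the optimal transport plan for the absolute-difference cost matches sorted learner states to sorted expert states. This is an assumed simplification, not the paper's reward construction; the function name and the per-step split of the cost are hypothetical.

```python
def ot_rewards_1d(learner_states, expert_states):
    """Per-step rewards whose sum is minus the 1-Wasserstein distance.

    Assumes scalar states and equal-length trajectories with uniform weights,
    so matching the k-th smallest learner state to the k-th smallest expert
    state is an optimal transport plan.
    """
    n = len(learner_states)
    if n != len(expert_states):
        raise ValueError("this sketch assumes equal-length trajectories")
    learner_order = sorted(range(n), key=learner_states.__getitem__)
    expert_sorted = sorted(expert_states)
    rewards = [0.0] * n
    for rank, step in enumerate(learner_order):
        # Each learner step pays the cost of its matched expert state.
        rewards[step] = -abs(learner_states[step] - expert_sorted[rank]) / n
    return rewards

# Hypothetical trajectories: the learner visits the same states as the expert
# in a different order, so the transport cost is zero.
same_support = ot_rewards_1d([0.0, 1.0, 2.0], [2.0, 0.0, 1.0])
far_apart = ot_rewards_1d([0.0, 0.0], [1.0, 3.0])
```

Because the rewards depend only on states, they can be handed to any RL algorithm as a drop-in reward signal; multi-dimensional states would need a general OT solver in place of the sorting step.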
Multiagent Systems
Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
Recent progress in generative models has stimulated significant innovations in many fields, such as image generation and chatbots. Despite their success, these models often produce sketchy and misleading solutions for complex multi-agent decision-making problems because they lack the trial-and-error experience and reasoning of humans. To address this limitation, we explore a paradigm that integrates a language-guided simulator into the multi-agent reinforcement learning pipeline to enhance the generated answer. The simulator is a world model that separately learns dynamics and reward, where the dynamics model comprises an image tokenizer as well as a causal transformer to generate interaction transitions autoregressively, and the reward model is a bidirectional transformer learned by maximizing the likelihood of trajectories in the expert demonstrations under language guidance. Given an image of the current state and the task description, we use the world model to train the joint policy and produce the image sequence as the answer by running the converged policy on the dynamics model. The empirical results demonstrate that this framework can improve the answers for multi-agent decision-making problems by showing superior performance on the training and unseen tasks of the StarCraft Multi-Agent Challenge benchmark. In particular, it can generate consistent interaction sequences and explainable reward functions at interaction states, opening the path for training generative models of the future.
comment: The Thirty-eighth Annual Conference on Neural Information Processing Systems
Agents' Room: Narrative Generation through Multi-step Collaboration ICLR 2025
Writing compelling fiction is a multifaceted process combining elements such as crafting a plot, developing interesting characters, and using evocative language. While large language models (LLMs) show promise for story writing, they currently rely heavily on intricate prompting, which limits their use. We propose Agents' Room, a generation framework inspired by narrative theory, that decomposes narrative writing into subtasks tackled by specialized agents. To illustrate our method, we introduce Tell Me A Story, a high-quality dataset of complex writing prompts and human-written stories, and a novel evaluation framework designed specifically for assessing long narratives. We show that Agents' Room generates stories that are preferred by expert evaluators over those produced by baseline systems by leveraging collaboration and specialization to decompose the complex story writing task into tractable components. We provide extensive analysis with automated and human-based metrics of the generated output.
comment: Under review as a conference paper at ICLR 2025
Learning Emergence of Interaction Patterns across Independent RL Agents in Multi-Agent Environments
Many real-world problems, such as controlling swarms of drones and urban traffic, naturally lend themselves to modeling as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods often suffer from scalability challenges, primarily due to the introduction of communication among agents. Consequently, a key challenge lies in adapting the success of deep learning in single-agent RL to the multi-agent setting. In response to this challenge, we propose an approach that fundamentally reimagines multi-agent environments. Unlike conventional methods that model each agent individually with separate networks, our approach, the Bottom Up Network (BUN), adopts a unique perspective. BUN treats the collective of multi-agents as a unified entity while employing a specialized weight initialization strategy that promotes independent learning. Furthermore, we dynamically establish connections among agents using gradient information, enabling coordination when necessary while maintaining these connections as limited and sparse to effectively manage the computational budget. Our extensive empirical evaluations across a variety of cooperative multi-agent scenarios, including tasks such as cooperative navigation and traffic control, consistently demonstrate BUN's superiority over baseline methods with substantially reduced computational costs.
comment: 13 pages, 24 figures
Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration
With expansive state-action spaces, efficient multi-agent exploration remains a longstanding challenge in reinforcement learning. Although pursuing novelty, diversity, or uncertainty attracts increasing attention, redundant efforts brought by exploration without proper guidance choices pose a practical issue for the community. This paper introduces a systematic approach, termed LEMAE, choosing to channel informative task-relevant guidance from a knowledgeable Large Language Model (LLM) for Efficient Multi-Agent Exploration. Specifically, we ground linguistic knowledge from the LLM into symbolic key states, which are critical for task fulfillment, in a discriminative manner at low LLM inference costs. To unleash the power of key states, we design Subspace-based Hindsight Intrinsic Reward (SHIR) to guide agents toward key states by increasing reward density. Additionally, we build the Key State Memory Tree (KSMT) to track transitions between key states in a specific task for organized exploration. Benefiting from diminishing redundant explorations, LEMAE outperforms existing SOTA approaches on the challenging benchmarks (e.g., SMAC and MPE) by a large margin, achieving a 10x acceleration in certain scenarios.
SwarmCVT: Centroidal Voronoi Tessellation-Based Path Planning for Very-Large-Scale Robotics
Swarm robotics, or very large-scale robotics (VLSR), has many meaningful applications for complicated tasks. However, the complexity of motion control and energy costs stack up quickly as the number of robots increases. In addressing this problem, our previous studies have formulated various methods employing macroscopic and microscopic approaches. These methods enable microscopic robots to adhere to a reference Gaussian mixture model (GMM) distribution observed at the macroscopic scale. Consequently, optimizing at the macroscopic level yields an optimal overall outcome. However, all these methods require systematic and global generation of Gaussian components (GCs) within obstacle-free areas to construct the GMM trajectories. This work utilizes centroidal Voronoi tessellation to generate GCs methodically. Consequently, it demonstrates performance improvement while also ensuring consistency and reliability.
comment: Submitted to American Control Conference (ACC) 2025
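A centroidal Voronoi tessellation can be approximated with Lloyd's algorithm on sampled points. The sketch below does this for the free space of a unit square with one rectangular obstacle. It is a generic illustration, not the SwarmCVT method: the obstacle, the sample and iteration counts, and the idea of reading the resulting centres as candidate Gaussian-component means are assumptions.

```python
import random

def in_obstacle(x, y):
    """Hypothetical axis-aligned rectangular obstacle in the unit square."""
    return 0.4 <= x <= 0.6 and 0.0 <= y <= 0.5

def sample_free_space(n, rng):
    """Rejection-sample n points from the obstacle-free part of the unit square."""
    points = []
    while len(points) < n:
        x, y = rng.random(), rng.random()
        if not in_obstacle(x, y):
            points.append((x, y))
    return points

def lloyd_cvt(n_generators, n_samples=2000, iterations=20, seed=0):
    """Approximate CVT generators of the free space with Lloyd's algorithm."""
    rng = random.Random(seed)
    samples = sample_free_space(n_samples, rng)
    generators = rng.sample(samples, n_generators)
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0] for _ in generators]
        for x, y in samples:
            # Assign each sample to its nearest generator (its Voronoi cell).
            nearest = min(
                range(n_generators),
                key=lambda j: (x - generators[j][0]) ** 2 + (y - generators[j][1]) ** 2,
            )
            sums[nearest][0] += x
            sums[nearest][1] += y
            sums[nearest][2] += 1
        # Move each generator to the centroid of its cell; keep it if the cell is empty.
        generators = [
            (sx / count, sy / count) if count else g
            for (sx, sy, count), g in zip(sums, generators)
        ]
    return generators

centres = lloyd_cvt(n_generators=8)
```

Note that a centroid of samples from a non-convex free region can fall inside the obstacle, so a practical planner would need an extra check or projection step that this sketch omits.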
Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems
Recent advancements in large language model (LLM)-powered agents have shown that collective intelligence can significantly outperform individual capabilities, largely attributed to the meticulously designed inter-agent communication topologies. Though impressive in performance, existing multi-agent pipelines inherently introduce substantial token overhead, as well as increased economic costs, which pose challenges for their large-scale deployments. In response to this challenge, we propose an economical, simple, and robust multi-agent communication framework, termed AgentPrune, which can seamlessly integrate into mainstream multi-agent systems and prune redundant or even malicious communication messages. Technically, AgentPrune is the first to identify and formally define the communication redundancy issue present in current LLM-based multi-agent pipelines, and efficiently performs one-shot pruning on the spatial-temporal message-passing graph, yielding a token-economic and high-performing communication topology. Extensive experiments across six benchmarks demonstrate that AgentPrune (I) achieves results comparable to state-of-the-art topologies at a cost of merely $5.6 compared to their $43.7, (II) integrates seamlessly into existing multi-agent frameworks with a 28.1% to 72.8% token reduction, and (III) successfully defends against two types of agent-based adversarial attacks with a 3.5% to 10.8% performance boost.
Agent-Oriented Planning in Multi-Agent Systems
Through the collaboration of multiple agents possessing diverse expertise and tools, multi-agent systems achieve impressive progress in solving real-world problems. Given the user queries, the meta-agents, serving as the brain within these systems, are required to decompose the queries into multiple sub-tasks that can be allocated to suitable agents capable of solving them, so-called agent-oriented planning. In this study, we identify three critical design principles of agent-oriented planning, including solvability, completeness, and non-redundancy, to ensure that each sub-task is effectively resolved, leading to satisfactory responses to the original queries. These principles further inspire us to propose a novel framework for agent-oriented planning in multi-agent systems, leveraging a fast task decomposition and allocation process followed by an effective and efficient evaluation via a reward model. During the planning process, the meta-agent is also responsible for evaluating the performance of the expert agents, making timely adjustments to the sub-tasks and scheduling as necessary. Besides, we integrate a feedback loop into the proposed framework to further enhance the effectiveness and robustness of such a problem-solving process. Extensive experiments demonstrate the advancement of the proposed framework in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems.
AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline, such as optimal model search and hyperparameter tuning. Existing AutoML systems often require technical expertise to set up complex tools, which is in general time-consuming and requires a large amount of human effort. Therefore, recent works have started exploiting large language models (LLMs) to lessen such burden and increase the usability of AutoML frameworks via a natural language interface, allowing non-expert users to build their data-driven solutions. These methods, however, are usually designed only for a particular process in the AI development pipeline and do not efficiently use the inherent capacity of the LLMs. This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML, i.e., from data retrieval to model deployment. AutoML-Agent takes the user's task descriptions, facilitates collaboration between specialized LLM agents, and delivers deployment-ready models. Unlike existing work, instead of devising a single plan, we introduce a retrieval-augmented planning strategy to enhance exploration to search for more optimal plans. We also decompose each plan into sub-tasks (e.g., data preprocessing and neural network design), each of which is solved by a specialized agent that we build via prompting and that executes in parallel, making the search process more efficient. Moreover, we propose a multi-stage verification to verify executed results and guide the code generation LLM in implementing successful solutions. Extensive experiments on seven downstream tasks using fourteen datasets show that AutoML-Agent achieves a higher success rate in automating the full AutoML process, yielding systems with good performance throughout the diverse domains.
comment: 47 pages, 5 figures
Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure
One-to-one tutoring is one of the most efficient methods of teaching. Following the rise in popularity of Large Language Models (LLMs), there have been efforts to use them to create conversational tutoring systems, which can make the benefits of one-to-one tutoring accessible to everyone. However, current LLMs are primarily trained to be helpful assistants and thus lack crucial pedagogical skills. For example, they often quickly reveal the solution to the student and fail to plan for a richer multi-turn pedagogical interaction. To use LLMs in pedagogical scenarios, they need to be steered towards using effective teaching strategies: a problem we introduce as Pedagogical Steering and believe to be crucial for the efficient use of LLMs as tutors. We address this problem by formalizing a concept of tutoring strategy, and introducing StratL, an algorithm to model a strategy and use prompting to steer the LLM to follow this strategy. As a case study, we create a prototype tutor for high school math following Productive Failure (PF), an advanced and effective learning design. To validate our approach in a real-world setting, we run a field study with 17 high school students in Singapore. We quantitatively show that StratL succeeds in steering the LLM to follow a Productive Failure tutoring strategy. We also thoroughly investigate the existence of spillover effects on desirable properties of the LLM, like its ability to generate human-like answers. Based on these results, we highlight the challenges in Pedagogical Steering and suggest opportunities for further improvements. We further encourage follow-up research by releasing a dataset of Productive Failure problems and the code of our prototype and algorithm.
comment: 18 pages, 9 figures, 6 tables
CMP: Cooperative Motion Prediction with Multi-Agent Communication
The confluence of the advancement of Autonomous Vehicles (AVs) and the maturity of Vehicle-to-Everything (V2X) communication has enabled the capability of cooperative connected and automated vehicles (CAVs). Building on top of cooperative perception, this paper explores the feasibility and effectiveness of cooperative motion prediction. Our method, CMP, takes LiDAR signals as model input to enhance tracking and prediction capabilities. Unlike previous work that focuses separately on either cooperative perception or motion prediction, our framework, to the best of our knowledge, is the first to address the unified problem where CAVs share information in both perception and prediction modules. Incorporated into our design is the unique capability to tolerate realistic V2X bandwidth limitations and transmission delays, while dealing with bulky perception representations. We also propose a prediction aggregation module, which unifies the predictions obtained by different CAVs and generates the final prediction. Through extensive experiments and ablation studies on the OPV2V and V2V4Real datasets, we demonstrate the effectiveness of our method in cooperative perception, tracking, and motion prediction. In particular, CMP reduces the average prediction error by 16.4% with fewer missing detections compared with the no cooperation setting and by 12.3% compared with the strongest baseline. Our work marks a significant step forward in the cooperative capabilities of CAVs, showcasing enhanced performance in complex scenarios. The code can be found on the project website: https://cmp-cooperative-prediction.github.io/.
comment: Project website: https://cmp-cooperative-prediction.github.io/
Mean Field Correlated Imitation Learning
We investigate multi-agent imitation learning (IL) within the framework of mean field games (MFGs), considering the presence of time-varying correlated signals. Existing MFG IL algorithms assume demonstrations are sampled from Mean Field Nash Equilibria (MFNE), limiting their adaptability to real-world scenarios. For example, in the traffic network equilibrium influenced by public routing recommendations, recommendations introduce time-varying correlated signals into the game, not captured by MFNE and other existing correlated equilibrium concepts. To address this gap, we propose Adaptive Mean Field Correlated Equilibrium (AMFCE), a general equilibrium incorporating time-varying correlated signals. We establish the existence of AMFCE under mild conditions and prove that MFNE is a subclass of AMFCE. We further propose Correlated Mean Field Imitation Learning (CMFIL), a novel IL framework designed to recover the AMFCE, accompanied by a theoretical guarantee on the quality of the recovered policy. Experimental results, including a real-world traffic flow prediction problem, demonstrate the superiority of CMFIL over state-of-the-art IL baselines, highlighting the potential of CMFIL in understanding large population behavior under correlated signals.
comment: 17 pages
Securing Equal Share: A Principled Approach for Learning Multiplayer Symmetric Games
This paper examines multiplayer symmetric constant-sum games with more than two players in a competitive setting, including examples like Mahjong, Poker, and various board and video games. In contrast to two-player zero-sum games, equilibria in multiplayer games are neither unique nor non-exploitable, failing to provide meaningful guarantees when competing against opponents who play different equilibria or non-equilibrium strategies. This gives rise to a series of long-lasting fundamental questions in multiplayer games regarding suitable objectives, solution concepts, and principled algorithms. This paper takes an initial step towards addressing these challenges by focusing on the natural objective of equal share -- securing an expected payoff of C/n in an n-player symmetric game with a total payoff of C. We rigorously identify the theoretical conditions under which achieving an equal share is tractable and design a series of efficient algorithms, inspired by no-regret learning, that provably attain approximate equal share across various settings. Furthermore, we provide complementary lower bounds that justify the sharpness of our theoretical results. Our experimental results highlight worst-case scenarios where meta-algorithms from prior state-of-the-art systems for multiplayer games fail to secure an equal share, while our algorithm succeeds, demonstrating the effectiveness of our approach.
Understanding the Impact of Coalitions between EV Charging Stations
The rapid growth of electric vehicles (EVs) is driving the expansion of charging infrastructure globally. As charging stations become ubiquitous, their substantial electricity consumption can influence grid operation and electricity pricing. Naturally, some groups of charging stations, which could be jointly operated by a company, may coordinate to decide their charging profile. While coordination among all charging stations is ideal, it is unclear if coordination of some charging stations is better than no coordination. In this paper, we analyze this intermediate regime between no and full coordination of charging stations. We model EV charging as a non-cooperative aggregative game, where each station's cost is determined by both monetary payments tied to reactive electricity prices on the grid and its sensitivity to deviations from a desired charging profile. We consider a solution concept that we call $\mathcal{C}$-Nash equilibrium, which is tied to a coalition $\mathcal{C}$ of charging stations coordinating to reduce their costs. We provide sufficient conditions, in terms of the demand and sensitivity of charging stations, to determine when independent (that is, uncoordinated) operation of charging stations could result in lower overall costs to all charging stations, to the coalition, and to charging stations outside the coalition. Somewhat counter to common intuition, we show numerical instances where allowing charging stations to operate independently is better than coordinating a subset of stations as a coalition. Jointly, these results provide operators of charging stations insights into how to coordinate their charging behavior, and open several research directions.
comment: 20 pages, 5 figures
TrustAgent: Towards Safe and Trustworthy LLM-based Agents EMNLP 2024
The rise of LLM-based agents shows great potential to revolutionize task planning, capturing significant attention. Given that these agents will be integrated into high-stake domains, ensuring their reliability and safety is crucial. This paper presents an Agent-Constitution-based agent framework, TrustAgent, with a particular focus on improving the LLM-based agent safety. The proposed framework ensures strict adherence to the Agent Constitution through three strategic components: pre-planning strategy which injects safety knowledge to the model before plan generation, in-planning strategy which enhances safety during plan generation, and post-planning strategy which ensures safety by post-planning inspection. Our experimental results demonstrate that the proposed framework can effectively enhance an LLM agent's safety across multiple domains by identifying and mitigating potential dangers during the planning. Further analysis reveals that the framework not only improves safety but also enhances the helpfulness of the agent. Additionally, we highlight the importance of the LLM reasoning ability in adhering to the Constitution. This paper sheds light on how to ensure the safe integration of LLM-based agents into human-centric environments. Data and code are available at https://github.com/agiresearch/TrustAgent.
comment: In EMNLP 2024
Second-Order Algorithms for Finding Local Nash Equilibria in Zero-Sum Games
Zero-sum games arise in a wide variety of problems, including robust optimization and adversarial learning. However, algorithms deployed for finding a local Nash equilibrium in these games often converge to non-Nash stationary points. This highlights a key challenge: for any algorithm, the stability properties of its underlying dynamical system can cause non-Nash points to be potential attractors. To overcome this challenge, algorithms must account for subtleties involving the curvatures of players' costs. To this end, we leverage dynamical system theory and develop a second-order algorithm for finding a local Nash equilibrium in the smooth, possibly nonconvex-nonconcave, zero-sum game setting. First, we prove that this novel method guarantees convergence to only local Nash equilibria with a local linear convergence rate. We then interpret a version of this method as a modified Gauss-Newton algorithm with local superlinear convergence to the neighborhood of a point that satisfies first-order local Nash equilibrium conditions. In comparison, current related state-of-the-art methods do not offer convergence rate guarantees. Furthermore, we show that this approach naturally generalizes to settings with convex and potentially coupled constraints while retaining earlier guarantees of convergence to only local (generalized) Nash equilibria.
Robotics
Windowed MAPF with Completeness Guarantees
Traditional multi-agent path finding (MAPF) methods try to compute entire start-goal paths which are collision free. However, computing an entire path can take too long for MAPF systems where agents need to replan fast. Methods that address this typically employ a "windowed" approach and only try to find collision free paths for a small windowed timestep horizon. This adaptation comes at the cost of incompleteness; all current windowed approaches can become stuck in deadlock or livelock. Our main contribution is to introduce our framework, WinC-MAPF, for Windowed MAPF that enables completeness. Our framework uses heuristic update insights from single-agent real-time heuristic search algorithms as well as agent independence ideas from MAPF algorithms. We also develop Single-Step CBS (SS-CBS), an instantiation of this framework using a novel modification to CBS. We show how SS-CBS, which only plans a single step and updates heuristics, can effectively solve tough scenarios where existing windowed approaches fail.
Open Human-Robot Collaboration using Decentralized Inverse Reinforcement Learning
Human-robot collaboration (HRC), where humans and robots cooperate towards shared goals, has attracted growing interest and seen significant advancements over the past decade. While previous research has addressed various challenges, several key issues remain unresolved. Many domains within HRC involve activities that do not necessarily require human presence throughout the entire task. Existing literature typically models HRC as a closed system, where all agents are present for the entire duration of the task. In contrast, an open model offers flexibility by allowing an agent to enter and exit the collaboration as needed, enabling them to concurrently manage other tasks. In this paper, we introduce a novel multiagent framework called oDec-MDP, designed specifically to model open HRC scenarios where agents can join or leave tasks flexibly during execution. We generalize a recent multiagent inverse reinforcement learning method, Dec-AIRL, to learn from open systems modeled using the oDec-MDP. Our method is validated through experiments conducted in both a simplified toy firefighting domain and a realistic dyadic human-robot collaborative assembly. Results show that our framework and learning method improve upon their closed-system counterparts.
Multi-Robot Trajectory Generation via Consensus ADMM: Convex vs. Non-Convex
C-ADMM is a well-known distributed optimization framework due to its guaranteed convergence in convex optimization problems. Recently, C-ADMM has been studied in robotics applications such as multi-vehicle target tracking and collaborative manipulation tasks. However, few works have investigated the performance of C-ADMM applied to non-convex problems in robotics applications due to a lack of theoretical guarantees. For this project, we aim to quantitatively explore and examine the convergence behavior of non-convex C-ADMM through the scope of distributed multi-robot trajectory planning. We propose a convex trajectory planning problem by leveraging C-ADMM and Buffered Voronoi Cells (BVCs) to get around the non-convex collision avoidance constraint and compare this convex C-ADMM algorithm to a non-convex C-ADMM baseline with non-convex collision avoidance constraints. We show that the convex C-ADMM algorithm requires 1000 fewer iterations to achieve convergence in a multi-robot waypoint navigation scenario. We also confirm that the non-convex C-ADMM baseline leads to sub-optimal solutions and violation of safety constraints in trajectory generation.
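The convexification via Buffered Voronoi Cells mentioned above can be illustrated with a minimal sketch. It assumes the commonly used BVC definition, in which each pairwise collision constraint is replaced by a half-plane offset from the perpendicular bisector by a safety radius; the abstract does not give the paper's exact constraints, and the function names `bvc_halfspace` and `in_cell` are hypothetical.

```python
import math

def bvc_halfspace(p_i, p_j, safety_radius):
    """Return (a, b) such that robot i's buffered cell w.r.t. robot j is {p : a . p <= b}.

    Assumed (standard) BVC form: the perpendicular bisector of p_i and p_j,
    shifted toward robot i by `safety_radius`. The constraint is linear in p,
    so the feasible set is convex.
    """
    dx, dy = p_j[0] - p_i[0], p_j[1] - p_i[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        raise ValueError("robots must be at distinct positions")
    a = (dx / dist, dy / dist)                      # unit normal pointing from i to j
    mid = ((p_i[0] + p_j[0]) / 2.0, (p_i[1] + p_j[1]) / 2.0)
    b = a[0] * mid[0] + a[1] * mid[1] - safety_radius
    return a, b

def in_cell(p, halfspace):
    """Check whether point p satisfies the half-plane constraint a . p <= b."""
    a, b = halfspace
    return a[0] * p[0] + a[1] * p[1] <= b

# Two robots 2 m apart with a 0.5 m safety radius.
cell_i = bvc_halfspace((0.0, 0.0), (2.0, 0.0), 0.5)
cell_j = bvc_halfspace((2.0, 0.0), (0.0, 0.0), 0.5)
```

If each robot keeps its planned positions inside its own cell, any two such positions are at least twice the safety radius apart along the line joining the robots, which is why the non-convex pairwise distance constraint can be dropped in this formulation.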
$\mathcal{D(R,O)}$ Grasp: A Unified Representation of Robot and Object Interaction for Cross-Embodiment Dexterous Grasping
Dexterous grasping is a fundamental yet challenging skill in robotic manipulation, requiring precise interaction between robotic hands and objects. In this paper, we present $\mathcal{D(R,O)}$ Grasp, a novel framework that models the interaction between the robotic hand in its grasping pose and the object, enabling broad generalization across various robot hands and object geometries. Our model takes the robot hand's description and object point cloud as inputs and efficiently predicts kinematically valid and stable grasps, demonstrating strong adaptability to diverse robot embodiments and object geometries. Extensive experiments conducted in both simulated and real-world environments validate the effectiveness of our approach, with significant improvements in success rate, grasp diversity, and inference speed across multiple robotic hands. Our method achieves an average success rate of 87.53% in simulation in less than one second, tested across three different dexterous robotic hands. In real-world experiments using the LeapHand, the method also demonstrates an average success rate of 89%. $\mathcal{D(R,O)}$ Grasp provides a robust solution for dexterous grasping in complex and varied environments. The code, appendix, and videos are available on our project website at https://nus-lins-lab.github.io/drograspweb/.
Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking
3D multi-object tracking plays a critical role in autonomous driving by enabling the real-time monitoring and prediction of multiple objects' movements. Traditional 3D tracking systems are typically constrained by predefined object categories, limiting their adaptability to novel, unseen objects in dynamic environments. To address this limitation, we introduce open-vocabulary 3D tracking, which extends the scope of 3D tracking to include objects beyond predefined categories. We formulate the problem of open-vocabulary 3D tracking and introduce dataset splits designed to represent various open-vocabulary scenarios. We propose a novel approach that integrates open-vocabulary capabilities into a 3D tracking framework, allowing for generalization to unseen object classes. Our method effectively reduces the performance gap between tracking known and novel objects through strategic adaptation. Experimental results demonstrate the robustness and adaptability of our method in diverse outdoor driving scenarios. To the best of our knowledge, this work is the first to address open-vocabulary 3D tracking, presenting a significant advancement for autonomous systems in real-world settings. Code, trained models, and dataset splits are available publicly.
comment: 7 pages, 4 figures, 3 tables
One-Shot Robust Imitation Learning for Long-Horizon Visuomotor Tasks from Unsegmented Demonstrations
In contrast to single-skill tasks, long-horizon tasks play a crucial role in our daily life, e.g., a pouring task requires a proper concatenation of reaching, grasping and pouring subtasks. As an efficient solution for transferring human skills to robots, imitation learning has achieved great progress over the last two decades. However, when learning long-horizon visuomotor skills, imitation learning often demands a large amount of semantically segmented demonstrations. Moreover, the performance of imitation learning could be susceptible to external perturbation and visual occlusion. In this paper, we exploit dynamical movement primitives and meta-learning to provide a new framework for imitation learning, called Meta-Imitation Learning with Adaptive Dynamical Primitives (MiLa). MiLa allows for learning unsegmented long-horizon demonstrations and adapting to unseen tasks with a single demonstration. MiLa can also resist external disturbances and visual occlusion during task execution. Real-world robotic experiments demonstrate the superiority of MiLa, irrespective of visual occlusion and random perturbations on robots.
comment: 15 pages, 6 figures
Entropy-Based Uncertainty Modeling for Trajectory Prediction in Autonomous Driving
In autonomous driving, accurate motion prediction is essential for safe and efficient motion planning. To ensure safety, planners must rely on reliable uncertainty information about the predicted future behavior of surrounding agents, yet this aspect has received limited attention. This paper addresses the so-far neglected problem of uncertainty modeling in trajectory prediction. We adopt a holistic approach that focuses on uncertainty quantification, decomposition, and the influence of model composition. Our method is based on a theoretically grounded information-theoretic approach to measure uncertainty, allowing us to decompose total uncertainty into its aleatoric and epistemic components. We conduct extensive experiments on the nuScenes dataset to assess how different model architectures and configurations affect uncertainty quantification and model robustness.
comment: 10 pages, 5 figures, submitted to International Conference on Learning Representations (2025)
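The decomposition of total uncertainty into aleatoric and epistemic parts can be sketched with the widely used entropy-based split for an ensemble of predictive distributions. This is only an illustration under stated assumptions: it uses discrete distributions over a few trajectory modes and an ensemble of models, whereas the abstract does not specify the paper's estimator; `entropy` and `decompose_uncertainty` are hypothetical names.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list of probabilities."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def decompose_uncertainty(member_probs):
    """Split uncertainty for an ensemble of discrete predictive distributions.

    member_probs: list of distributions (one per ensemble member) over the same K outcomes.
    total     = entropy of the ensemble-averaged distribution
    aleatoric = average entropy of the individual members
    epistemic = total - aleatoric (the mutual information between outcome and member)
    """
    m = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(p[i] for p in member_probs) / m for i in range(k)]
    total = entropy(mean)
    aleatoric = sum(entropy(p) for p in member_probs) / m
    return total, aleatoric, total - aleatoric

# Members agree but are individually unsure: all uncertainty is aleatoric.
agree = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])
# Members are individually certain but disagree: all uncertainty is epistemic.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```

In both toy cases the total uncertainty is ln 2; only its attribution to the two components changes.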
SGBA: Semantic Gaussian Mixture Model-Based LiDAR Bundle Adjustment
LiDAR bundle adjustment (BA) is an effective approach to reduce the drift in pose estimation from the front-end. Existing works on LiDAR BA usually rely on predefined geometric features for landmark representation. This reliance restricts generalizability, as the system will inevitably deteriorate in environments where these specific features are absent. To address this issue, we propose SGBA, a LiDAR BA scheme that models the environment as a semantic Gaussian mixture model (GMM) without predefined feature types. This approach encodes both geometric and semantic information, offering a comprehensive and general representation adaptable to various environments. Additionally, to limit computational complexity while ensuring generalizability, we propose an adaptive semantic selection framework that selects the most informative semantic clusters for optimization by evaluating the condition number of the cost function. Lastly, we introduce a probabilistic feature association scheme that considers the entire probability density of assignments, which can manage uncertainties in measurement and initial pose estimation. We have conducted various experiments and the results demonstrate that SGBA can achieve accurate and robust pose refinement even in challenging scenarios with low-quality initial pose estimation and limited geometric features. We plan to open-source the work for the benefit of the community at https://github.com/Ji1Xinyu/SGBA.
Computational Teaching for Driving via Multi-Task Imitation Learning
Learning motor skills for sports or performance driving is often done with professional instruction from expert human teachers, whose availability is limited. Our goal is to enable automated teaching via a learned model that interacts with the student similar to a human teacher. However, training such automated teaching systems is limited by the availability of high-quality annotated datasets of expert teacher and student interactions that are difficult to collect at scale. To address this data scarcity problem, we propose an approach for training a coaching system for complex motor tasks such as high performance driving via a Multi-Task Imitation Learning (MTIL) paradigm. MTIL allows our model to learn robust representations by utilizing self-supervised training signals from more readily available non-interactive datasets of humans performing the task of interest. We validate our approach with (1) a semi-synthetic dataset created from real human driving trajectories, (2) a professional track driving instruction dataset, (3) a track-racing driving simulator human-subject study, and (4) a system demonstration on an instrumented car at a race track. Our experiments show that the right set of auxiliary machine learning tasks improves performance in predicting teaching instructions. Moreover, in the human subjects study, students exposed to the instructions from our teaching system improve their ability to stay within track limits, and show favorable perception of the model's interaction with them, in terms of usefulness and satisfaction.
comment: 12 pages, 3 figures, 3 tables
Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning
Multimodal task specification is essential for enhanced robotic performance, where Cross-modality Alignment enables the robot to holistically understand complex task instructions. Directly annotating multimodal instructions for model training proves impractical, due to the sparsity of paired multimodal data. In this study, we demonstrate that by leveraging unimodal instructions abundant in real data, we can effectively teach robots to learn multimodal task specifications. First, we endow the robot with strong Cross-modality Alignment capabilities, by pretraining a robotic multimodal encoder using extensive out-of-domain data. Then, we employ two Collapse and Corrupt operations to further bridge the remaining modality gap in the learned multimodal representation. This approach projects different modalities of identical task goal as interchangeable representations, thus enabling accurate robotic operations within a well-aligned multimodal latent space. Evaluation across more than 130 tasks and 4000 evaluations on both simulated LIBERO benchmark and real robot platforms showcases the superior capabilities of our proposed framework, demonstrating significant advantage in overcoming data constraints in robotic learning. Website: zh1hao.wang/Robo_MUTUAL
comment: preprint
Closed-loop Long-horizon Robotic Planning via Equilibrium Sequence Modeling
In the endeavor to make autonomous robots take actions, task planning is a major challenge that requires translating high-level task descriptions into long-horizon action sequences. Despite recent advances in language model agents, they remain prone to planning errors and limited in their ability to plan ahead. To address these limitations in robotic planning, we advocate a self-refining scheme that iteratively refines a draft plan until an equilibrium is reached. Remarkably, this process can be optimized end-to-end from an analytical perspective without the need to curate additional verifiers or reward models, allowing us to train self-refining planners in a simple supervised learning fashion. Meanwhile, a nested equilibrium sequence modeling procedure is devised for efficient closed-loop planning that incorporates useful feedback from the environment (or an internal world model). Our method is evaluated on the VirtualHome-Env benchmark, showing advanced performance with better scaling for inference computation. Code is available at https://github.com/Singularity0104/equilibrium-planner.
WiFi-CSI Sensing and Bearing Estimation in Multi-Robot Systems: An Open-Source Simulation Framework
Development and testing of multi-robot systems employing wireless signal-based sensing requires access to suitable hardware, such as channel monitoring WiFi transceivers, which can pose significant limitations. The WiFi Sensor for Robotics (WSR) toolbox, introduced by Jadhav et al. in 2022, provides a novel solution by using WiFi Channel State Information (CSI) to compute relative bearing between robots. The toolbox leverages the amplitude and phase of WiFi signals and creates virtual antenna arrays by exploiting the motion of mobile robots, eliminating the need for physical antenna arrays. However, the WSR toolbox's reliance on WiFi transceiver hardware that is becoming obsolete has limited its operability and accessibility, hindering broader application and development of relevant tools. We present an open-source simulation framework that replicates the WSR toolbox's capabilities using Gazebo and Matlab. By simulating WiFi-CSI data collection, our framework emulates the behavior of mobile robots equipped with the WSR toolbox, enabling precise bearing estimation without physical hardware. We validate the framework through experiments with both simulated and real Turtlebot3 robots, showing a close match between the obtained CSI data and the resulting bearing estimates. This work provides a virtual environment for developing and testing WiFi-CSI-based multi-robot localization without relying on physical hardware. All code and experimental setup information are publicly available at https://github.com/BrendanxP/CSI-Simulation-Framework
comment: 6+1 pages (text + references), 6 figures
Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy
Generalizing language-conditioned robotic policies to new tasks remains a significant challenge, hampered by the lack of suitable simulation benchmarks. In this paper, we address this gap by introducing GemBench, a novel benchmark to assess generalization capabilities of vision-language robotic manipulation policies. GemBench incorporates seven general action primitives and four levels of generalization, spanning novel placements, rigid and articulated objects, and complex long-horizon tasks. We evaluate state-of-the-art approaches on GemBench and also introduce a new method. Our approach 3D-LOTUS leverages rich 3D information for action prediction conditioned on language. While 3D-LOTUS excels in both efficiency and performance on seen tasks, it struggles with novel tasks. To address this, we present 3D-LOTUS++, a framework that integrates 3D-LOTUS's motion planning capabilities with the task planning capabilities of LLMs and the object grounding accuracy of VLMs. 3D-LOTUS++ achieves state-of-the-art performance on novel tasks of GemBench, setting a new standard for generalization in robotic manipulation. The benchmark, codes and trained models are available at https://www.di.ens.fr/willow/research/gembench/.
ReFeree: Radar-Based Lightweight and Robust Localization using Feature and Free space
Place recognition plays an important role in achieving robust long-term autonomy. Real-world robots face a wide range of weather conditions (e.g., overcast, heavy rain, and snow), and most sensors (e.g., camera, LiDAR), which operate at visible or near-visible wavelengths, are sensitive to adverse weather conditions, making reliable localization difficult. In contrast, radar is gaining traction because its longer wavelengths are less affected by environmental changes and weather. In this work, we propose a lightweight and robust radar-based place recognition method. We achieve rotational invariance and a lightweight descriptor by selecting a one-dimensional ring-shaped description, and robustness by mitigating the impact of false detections using the opposite noise characteristics of free space and features. In addition, the initial heading can be estimated, which can assist in building a SLAM pipeline that combines odometry and registration while taking onboard computing into account. The proposed method was rigorously validated across various scenarios (i.e., single session, multi-session, and different weather conditions). In particular, we show that our descriptor achieves reliable place recognition performance in extreme environments that lack structural information, such as those in the OORD dataset.
comment: 8 pages, 8 figures, accepted to RA-L
Finetuning Pre-trained Model with Limited Data for LiDAR-based 3D Object Detection by Bridging Domain Gaps IROS
LiDAR-based 3D object detectors have been largely utilized in various applications, including autonomous vehicles or mobile robots. However, LiDAR-based detectors often fail to adapt well to target domains with different sensor configurations (e.g., types of sensors, spatial resolution, or FOVs) and location shifts. Collecting and annotating datasets in a new setup is commonly required to reduce such gaps, but it is often expensive and time-consuming. Recent studies suggest that pre-trained backbones can be learned in a self-supervised manner with large-scale unlabeled LiDAR frames. However, despite their expressive representations, they remain challenging to generalize well without substantial amounts of data from the target domain. Thus, we propose a novel method, called Domain Adaptive Distill-Tuning (DADT), to adapt a pre-trained model with limited target data (approximately 100 LiDAR frames), retaining its representation power and preventing it from overfitting. Specifically, we use regularizers to align object-level and context-level representations between the pre-trained and finetuned models in a teacher-student architecture. Our experiments with driving benchmarks, i.e., Waymo Open dataset and KITTI, confirm that our method effectively finetunes a pre-trained model, achieving significant gains in accuracy.
comment: Accepted in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024
Robust Imitation Learning for Mobile Manipulator Focusing on Task-Related Viewpoints and Regions
We study how to generalize the visuomotor policy of a mobile manipulator from the perspective of visual observations. The mobile manipulator is prone to occlusion owing to its own body when only a single viewpoint is employed and to a significant domain shift when deployed in diverse situations. However, to the best of the authors' knowledge, no study has been able to solve occlusion and domain shift simultaneously and propose a robust policy. In this paper, we propose a robust imitation learning method for mobile manipulators that focuses on task-related viewpoints and their spatial regions when observing multiple viewpoints. The multi-viewpoint policy includes an attention mechanism, learned with an augmented dataset, that selects optimal viewpoints and yields visual embeddings robust to occlusion and domain shift. Comparison of our results for different tasks and environments with those of previous studies revealed that our proposed method improves the success rate by up to 29.3 points. We also conduct ablation studies using our proposed method. Learning task-related viewpoints from the multi-viewpoint dataset increases robustness to occlusion compared with using a uniquely defined viewpoint. Focusing on task-related regions contributes to up to a 33.3-point improvement in the success rate against domain shift.
CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction
Real-life robot navigation involves more than just reaching a destination; it requires optimizing movements while addressing scenario-specific goals. An intuitive way for humans to express these goals is through abstract cues like verbal commands or rough sketches. Such human guidance may lack details or be noisy. Nonetheless, we expect robots to navigate as intended. For robots to interpret and execute these abstract instructions in line with human expectations, they must share a common understanding of basic navigation concepts with humans. To this end, we introduce CANVAS, a novel framework that combines visual and linguistic instructions for commonsense-aware navigation. Its success is driven by imitation learning, enabling the robot to learn from human navigation behavior. We present COMMAND, a comprehensive dataset with human-annotated navigation results, spanning over 48 hours and 219 km, designed to train commonsense-aware navigation systems in simulated environments. Our experiments show that CANVAS outperforms the strong rule-based system ROS NavStack across all environments, demonstrating superior performance with noisy instructions. Notably, in the orchard environment, where ROS NavStack records a 0% total success rate, CANVAS achieves a total success rate of 67%. CANVAS also closely aligns with human demonstrations and commonsense constraints, even in unseen environments. Furthermore, real-world deployment of CANVAS showcases impressive Sim2Real transfer with a total success rate of 69%, highlighting the potential of learning from human demonstrations in simulated environments for real-world applications.
comment: project page https://worv-ai.github.io/canvas
High and Low Resolution Tradeoffs in Roadside Multimodal Sensing
Designing roadside sensing for intelligent transportation applications requires balancing cost and performance, especially when choosing between high- and low-resolution sensors. The tradeoff is challenging due to sensor heterogeneity, where different sensors produce unique data modalities due to varying physical principles. High-resolution LiDAR offers detailed point clouds, while 4D millimeter-wave radar, despite providing sparser data, delivers velocity information useful for distinguishing objects based on movement patterns. To assess whether reductions in spatial resolution can be compensated by the informational richness of sensors, particularly in recognizing both vehicles and vulnerable road users (VRUs), we propose Residual Fusion Net (ResFusionNet) to fuse multimodal data for 3D object detection. This enables a quantifiable tradeoff between spatial resolution and information richness across different modalities. Furthermore, we introduce a sensor placement algorithm utilizing probabilistic modeling to manage uncertainties in sensor visibility influenced by environmental or human-related factors. Through simulation-assisted ex-ante evaluation on a real-world testbed, our findings show marked marginal gains in detecting VRUs (an average of 16.7% for pedestrians and 11% for cyclists) when merging velocity-encoded radar with LiDAR, compared to LiDAR-only configurations. Additionally, experimental results from 300 runs reveal a maximum loss of 11.5% and an average of 5.25% in sensor coverage due to uncertainty factors. These findings underscore the potential of using low spatial resolution but information-rich sensors to enhance detection capabilities for vulnerable road users while highlighting the necessity of thoroughly evaluating sensor modality heterogeneity, traffic participant diversity, and operational uncertainties when making sensor tradeoffs in practical applications.
comment: 7 pages, 8 figures
Towards Efficient Motion Planning for UAVs: Lazy A* Search with Motion Primitives
Search-based motion planning algorithms have been widely utilized for unmanned aerial vehicles (UAVs). However, deploying these algorithms on real UAVs faces challenges due to limited onboard computational resources. The algorithms struggle to find solutions in high-dimensional search spaces and require considerable time to ensure that the trajectories are dynamically feasible. This paper incorporates the lazy search concept into search-based planning algorithms to address the critical issue of real-time planning for collision-free and dynamically feasible trajectories on UAVs. We demonstrate that the lazy search motion planning algorithm can efficiently find optimal trajectories and significantly improve computational efficiency.
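The lazy search idea mentioned above can be sketched on a toy problem. The following is a minimal illustration of generic lazy A*, not the paper's planner: successors are queued without validation, and the expensive edge check (collision or dynamic feasibility on a UAV) runs only when a candidate is popped. The grid, the `WALL` set, and all function names are hypothetical stand-ins for motion primitives and feasibility checks.

```python
import heapq
import itertools


def lazy_a_star(start, goal, neighbors, heuristic, edge_is_valid):
    """Lazy A*: edges are assumed valid when queued and checked only when popped."""
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    open_heap = [(heuristic(start, goal), next(tie), 0.0, start, None)]
    parent_of = {}  # expanded node -> verified parent
    checks = 0      # number of expensive edge evaluations performed
    while open_heap:
        _, _, g, node, parent = heapq.heappop(open_heap)
        if node in parent_of:
            continue  # already expanded through a cheaper valid edge
        if parent is not None:
            checks += 1
            if not edge_is_valid(parent, node):
                continue  # drop this candidate; others may still be queued
        parent_of[node] = parent
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent_of[node]
            return path[::-1], g, checks
        for nxt, cost in neighbors(node):
            if nxt not in parent_of:
                g_new = g + cost
                f_new = g_new + heuristic(nxt, goal)
                heapq.heappush(open_heap, (f_new, next(tie), g_new, nxt, node))
    return None, float("inf"), checks


# Hypothetical 5x5 grid with a wall that leaves a gap at (1, 4).
SIZE = 5
WALL = {(1, 0), (1, 1), (1, 2), (1, 3)}


def neighbors(cell):
    """Generate in-bounds 4-connected successors WITHOUT checking validity."""
    x, y = cell
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < SIZE and 0 <= ny < SIZE:
            yield (nx, ny), 1.0


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def edge_is_valid(a, b):
    """Stand-in for an expensive collision / feasibility check."""
    return b not in WALL


path, cost, checks = lazy_a_star((0, 0), (2, 0), neighbors, manhattan, edge_is_valid)
print(path, cost, checks)
```

Because validation is deferred, edges that are queued but never popped are never checked, which is where the computational saving comes from when each check is costly.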
Effective Tuning Strategies for Generalist Robot Manipulation Policies
Generalist robot manipulation policies (GMPs) have the potential to generalize across a wide range of tasks, devices, and environments. However, existing policies continue to struggle with out-of-distribution scenarios due to the inherent difficulty of collecting sufficient action data to cover extensively diverse domains. While fine-tuning offers a practical way to quickly adapt GMPs to novel domains and tasks with limited samples, we observe that the performance of the resulting GMPs differs significantly with respect to the design choices of fine-tuning strategies. In this work, we first conduct an in-depth empirical study to investigate the effect of key factors in GMPs fine-tuning strategies, covering the action space, policy head, supervision signal and the choice of tunable parameters, where 2,500 rollouts are evaluated for a single configuration. We systematically discuss and summarize our findings and identify the key design choices, which we believe give a practical guideline for GMPs fine-tuning. We observe that in a low-data regime, with carefully chosen fine-tuning strategies, GMPs significantly outperform the state-of-the-art imitation learning algorithms. The results presented in this work establish a new baseline for future studies on fine-tuned GMPs, and provide a significant addition to the GMPs toolbox for the community.
StraightTrack: Towards Mixed Reality Navigation System for Percutaneous K-wire Insertion
In percutaneous pelvic trauma surgery, accurate placement of Kirschner wires (K-wires) is crucial to ensure effective fracture fixation and avoid complications due to breaching the cortical bone along an unsuitable trajectory. Surgical navigation via mixed reality (MR) can help achieve precise wire placement in a low-profile form factor. Current approaches in this domain are as yet unsuitable for real-world deployment because they fall short of guaranteeing accurate visual feedback due to uncontrolled bending of the wire. To ensure accurate feedback, we introduce StraightTrack, an MR navigation system designed for percutaneous wire placement in complex anatomy. StraightTrack features a marker body equipped with a rigid access cannula that mitigates wire bending due to interactions with soft tissue and a covered bony surface. Integrated with an Optical See-Through Head-Mounted Display (OST HMD) capable of tracking the cannula body, StraightTrack offers real-time 3D visualization and guidance without external trackers, which are prone to losing line-of-sight. In phantom experiments with two experienced orthopedic surgeons, StraightTrack improves wire placement accuracy, achieving the ideal trajectory within $5.26 \pm 2.29$ mm and $2.88 \pm 1.49$ degrees, compared to over 12.08 mm and 4.07 degrees for comparable methods. As MR navigation systems continue to mature, StraightTrack realizes their potential for internal fracture fixation and other percutaneous orthopedic procedures.
FeelAnyForce: Estimating Contact Force Feedback from Tactile Sensation for Vision-Based Tactile Sensors
In this paper, we tackle the problem of estimating 3D contact forces using vision-based tactile sensors. In particular, our goal is to estimate contact forces over a large range (up to 15 N) on any objects while generalizing across different vision-based tactile sensors. Thus, we collected a dataset of over 200K indentations using a robotic arm that pressed various indenters onto a GelSight Mini sensor mounted on a force sensor and then used the data to train a multi-head transformer for force regression. Strong generalization is achieved via accurate data collection and multi-objective optimization that leverages depth contact images. Despite being trained only on primitive shapes and textures, the regressor achieves a mean absolute error of 4\% on a dataset of unseen real-world objects. We further evaluate our approach's generalization capability to other GelSight Mini and DIGIT sensors, and propose a reproducible calibration procedure for adapting the pre-trained model to other vision-based sensors. Furthermore, the method was evaluated on real-world tasks, including weighing objects and controlling the deformation of delicate objects, which rely on accurate force feedback. Project webpage: http://prg.cs.umd.edu/FeelAnyForce
comment: 8 pages, 4 figures, 4 tables
Run-time Observation Interventions Make Vision-Language-Action Models More Visually Robust
Vision-language-action (VLA) models trained on large-scale internet data and robot demonstrations have the potential to serve as generalist robot policies. However, despite their large-scale training, VLAs are often brittle to task-irrelevant visual details such as distractor objects or background colors. We introduce Bring Your Own VLA (BYOVLA): a run-time intervention scheme that (1) dynamically identifies regions of the input image that the model is sensitive to, and (2) minimally alters task-irrelevant regions to reduce the model's sensitivity using automated image editing tools. Our approach is compatible with any off-the-shelf VLA without model fine-tuning or access to the model's weights. Hardware experiments on language-instructed manipulation tasks demonstrate that BYOVLA enables state-of-the-art VLA models to nearly retain their nominal performance in the presence of distractor objects and backgrounds, which otherwise degrade task success rates by up to 40%. Website with additional information, videos, and code: https://aasherh.github.io/byovla/ .
comment: Website: https://aasherh.github.io/byovla/
Bi-Level Motion Imitation for Humanoid Robots
Imitation learning from human motion capture (MoCap) data provides a promising way to train humanoid robots. However, due to differences in morphology, such as varying degrees of joint freedom and force limits, exact replication of human behaviors may not be feasible for humanoid robots. Consequently, incorporating physically infeasible MoCap data in training datasets can adversely affect the performance of the robot policy. To address this issue, we propose a bi-level optimization-based imitation learning framework that alternates between optimizing both the robot policy and the target MoCap data. Specifically, we first develop a generative latent dynamics model using a novel self-consistent auto-encoder, which learns sparse and structured motion representations while capturing desired motion patterns in the dataset. The dynamics model is then utilized to generate reference motions while the latent representation regularizes the bi-level motion imitation process. Simulations conducted with a realistic model of a humanoid robot demonstrate that our method enhances the robot policy by modifying reference motions to be physically consistent.
comment: CoRL 2024
Language Supervised Human Action Recognition with Salient Fusion: Construction Worker Action Recognition as a Use Case
Detecting human actions is a crucial task for autonomous robots and vehicles, often requiring the integration of various data modalities for improved accuracy. In this study, we introduce a novel approach to Human Action Recognition (HAR) based on skeleton and visual cues. Our method leverages a language model to guide the feature extraction process in the skeleton encoder. Specifically, we employ learnable prompts for the language model conditioned on the skeleton modality to optimize feature representation. Furthermore, we propose a fusion mechanism that combines dual-modality features using a salient fusion module, incorporating attention and transformer mechanisms to address the modalities' high dimensionality. This fusion process prioritizes informative video frames and body joints, enhancing the recognition accuracy of human actions. Additionally, we introduce a new dataset tailored for real-world robotic applications in construction sites, featuring visual, skeleton, and depth data modalities, named VolvoConstAct. This dataset serves to facilitate the training and evaluation of machine learning models that instruct autonomous construction machines to perform necessary tasks in real-world construction zones. To evaluate our approach, we conduct experiments on our dataset as well as three widely used public datasets, NTU-RGB+D, NTU-RGB+D120 and NW-UCLA. Results reveal that our proposed method achieves promising performance across all datasets, demonstrating its robustness and potential for various applications. The codes and dataset are available at: https://mmahdavian.github.io/ls_har/
Learning-Based Autonomous Navigation, Benchmark Environments and Simulation Framework for Endovascular Interventions
Endovascular interventions are a life-saving treatment for many diseases, yet suffer from drawbacks such as radiation exposure and potential scarcity of proficient physicians. Robotic assistance during these interventions could help address these problems. Research focusing on autonomous endovascular interventions utilizing artificial intelligence-based methodologies is gaining popularity. However, variability in assessment environments hinders the ability to compare and contrast the efficacy of different approaches, primarily due to each study employing a unique evaluation framework. In this study, we present deep reinforcement learning-based autonomous endovascular device navigation on three distinct digital benchmark interventions: BasicWireNav, ArchVariety, and DualDeviceNav. The benchmark interventions were implemented with our modular simulation framework stEVE (simulated EndoVascular Environment). Autonomous controllers were trained solely in simulation and evaluated in simulation and on physical test benches with camera and fluoroscopy feedback. Autonomous control for BasicWireNav and ArchVariety reached high success rates and was successfully transferred from the simulated training environment to the physical test benches, while autonomous control for DualDeviceNav reached a moderate success rate. The experiments demonstrate the feasibility of stEVE and its potential for transferring controllers trained in simulation to real-world scenarios. Nevertheless, they also reveal areas that offer opportunities for future research. This study demonstrates the transferability of autonomous controllers from simulation to the real world in endovascular navigation and lowers the entry barriers and increases the comparability of research on endovascular assistance systems by providing open-source training scripts, benchmarks and the stEVE framework.
Equality Constrained Diffusion for Direct Trajectory Optimization
The recent success of diffusion-based generative models in image and natural language processing has ignited interest in diffusion-based trajectory optimization for nonlinear control systems. Existing methods cannot, however, handle the nonlinear equality constraints necessary for direct trajectory optimization. As a result, diffusion-based trajectory optimizers are currently limited to shooting methods, where the nonlinear dynamics are enforced by forward rollouts. This precludes many of the benefits enjoyed by direct methods, including flexible state constraints, reduced numerical sensitivity, and easy initial guess specification. In this paper, we present a method for diffusion-based optimization with equality constraints. This allows us to perform direct trajectory optimization, enforcing dynamic feasibility with constraints rather than rollouts. To the best of our knowledge, this is the first diffusion-based optimization algorithm that supports the general nonlinear equality constraints required for direct trajectory optimization.
Topological mapping for traversability-aware long-range navigation in off-road terrain
Autonomous robots navigating in off-road terrain like forests open new opportunities for automation. While off-road navigation has been studied, existing work often relies on clearly delineated pathways. We present a method allowing for long-range planning, exploration and low-level control in unknown off-trail forest terrain, using vision and GPS only. We represent outdoor terrain with a topological map, which is a set of panoramic snapshots connected with edges containing traversability information. A novel traversability analysis method is demonstrated, predicting the existence of a safe path towards a target in an image. Navigating between nodes is done using goal-conditioned behavior cloning, leveraging the power of a pretrained vision transformer. An exploration planner is presented, efficiently covering an unknown off-road area with unknown traversability using a frontiers-based approach. The approach is successfully deployed to autonomously explore two 400 meters squared forest sites unseen during training, in difficult conditions for navigation.
High-order regularization dealing with ill-conditioned robot localization problems
In this work, we propose a high-order regularization method to solve the ill-conditioned problems in robot localization. Numerical solutions to robot localization problems are often unstable when the problems are ill-conditioned. A typical way to solve ill-conditioned problems is regularization, and a classical regularization method is the Tikhonov regularization. It is shown that the Tikhonov regularization can be seen as a low-order case of our method. We find that the proposed method is superior to the Tikhonov regularization in approximating some ill-conditioned inverse problems, such as robot localization problems. The proposed method overcomes the over-smoothing problem in the Tikhonov regularization as it can use more than one term in the approximation of the matrix inverse, and an explanation for the over-smoothing of the Tikhonov regularization is given. Moreover, one a priori criterion which improves the numerical stability of the ill-conditioned problem is proposed to obtain an optimal regularization matrix. As most of the regularization solutions are biased, we also provide two bias-correction techniques for the proposed high-order regularization. The simulation and experiment results using a sensor network in a 3D environment are discussed, demonstrating the performance of the proposed method.
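As background for the abstract above, the classical Tikhonov baseline it refers to can be illustrated numerically. This is a minimal sketch of standard Tikhonov regularization only; it does not implement the proposed high-order method, whose details are not given here. The Hilbert test matrix, noise level, and the value of `lam` are assumptions chosen for illustration.

```python
import numpy as np


def tikhonov_solve(A, b, lam):
    """Solve min ||A x - b||^2 + lam * ||x||^2 via the regularized normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)


# Hypothetical ill-conditioned test problem: an 8x8 Hilbert matrix.
n = 8
idx = np.arange(n)
A = 1.0 / (idx[:, None] + idx[None, :] + 1.0)
x_true = np.ones(n)

rng = np.random.default_rng(0)
b = A @ x_true + 1e-6 * rng.standard_normal(n)  # slightly noisy measurements

x_naive = np.linalg.solve(A, b)          # unregularized solve amplifies the noise
x_tik = tikhonov_solve(A, b, lam=1e-8)   # regularized solve damps small singular values

err_naive = np.linalg.norm(x_naive - x_true)
err_tik = np.linalg.norm(x_tik - x_true)
print(f"cond(A) = {np.linalg.cond(A):.2e}")
print(f"naive error    = {err_naive:.3e}")
print(f"Tikhonov error = {err_tik:.3e}")
```

The regularized solution trades a small bias for a large reduction in noise amplification, which is the bias that the abstract's bias-correction techniques are meant to address.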
Get It For Free: Radar Segmentation without Expert Labels and Its Application in Odometry and Localization
This paper presents a novel weakly supervised semantic segmentation method for radar segmentation, where the existing LiDAR semantic segmentation models are employed to generate semantic labels, which then serve as supervision signals for training a radar semantic segmentation model. The obtained radar semantic segmentation model outperforms LiDAR-based models, providing more consistent and robust segmentation under all-weather conditions, particularly in the snow, rain and fog. To mitigate potential errors in LiDAR semantic labels, we design a dedicated refinement scheme that corrects erroneous labels based on structural features and distribution patterns. The semantic information generated by our radar segmentation model is used in two downstream tasks, achieving significant performance improvements. In large-scale radar-based localization using OpenStreetMap, it leads to localization error reduction by 20.55\% over prior methods. For the odometry task, it improves translation accuracy by 16.4\% compared to the second-best method, securing first place in the radar odometry competition at the Radar in Robotics workshop of ICRA 2024, Japan.
Tool-Planner: Task Planning with Clusters across Multiple Tools
Large language models (LLMs) have demonstrated exceptional reasoning capabilities, enabling them to solve various complex problems. Recently, this ability has been applied to the paradigm of tool learning. Tool learning involves providing examples of tool usage and their corresponding functions, allowing LLMs to formulate plans and demonstrate the process of invoking and executing each tool. LLMs can address tasks that they cannot complete independently, thereby enhancing their potential across different tasks. However, this approach faces two key challenges. First, redundant error correction leads to unstable planning and long execution time. Additionally, designing a correct plan among multiple tools is also a challenge in tool learning. To address these issues, we propose Tool-Planner, a task-processing framework based on toolkits. Tool-Planner groups tools whose API functions serve the same purpose into a toolkit and allows LLMs to implement planning across the various toolkits. When a tool error occurs, the language model can reselect and adjust tools based on the toolkit. Experiments show that our approach demonstrates a high pass and win rate across different datasets and optimizes the planning scheme for tool learning in models such as GPT-4 and Claude 3, showcasing the potential of our method. Our code is public at https://github.com/OceannTwT/Tool-Planner
comment: 48 pages, second version
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
Robots' ability to follow language instructions and execute diverse 3D tasks is vital in robot learning. Traditional imitation learning-based methods perform well on seen tasks but struggle with novel, unseen ones due to variability. Recent approaches leverage large foundation models to assist in understanding novel tasks, thereby mitigating this issue. However, these methods lack a task-specific learning process, which is essential for an accurate understanding of 3D environments, often leading to execution failures. In this paper, we introduce GravMAD, a sub-goal-driven, language-conditioned action diffusion framework that combines the strengths of imitation learning and foundation models. Our approach breaks tasks into sub-goals based on language instructions, allowing auxiliary guidance during both training and inference. During training, we introduce Sub-goal Keypose Discovery to identify key sub-goals from demonstrations. Inference differs from training, as there are no demonstrations available, so we use pre-trained foundation models to bridge the gap and identify sub-goals for the current task. In both phases, GravMaps are generated from sub-goals, providing flexible 3D spatial guidance compared to fixed 3D positions. Empirical evaluations on RLBench show that GravMAD significantly outperforms state-of-the-art methods, with a 28.63% improvement on novel tasks and a 13.36% gain on tasks encountered during training. These results demonstrate GravMAD's strong multi-task learning and generalization in 3D manipulation. Video demonstrations are available at: https://gravmad.github.io.
comment: Under review
Heterogeneous Multi-Agent Reinforcement Learning for Zero-Shot Scalable Collaboration
The emergence of multi-agent reinforcement learning (MARL) is significantly transforming various fields like autonomous vehicle networks. However, real-world multi-agent systems typically contain multiple roles, and the scale of these systems dynamically fluctuates. Consequently, in order to achieve zero-shot scalable collaboration, it is essential that strategies for different roles can be updated flexibly according to the scales, which is still a challenge for current MARL frameworks. To address this, we propose a novel MARL framework named Scalable and Heterogeneous Proximal Policy Optimization (SHPPO), integrating heterogeneity into parameter-shared PPO-based MARL networks. We first leverage a latent network to learn strategy patterns for each agent adaptively. Second, we introduce a heterogeneous layer to be inserted into decision-making networks, whose parameters are specifically generated by the learned latent variables. Our approach is scalable as all the parameters are shared except for the heterogeneous layer, and gains both inter-individual and temporal heterogeneity, allowing SHPPO to adapt effectively to varying scales. SHPPO exhibits superior performance in classic MARL environments like Starcraft Multi-Agent Challenge (SMAC) and Google Research Football (GRF), showcasing enhanced zero-shot scalability, and offering insights into the learned latent variables' impact on team performance by visualization.
Sensory Glove-Based Surgical Robot User Interface ICRA
Robotic surgery has reached a high level of maturity and has become an integral part of standard surgical care. However, existing surgeon consoles are bulky, take up valuable space in the operating room, make surgical team coordination challenging, and their proprietary nature makes it difficult to take advantage of recent technological advances, especially in virtual and augmented reality. One potential area for further improvement is the integration of modern sensory gloves into robotic platforms, allowing surgeons to control robotic arms intuitively with their hand movements. We propose one such system that combines an HTC Vive tracker, a Manus Meta Prime 3 XR sensory glove, and SCOPEYE wireless smart glasses. The system controls one arm of a da Vinci surgical robot. In addition to moving the arm, the surgeon can use fingers to control the end-effector of the surgical instrument. Hand gestures are used to implement clutching and similar functions. In particular, we introduce clutching of the instrument orientation, a functionality unavailable in the da Vinci system. The vibrotactile elements of the glove are used to provide feedback to the user when gesture commands are invoked. A qualitative and quantitative evaluation has been conducted that compares the current device with the dVRK console. The system is shown to have excellent tracking accuracy, and the new interface allows surgeons to perform common surgical training tasks efficiently with minimal practice.
comment: 6 pages, 4 figures, 7 tables, submitted to International Conference on Robotics and Automation (ICRA) 2025
CaRtGS: Computational Alignment for Real-Time Gaussian Splatting SLAM
Simultaneous Localization and Mapping (SLAM) is pivotal in robotics, with photorealistic scene reconstruction emerging as a key challenge. To address this, we introduce Computational Alignment for Real-Time Gaussian Splatting SLAM (CaRtGS), a novel method enhancing the efficiency and quality of photorealistic scene reconstruction in real-time environments. Leveraging 3D Gaussian Splatting (3DGS), CaRtGS achieves superior rendering quality and processing speed, which is crucial for photorealistic scene reconstruction. Our approach tackles computational misalignment in Gaussian Splatting SLAM (GS-SLAM) through an adaptive strategy that optimizes training, addresses long-tail optimization, and refines densification. Experiments on Replica and TUM-RGBD datasets demonstrate CaRtGS's effectiveness in achieving high-fidelity rendering with fewer Gaussian primitives. This work propels SLAM towards real-time, photorealistic dense rendering, significantly advancing photorealistic scene representation. For the benefit of the research community, we release the code on our project website: https://dapengfeng.github.io/cartgs.
comment: Upon a thorough internal review, we have identified that our manuscript lacks proper citation for a critical expression within the methodology section. In this revised version, we add Taming-3DGS as a citation in the splat-wise backpropagation statement
VLM-Auto: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes
Recent research on Large Language Models for autonomous driving shows promise in planning and control. However, high computational demands and hallucinations still challenge accurate trajectory prediction and control signal generation. Deterministic algorithms offer reliability but lack adaptability to complex driving scenarios and struggle with context and uncertainty. To address this problem, we propose VLM-Auto, a novel autonomous driving assistant system to empower the autonomous vehicles with adjustable driving behaviors based on the understanding of road scenes. A pipeline involving the CARLA simulator and Robot Operating System 2 (ROS2) verifying the effectiveness of our system is presented, utilizing a single Nvidia 4090 24G GPU while exploiting the capacity of textual output of the Visual Language Model (VLM). Besides, we also contribute a dataset containing an image set and a corresponding prompt set for fine-tuning the VLM module of our system. In CARLA experiments, our system achieved $97.82\%$ average precision on 5 types of labels in our dataset. In the real-world driving dataset, our system achieved $96.97\%$ prediction accuracy in night scenes and gloomy scenes. Our VLM-Auto dataset will be released at https://github.com/ZionGo6/VLM-Auto.
comment: The paper is accepted by the IEEE conference
Ankle Exoskeletons May Hinder Standing Balance in Simple Models of Older and Younger Adults
Humans rely on ankle torque to maintain standing balance, particularly in the presence of small to moderate perturbations. Reductions in maximum torque (MT) production and maximum rate of torque development (MRTD) occur at the ankle with age, diminishing stability. Ankle exoskeletons are powered orthotic devices that may assist older adults by compensating for reduced muscle force and power production capabilities. They may also be able to assist with ankle strategies used for balance. However, no studies have investigated the effect of such devices on balance in older adults. Here, we model the effect ankle exoskeletons have on stability in physics-based models of healthy young and old adults, focusing on the mitigation of age-related deficits such as reduced MT and MRTD. We show that an ankle exoskeleton moderately reduces feasible stability boundaries in users who have full ankle strength. For individuals with age-related deficits, there is a trade-off. While exoskeletons augment stability in low velocity conditions, they reduce stability in some high velocity conditions. Our results suggest that well-established control strategies must still be experimentally validated in older adults.
comment: 14 pages, 7 figures
Rapid Gyroscope Calibration: A Deep Learning Approach
Low-cost gyroscope calibration is essential for ensuring the accuracy and reliability of gyroscope measurements. Stationary calibration estimates the deterministic parts of measurement errors. To this end, a common practice is to average the gyroscope readings during a predefined period and estimate the gyroscope bias. Calibration duration plays a crucial role in performance; therefore, longer periods are preferred. However, some applications require quick startup times and calibration is therefore allowed only for a short time. In this work, we focus on reducing low-cost gyroscope calibration time using deep learning methods. We propose a deep-learning framework and explore the possibilities of using multiple real and virtual gyroscopes to improve the calibration performance of single gyroscopes. To train and validate our approach, we recorded a dataset consisting of 169 hours of gyroscope readings, using 24 gyroscopes of two different brands. We also created a virtual dataset consisting of simulated gyroscope readings. The two datasets were used to evaluate our proposed approach. One of our key achievements in this work is reducing gyroscope calibration time by up to 89% using three low-cost gyroscopes.
comment: 10 Pages, 14 Figures
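The averaging baseline described in the abstract above can be sketched in a few lines. This is a minimal illustration of stationary bias estimation by averaging, not the paper's deep-learning approach. The bias values, noise level, and sampling rate are hypothetical, and the simulation assumes a constant bias plus white noise, under which averaging N samples shrinks the standard deviation of the estimate by a factor of sqrt(N).

```python
import numpy as np


def estimate_bias(readings):
    """Estimate a constant gyroscope bias as the per-axis mean of stationary readings."""
    return readings.mean(axis=0)


# Hypothetical stationary recording: constant bias plus white noise (rad/s).
rng = np.random.default_rng(0)
true_bias = np.array([0.02, -0.01, 0.005])
noise_std = 0.01
rate_hz = 100
duration_s = 100
readings = true_bias + noise_std * rng.standard_normal((duration_s * rate_hz, 3))

# Compare a 1 s calibration window against the full 100 s window.
bias_short = estimate_bias(readings[:rate_hz])
bias_long = estimate_bias(readings)

err_short = np.abs(bias_short - true_bias).max()
err_long = np.abs(bias_long - true_bias).max()
print(f"1 s window max error:   {err_short:.2e} rad/s")
print(f"100 s window max error: {err_long:.2e} rad/s")
```

The expected error of the mean falls as the window grows, which is why longer calibration periods are preferred and why shortening them without losing accuracy is the problem the paper targets.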
DITTO: Demonstration Imitation by Trajectory Transformation IROS 2024
Teaching robots new skills quickly and conveniently is crucial for the broader adoption of robotic systems. In this work, we address the problem of one-shot imitation from a single human demonstration, given by an RGB-D video recording. We propose a two-stage process. In the first stage we extract the demonstration trajectory offline. This entails segmenting manipulated objects and determining their relative motion in relation to secondary objects such as containers. In the online trajectory generation stage, we first re-detect all objects, then warp the demonstration trajectory to the current scene and execute it on the robot. To complete these steps, our method leverages several ancillary models, including those for segmentation, relative object pose estimation, and grasp prediction. We systematically evaluate different combinations of correspondence and re-detection methods to validate our design decision across a diverse range of tasks. Specifically, we collect and quantitatively test on demonstrations of ten different tasks including pick-and-place tasks as well as articulated object manipulation. Finally, we perform extensive evaluations on a real robot system to demonstrate the effectiveness and utility of our approach in real-world scenarios. We make the code publicly available at http://ditto.cs.uni-freiburg.de.
comment: 8 pages, 4 figures, 3 tables, accepted at IROS 2024
CyberCortex.AI: An AI-based Operating System for Autonomous Robotics and Complex Automation
The underlying frameworks for controlling autonomous robots and complex automation applications are Operating Systems (OS) capable of scheduling perception-and-control tasks, as well as providing real-time data communication to other robotic peers and remote cloud computers. In this paper, we introduce CyberCortex.AI, a robotics OS designed to enable heterogeneous AI-based robotics and complex automation applications. CyberCortex.AI is a decentralized distributed OS which enables robots to talk to each other, as well as to High Performance Computers (HPC) in the cloud. Sensory and control data from the robots is streamed towards HPC systems with the purpose of training AI algorithms, which are afterwards deployed on the robots. Each functionality of a robot (e.g. sensory data acquisition, path planning, motion control, etc.) is executed within a so-called DataBlock of Filters shared through the internet, where each filter is computed either locally on the robot itself, or remotely on a different robotic system. The data is stored and accessed via a so-called Temporal Addressable Memory (TAM), which acts as a gateway between each filter's input and output. CyberCortex.AI has two main components: i) the CyberCortex.AI inference system, which is a real-time implementation of the DataBlock running on the robots' embedded hardware, and ii) the CyberCortex.AI dojo, which runs on an HPC computer in the cloud, and it is used to design, train and deploy AI algorithms. We present a quantitative and qualitative performance analysis of the proposed approach using two collaborative robotics applications: i) a forest fire prevention system based on a Unitree A1 legged robot and an Anafi Parrot 4K drone, as well as ii) an autonomous driving system which uses CyberCortex.AI for collaborative perception and motion control.
A Parallel-in-Time Newton's Method for Nonlinear Model Predictive Control
Model predictive control (MPC) is a powerful framework for optimal control of dynamical systems. However, MPC solvers suffer from a high computational burden that restricts their application to systems with low sampling frequency. This issue is further amplified in nonlinear and constrained systems that require nesting MPC solvers within iterative procedures. In this paper, we address these issues by developing parallel-in-time algorithms for constrained nonlinear optimization problems that take advantage of massively parallel hardware to achieve logarithmic computational time scaling over the planning horizon. We develop time-parallel second-order solvers based on interior point methods and the alternating direction method of multipliers, leveraging fast convergence and lower computational cost per iteration. The parallelization is based on a reformulation of the subproblems in terms of associative operations that can be parallelized using the associative scan algorithm. We validate our approach on numerical examples of nonlinear and constrained dynamical systems.
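The associative-scan idea behind the logarithmic time scaling can be illustrated on a plain affine recurrence x_{k+1} = A_k x_k + b_k. This is a minimal sketch of the general technique, not the paper's solver: composing two affine maps is an associative operation, so all prefixes can be computed in about log2(N) rounds, each of which is a batch of independent compositions. The function names and the random test system are hypothetical; NumPy runs the rounds serially here, whereas parallel hardware could execute each round concurrently.

```python
import numpy as np


def combine(first, second):
    """Compose batched affine maps: apply `first`, then `second`."""
    A1, b1 = first
    A2, b2 = second
    return A2 @ A1, np.einsum("kij,kj->ki", A2, b1) + b2


def affine_prefix_scan(As, bs):
    """Inclusive prefix scan over affine maps using recursive doubling."""
    N = As.shape[0]
    d = 1
    rounds = 0
    while d < N:
        # Every element i >= d absorbs the partial result d steps to its left.
        A_new, b_new = combine((As[:-d], bs[:-d]), (As[d:], bs[d:]))
        As = np.concatenate([As[:d], A_new])
        bs = np.concatenate([bs[:d], b_new])
        d *= 2
        rounds += 1
    return As, bs, rounds


# Hypothetical time-varying linear system with horizon N and state dimension n.
rng = np.random.default_rng(0)
N, n = 16, 3
As = 0.3 * rng.standard_normal((N, n, n))
bs = rng.standard_normal((N, n))
x0 = rng.standard_normal(n)

# Reference: sequential rollout, N dependent steps.
x = x0
sequential = []
for k in range(N):
    x = As[k] @ x + bs[k]
    sequential.append(x)
sequential = np.array(sequential)

# Scan: entry k holds the affine map taking x0 directly to x_{k+1}.
A_pre, b_pre, rounds = affine_prefix_scan(As, bs)
parallel = A_pre @ x0 + b_pre

print("rounds:", rounds)
print("max abs difference:", np.abs(parallel - sequential).max())
```

The scan performs more total arithmetic than the sequential loop, so the benefit appears only when the per-round compositions actually run in parallel.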
3D Uncertain Implicit Surface Mapping using GMM and GP
In this study, we address the challenge of constructing continuous three-dimensional (3D) models that accurately represent uncertain surfaces, derived from noisy and incomplete LiDAR scanning data. Building upon our prior work, which utilized the Gaussian Process (GP) and Gaussian Mixture Model (GMM) for structured building models, we introduce a more generalized approach tailored for complex surfaces in urban scenes, where GMM Regression and GP with derivative observations are applied. A Hierarchical GMM (HGMM) is employed to optimize the number of GMM components and speed up the GMM training. With the prior map obtained from HGMM, GP inference is followed for the refinement of the final map. Our approach models the implicit surface of the geo-object and enables the inference of the regions that are not completely covered by measurements. The integration of GMM and GP yields well-calibrated uncertainties alongside the surface model, enhancing both accuracy and reliability. The proposed method is evaluated on real data collected by a mobile mapping system. Compared to the performance in mapping accuracy and uncertainty quantification of other state-of-the-art methods, the proposed method achieves lower RMSEs, higher log-likelihood values and lower computational costs for the evaluated datasets.
comment: This work has been accepted by the IEEE RA-L. Copyright may be transferred without notice, after which this version may no longer be accessible
Toward Globally Optimal State Estimation Using Automatically Tightened Semidefinite Relaxations
In recent years, semidefinite relaxations of common optimization problems in robotics have attracted growing attention due to their ability to provide globally optimal solutions. In many cases, it was shown that specific handcrafted redundant constraints are required to obtain tight relaxations and thus global optimality. These constraints are formulation-dependent and typically identified through a lengthy manual process. Instead, the present paper suggests an automatic method to find a set of sufficient redundant constraints to obtain tightness, if they exist. We first propose an efficient feasibility check to determine if a given set of variables can lead to a tight formulation. Secondly, we show how to scale the method to problems of bigger size. At no point of the process do we have to find redundant constraints manually. We showcase the effectiveness of the approach, in simulation and on real datasets, for range-based localization and stereo-based pose estimation. Finally, we reproduce semidefinite relaxations presented in recent literature and show that our automatic method always finds a smaller set of constraints sufficient for tightness than previously considered.
comment: 20 pages, 22 figures. Version history: v5 (published version T-RO), v4 (conditionally accepted version T-RO), v3 (revised version), v2 (submitted version), v1 (initial version)
High-Fidelity SLAM Using Gaussian Splatting with Rendering-Guided Densification and Regularized Optimization IROS 2024
We propose a dense RGBD SLAM system based on 3D Gaussian Splatting that provides metrically accurate pose tracking and visually realistic reconstruction. To this end, we first propose a Gaussian densification strategy based on the rendering loss to map unobserved areas and refine reobserved areas. Second, we introduce extra regularization parameters to alleviate the forgetting problem in continuous mapping, where parameters tend to overfit the latest frame and result in decreasing rendering quality for previous frames. Both mapping and tracking are performed with Gaussian parameters by minimizing re-rendering loss in a differentiable way. Compared to recent neural and concurrently developed Gaussian Splatting RGBD SLAM baselines, our method achieves state-of-the-art results on the synthetic dataset Replica and competitive results on the real-world dataset TUM.
comment: Accepted by IROS 2024
Contact-Implicit Model Predictive Control: Controlling Diverse Quadruped Motions Without Pre-Planned Contact Modes or Trajectories
This paper presents a contact-implicit model predictive control (MPC) framework for the real-time discovery of multi-contact motions, without predefined contact mode sequences or foothold positions. This approach utilizes the contact-implicit differential dynamic programming (DDP) framework, merging the hard contact model with a linear complementarity constraint. We propose the analytical gradient of the contact impulse based on relaxed complementarity constraints to further the exploration of a variety of contact modes. By leveraging a hard contact model-based simulation and computation of search direction through a smooth gradient, our methodology identifies dynamically feasible state trajectories, control inputs, and contact forces while simultaneously unveiling new contact mode sequences. However, the broadened scope of contact modes does not always ensure real-world applicability. Recognizing this, we implemented differentiable cost terms to guide foot trajectories and make gait patterns. Furthermore, to address the challenge of unstable initial roll-outs in an MPC setting, we employ the multiple shooting variant of DDP. The efficacy of the proposed framework is validated through simulations and real-world demonstrations using a 45 kg HOUND quadruped robot, performing various tasks in simulation and showcasing actual experiments involving a forward trot and a front-leg rearing motion.
comment: This is the accepted version for The International Journal of Robotics Research (2024); published version at https://journals.sagepub.com/doi/10.1177/02783649241273645 / Videos at https://youtu.be/SXD4BJIfyoY
Narrowing your FOV with SOLiD: Spatially Organized and Lightweight Global Descriptor for FOV-constrained LiDAR Place Recognition
We often encounter limited FOV situations in real-world robot navigation due to various factors such as sensor fusion or sensor mounting. However, a limited FOV disrupts the generation of descriptors and adversely impacts place recognition. As a result, it becomes difficult to correct accumulated drift errors and maintain a consistent map using LiDAR-based place recognition with a limited FOV. Thus, in this paper, we propose a robust LiDAR-based place recognition method for handling narrow FOV scenarios. The proposed method establishes spatial organization based on the range-elevation bin and azimuth-elevation bin to represent places. In addition, we achieve a robust place description through reweighting based on vertical direction information. Based on these representations, our method enables addressing rotational changes and determining the initial heading. Additionally, we designed a lightweight and fast approach for the robot's onboard autonomy. For rigorous validation, the proposed method was tested across various LiDAR place recognition scenarios (i.e., single-session, multi-session, and multi-robot scenarios). To the best of our knowledge, we report the first method to cope with the restricted FOV. Our place description and SLAM codes will be released. Also, the supplementary materials of our descriptor are available at https://sites.google.com/view/lidar-solid.
comment: Accepted in IEEE Robotics and Automation Letters (2024)
Improving Zero-Shot ObjectNav with Generative Communication
We propose a new method for improving zero-shot ObjectNav that aims to utilize potentially available environmental percepts for navigational assistance. Our approach takes into account that the ground agent may have limited and sometimes obstructed view. Our formulation encourages Generative Communication (GC) between an assistive overhead agent with a global view containing the target object and the ground agent with an obfuscated view; both equipped with Vision-Language Models (VLMs) for vision-to-language translation. In this assisted setup, the embodied agents communicate environmental information before the ground agent executes actions towards a target. Despite the overhead agent having a global view with the target, we note a drop in performance (-13% in OSR and -13% in SPL) of a fully cooperative assistance scheme over an unassisted baseline. In contrast, a selective assistance scheme where the ground agent retains its independent exploratory behaviour shows a 10% OSR and 7.65% SPL improvement. To explain navigation performance, we analyze the GC for unique traits, quantifying the presence of hallucination and cooperation. Specifically, we identify the novel linguistic trait of preemptive hallucination in our embodied setting, where the overhead agent assumes that the ground agent has executed an action in the dialogue when it is yet to move, and note its strong correlation with navigation performance. We conduct real-world experiments and present some qualitative examples where we mitigate hallucinations via prompt finetuning to improve ObjectNav performance.
Large Language Models as Zero-Shot Human Models for Human-Robot Interaction
Human models play a crucial role in human-robot interaction (HRI), enabling robots to consider the impact of their actions on people and plan their behavior accordingly. However, crafting good human models is challenging; capturing context-dependent human behavior requires significant prior knowledge and/or large amounts of interaction data, both of which are difficult to obtain. In this work, we explore the potential of large language models (LLMs) -- which have consumed vast amounts of human-generated text data -- to act as zero-shot human models for HRI. Our experiments on three social datasets yield promising results; the LLMs are able to achieve performance comparable to purpose-built models. That said, we also discuss current limitations, such as sensitivity to prompts and spatial/numerical reasoning mishaps. Based on our findings, we demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios. Specifically, we present one case study on a simulated trust-based table-clearing task and replicate past results that relied on custom models. Next, we conduct a new robot utensil-passing experiment (n = 65) where preliminary results show that planning with an LLM-based human model can achieve gains over a basic myopic plan. In summary, our results show that LLMs offer a promising (but incomplete) approach to human modeling for HRI.
comment: 8 pages
E2Map: Experience-and-Emotion Map for Self-Reflective Robot Navigation with Language Models
Large language models (LLMs) have shown significant potential in guiding embodied agents to execute language instructions across a range of tasks, including robotic manipulation and navigation. However, existing methods are primarily designed for static environments and do not leverage the agent's own experiences to refine its initial plans. Given that real-world environments are inherently stochastic, initial plans based solely on LLMs' general knowledge may fail to achieve their objectives, unlike in static scenarios. To address this limitation, this study introduces the Experience-and-Emotion Map (E2Map), which integrates not only LLM knowledge but also the agent's real-world experiences, drawing inspiration from human emotional responses. The proposed methodology enables one-shot behavior adjustments by updating the E2Map based on the agent's experiences. Our evaluation in stochastic navigation environments, including both simulations and real-world scenarios, demonstrates that the proposed method significantly enhances performance in stochastic environments compared to existing LLM-based approaches. Code and supplementary materials are available at https://e2map.github.io/.
comment: 19 pages, 28 figures. Project page: https://e2map.github.io
Affordance-Guided Reinforcement Learning via Visual Prompting
Robots equipped with reinforcement learning (RL) have the potential to learn a wide range of skills solely from a reward signal. However, obtaining a robust and dense reward signal for general manipulation tasks remains a challenge. Existing learning-based approaches require significant data, such as human demonstrations of success and failure, to learn task-specific reward functions. Recently, there is also a growing adoption of large multi-modal foundation models for robotics that can perform visual reasoning in physical contexts and generate coarse robot motions for manipulation tasks. Motivated by this range of capability, in this work, we present Keypoint-based Affordance Guidance for Improvements (KAGI), a method leveraging rewards shaped by vision-language models (VLMs) for autonomous RL. State-of-the-art VLMs have demonstrated impressive reasoning about affordances through keypoints in zero-shot, and we use these to define dense rewards that guide autonomous robotic learning. On real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 20K online fine-tuning steps. Additionally, we demonstrate the robustness of KAGI to reductions in the number of in-domain demonstrations used for pre-training, reaching similar performance in 35K online fine-tuning steps. Project website: https://sites.google.com/view/affordance-guided-rl
comment: 8 pages, 6 figures. Robotics: Science and Systems (RSS) 2024, Task Specification for General-Purpose Intelligent Robots & Lifelong Robot Learning Workshops
Low Fidelity Visuo-Tactile Pretraining Improves Vision-Only Manipulation Performance
Tactile perception is a critical component of solving real-world manipulation tasks, but tactile sensors for manipulation have barriers to use such as fragility and cost. In this work, we engage a robust, low-cost tactile sensor, BeadSight, as an alternative to precise pre-calibrated sensors for a pretraining approach to manipulation. We show that tactile pretraining, even with a low-fidelity sensor such as BeadSight, can improve an imitation learning agent's performance on complex manipulation tasks. We demonstrate this method against a baseline USB cable plugging task, previously achieved with a much higher precision GelSight sensor as the tactile input to pretraining. Our best BeadSight pretrained visuo-tactile agent completed the task with 70% accuracy compared to 85% for the best GelSight pretrained visuo-tactile agent, with vision-only inference for both.
Sequential Gaussian Variational Inference for Nonlinear State Estimation applied to Robotic Applications
Probabilistic state estimation is essential for robots navigating uncertain environments. Accurately and efficiently managing uncertainty in estimated states is key to robust robotic operation. However, nonlinearities in robotic platforms pose significant challenges that require advanced estimation techniques. Gaussian variational inference (GVI) offers an optimization perspective on the estimation problem, providing analytically tractable solutions and efficiencies derived from the geometry of Gaussian space. We propose a Sequential Gaussian Variational Inference (S-GVI) method to address nonlinearity and provide efficient sequential inference processes. Our approach integrates sequential Bayesian principles into the GVI framework, which are addressed using statistical approximations and gradient updates on the information geometry. Validations through simulations and real-world experiments demonstrate significant improvements in state estimation over the Maximum A Posteriori (MAP) estimation method.
comment: 8 pages
BehAV: Behavioral Rule Guided Autonomy Using VLMs for Robot Navigation in Outdoor Scenes
We present BehAV, a novel approach for autonomous robot navigation in outdoor scenes guided by human instructions and leveraging Vision Language Models (VLMs). Our method interprets human commands using a Large Language Model (LLM) and categorizes the instructions into navigation and behavioral guidelines. Navigation guidelines consist of directional commands (e.g., "move forward until") and associated landmarks (e.g., "the building with blue windows"), while behavioral guidelines encompass regulatory actions (e.g., "stay on") and their corresponding objects (e.g., "pavements"). We use VLMs for their zero-shot scene understanding capabilities to estimate landmark locations from RGB images for robot navigation. Further, we introduce a novel scene representation that utilizes VLMs to ground behavioral rules into a behavioral cost map. This cost map encodes the presence of behavioral objects within the scene and assigns costs based on their regulatory actions. The behavioral cost map is integrated with a LiDAR-based occupancy map for navigation. To navigate outdoor scenes while adhering to the instructed behaviors, we present an unconstrained Model Predictive Control (MPC)-based planner that prioritizes both reaching landmarks and following behavioral guidelines. We evaluate the performance of BehAV on a quadruped robot across diverse real-world scenarios, demonstrating a 22.49% improvement in alignment with human-teleoperated actions, as measured by Frechet distance, and achieving a 40% higher navigation success rate compared to state-of-the-art methods.
Scaling Manipulation Learning with Visual Kinematic Chain Prediction
Learning general-purpose models from diverse datasets has achieved great success in machine learning. In robotics, however, existing methods in multi-task learning are typically constrained to a single robot and workspace, while recent work such as RT-X requires a non-trivial action normalization procedure to manually bridge the gap between different action spaces in diverse environments. In this paper, we propose the visual kinematic chain as a precise and universal representation of quasi-static actions for robot learning over diverse environments, which requires no manual adjustment since the visual kinematic chains can be automatically obtained from the robot's model and camera parameters. We propose the Visual Kinematics Transformer (VKT), a convolution-free architecture that supports an arbitrary number of camera viewpoints, and that is trained with a single objective of forecasting kinematic structures through optimal point-set matching. We demonstrate the superior performance of VKT over BC transformers as a general agent on Calvin, RLBench, Open-X, and real robot manipulation tasks. Video demonstrations can be found at https://mlzxy.github.io/visual-kinetic-chain.
comment: CoRL 2024
Multiagent Systems
Windowed MAPF with Completeness Guarantees
Traditional multi-agent path finding (MAPF) methods try to compute entire start-goal paths which are collision free. However, computing an entire path can take too long for MAPF systems where agents need to replan fast. Methods that address this typically employ a "windowed" approach and only try to find collision free paths for a small windowed timestep horizon. This adaptation comes at the cost of incompleteness; all current windowed approaches can become stuck in deadlock or livelock. Our main contribution is to introduce our framework, WinC-MAPF, for Windowed MAPF that enables completeness. Our framework uses heuristic update insights from single-agent real-time heuristic search algorithms as well as agent independence ideas from MAPF algorithms. We also develop Single-Step CBS (SS-CBS), an instantiation of this framework using a novel modification to CBS. We show how SS-CBS, which only plans a single step and updates heuristics, can effectively solve tough scenarios where existing windowed approaches fail.
Social coordination perpetuates stereotypic expectations and behaviors across generations in deep multi-agent reinforcement learning
Despite often being perceived as morally objectionable, stereotypes are a common feature of social groups, a phenomenon that has often been attributed to biased motivations or limits on the ability to process information. We argue that one reason for this continued prevalence is that pre-existing expectations about how others will behave, in the context of social coordination, can change the behaviors of one's social partners, creating the very stereotype one expected to see, even in the absence of other potential sources of stereotyping. We use a computational model of dynamic social coordination to illustrate how this "feedback loop" can emerge, engendering and entrenching stereotypic behavior, and then show that human behavior on the task generates a comparable feedback loop. Notably, people's choices on the task are not related to social dominance or system justification, suggesting biased motivations are not necessary to maintain these stereotypes.
comment: 24 pages, 6 figures
Performant, Memory Efficient and Scalable Multi-Agent Reinforcement Learning
As the field of multi-agent reinforcement learning (MARL) progresses towards larger and more complex environments, achieving strong performance while maintaining memory efficiency and scalability to many agents becomes increasingly important. Although recent research has led to several advanced algorithms, to date, none fully address all of these key properties simultaneously. In this work, we introduce Sable, a novel and theoretically sound algorithm that adapts the retention mechanism from Retentive Networks to MARL. Sable's retention-based sequence modelling architecture allows for computationally efficient scaling to a large number of agents, as well as maintaining a long temporal context, making it well-suited for large-scale partially observable environments. Through extensive evaluations across six diverse environments, we demonstrate how Sable is able to significantly outperform existing state-of-the-art methods in the majority of tasks (34 out of 45, roughly 75%). Furthermore, Sable demonstrates stable performance as we scale the number of agents, handling environments with more than a thousand agents while exhibiting a linear increase in memory usage. Finally, we conduct ablation studies to isolate the source of Sable's performance gains and confirm its efficient computational memory usage. Our results highlight Sable's performance and efficiency, positioning it as a leading approach to MARL at scale.
ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization
Offline reinforcement learning (RL) has garnered significant attention for its ability to learn effective policies from pre-collected datasets without the need for further environmental interactions. While promising results have been demonstrated in single-agent settings, offline multi-agent reinforcement learning (MARL) presents additional challenges due to the large joint state-action space and the complexity of multi-agent behaviors. A key issue in offline RL is the distributional shift, which arises when the target policy being optimized deviates from the behavior policy that generated the data. This problem is exacerbated in MARL due to the interdependence between agents' local policies and the expansive joint state-action space. Prior approaches have primarily addressed this challenge by incorporating regularization in the space of either Q-functions or policies. In this work, we introduce a regularizer in the space of stationary distributions to better handle distributional shift. Our algorithm, ComaDICE, offers a principled framework for offline cooperative MARL by incorporating stationary distribution regularization for the global learning policy, complemented by a carefully structured multi-agent value decomposition strategy to facilitate multi-agent training. Through extensive experiments on the multi-agent MuJoCo and StarCraft II benchmarks, we demonstrate that ComaDICE achieves superior performance compared to state-of-the-art offline MARL methods across nearly all tasks.
Finite-time convergence to an $ε$-efficient Nash equilibrium in potential games
This paper investigates the convergence time of log-linear learning to an $\epsilon$-efficient Nash equilibrium (NE) in potential games. In such games, an efficient NE is defined as the maximizer of the potential function. Previous literature provides asymptotic convergence rates to efficient Nash equilibria, and existing finite-time rates are limited to potential games with further assumptions such as the interchangeability of players. In this paper, we prove the first finite-time convergence to an $\epsilon$-efficient NE in general potential games. Our bounds depend polynomially on $1/\epsilon$, an improvement over previous bounds that are exponential in $1/\epsilon$ and only hold for subclasses of potential games. We then strengthen our convergence result in two directions: first, we show that a variant of log-linear learning that requires a factor $A$ less feedback on the utility per round enjoys a similar convergence time; second, we demonstrate the robustness of our convergence guarantee if log-linear learning is subject to small perturbations such as alterations in the learning rule or noise-corrupted utilities.
comment: 12 main pages, 33 pages, 2 figures, 1 Table
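For readers unfamiliar with log-linear learning, the following is a minimal sketch of the standard update rule in a two-player identical-interest game (a simple potential game), not the variants analysed in the paper above. One randomly chosen player resamples its action with probability proportional to exp(beta * utility). The inverse-temperature name `beta`, the payoff-matrix encoding, and the function names are assumptions made for this illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


def log_linear_probs(utilities: Sequence[float], beta: float) -> List[float]:
    """Softmax over utilities with inverse temperature `beta`."""
    m = max(utilities)  # shift by the maximum for numerical stability
    weights = [math.exp(beta * (u - m)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def log_linear_step(
    profile: Tuple[int, int],
    payoff: Sequence[Sequence[float]],
    beta: float,
    rng: random.Random,
) -> Tuple[int, int]:
    """One asynchronous update in a two-player identical-interest game.

    `payoff[a0][a1]` is the common utility (and hence the potential) of the
    joint action (a0, a1); the matrix is assumed to be square.
    """
    player = rng.randrange(2)  # the player allowed to revise this round
    n_actions = len(payoff)

    def utility(action: int) -> float:
        joint = list(profile)
        joint[player] = action
        return payoff[joint[0]][joint[1]]

    probs = log_linear_probs([utility(a) for a in range(n_actions)], beta)
    new_action = rng.choices(range(n_actions), weights=probs)[0]
    updated = list(profile)
    updated[player] = new_action
    return (updated[0], updated[1])
```

As `beta` grows, the revising player concentrates on its best response, and the stationary distribution of the resulting Markov chain places most of its mass on maximizers of the potential. How long the chain takes to get there is the question the abstract addresses.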
Heterogeneous Multi-Agent Reinforcement Learning for Zero-Shot Scalable Collaboration
The emergence of multi-agent reinforcement learning (MARL) is significantly transforming various fields like autonomous vehicle networks. However, real-world multi-agent systems typically contain multiple roles, and the scale of these systems dynamically fluctuates. Consequently, in order to achieve zero-shot scalable collaboration, it is essential that strategies for different roles can be updated flexibly according to the scales, which is still a challenge for current MARL frameworks. To address this, we propose a novel MARL framework named Scalable and Heterogeneous Proximal Policy Optimization (SHPPO), integrating heterogeneity into parameter-shared PPO-based MARL networks. We first leverage a latent network to learn strategy patterns for each agent adaptively. Second, we introduce a heterogeneous layer to be inserted into decision-making networks, whose parameters are specifically generated by the learned latent variables. Our approach is scalable as all the parameters are shared except for the heterogeneous layer, and gains both inter-individual and temporal heterogeneity, allowing SHPPO to adapt effectively to varying scales. SHPPO exhibits superior performance in classic MARL environments like Starcraft Multi-Agent Challenge (SMAC) and Google Research Football (GRF), showcasing enhanced zero-shot scalability, and offering insights into the learned latent variables' impact on team performance by visualization.
Opponent Shaping for Antibody Development
Anti-viral therapies are typically designed to target only the current strains of a virus. Game theoretically, this corresponds to a short-sighted, or myopic, response. However, therapy-induced selective pressures act on viruses to drive the emergence of mutated strains, against which initial therapies have reduced efficacy. Building on a computational model of binding between antibodies and viral antigens (the Absolut! framework), we design and implement a genetic simulation of viral evolutionary escape. Crucially, this allows our antibody optimisation algorithm to consider and influence the entire escape curve of the virus, i.e. to guide (or "shape") the viral evolution. This is inspired by opponent shaping which, in general-sum learning, accounts for the adaptation of the co-player rather than playing a myopic best response. Hence we call the optimised antibodies shapers. Within our simulations, we demonstrate that our shapers target both current and simulated future viral variants, outperforming the antibodies chosen in a myopic way. Furthermore, we show that shapers exert specific evolutionary pressure on the virus compared to myopic antibodies. Altogether, shapers modify the evolutionary trajectories of viral strains and minimise the viral escape compared to their myopic counterparts. While this is a simplified model, we hope that our proposed paradigm will facilitate the discovery of better long-lived vaccines and antibody therapies in the future, enabled by rapid advancements in the capabilities of simulation tools. Our code is available at https://github.com/olakalisz/antibody-shapers.
comment: Preprint
Systems and Control (CS)
Multi-Robot Trajectory Generation via Consensus ADMM: Convex vs. Non-Convex
C-ADMM is a well-known distributed optimization framework due to its guaranteed convergence in convex optimization problems. Recently, C-ADMM has been studied in robotics applications such as multi-vehicle target tracking and collaborative manipulation tasks. However, few works have investigated the performance of C-ADMM applied to non-convex problems in robotics applications due to a lack of theoretical guarantees. For this project, we aim to quantitatively explore and examine the convergence behavior of non-convex C-ADMM through the scope of distributed multi-robot trajectory planning. We propose a convex trajectory planning problem by leveraging C-ADMM and Buffered Voronoi Cells (BVCs) to get around the non-convex collision avoidance constraint and compare this convex C-ADMM algorithm to a non-convex C-ADMM baseline with non-convex collision avoidance constraints. We show that the convex C-ADMM algorithm requires 1000 fewer iterations to achieve convergence in a multi-robot waypoint navigation scenario. We also confirm that the non-convex C-ADMM baseline leads to sub-optimal solutions and violation of safety constraints in trajectory generation.
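To make the consensus mechanism concrete, here is a minimal sketch of textbook global-consensus ADMM on a convex toy problem. It is not the trajectory planner from the abstract above, and it omits Buffered Voronoi Cells and collision constraints. Each agent i holds a private cost 0.5 * (x - c_i)^2, and the agents must agree on a shared value z. The scaled-dual form, the central averaging step, and all names are assumptions made for this example.

```python
from typing import List, Tuple


def consensus_admm(
    targets: List[float], rho: float = 1.0, iters: int = 100
) -> Tuple[List[float], float]:
    """Global-consensus ADMM for min sum_i 0.5 * (x_i - c_i)^2 s.t. x_i = z.

    Returns the local copies x_i and the consensus variable z.
    """
    n = len(targets)
    x = [0.0] * n  # local primal variables, one per agent
    u = [0.0] * n  # scaled dual variables, one per agent
    z = 0.0        # shared consensus variable
    for _ in range(iters):
        # Local step: closed-form minimizer of
        # 0.5 * (x - c_i)^2 + (rho / 2) * (x - z + u_i)^2
        x = [(c + rho * (z - ui)) / (1.0 + rho) for c, ui in zip(targets, u)]
        # Consensus step: average of the local estimates plus duals
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with z
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return x, z
```

For this convex problem the iterates converge to the mean of the targets. A non-convex constraint set would replace the closed-form local step with a local solve that carries no such guarantee, which is the contrast the abstract investigates.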
Effects of eco-driving on energy consumption and battery degradation for electric vehicles at signalized intersections
Eco-driving has been shown to reduce energy consumption for electric vehicles (EVs). Such strategies can also be implemented to both reduce energy consumption and improve battery lifetime. This study considers the eco-driving of a connected electric vehicle equipped with vehicle-to-infrastructure (V2I) communication passing through two signalized intersections. Dynamic programming is employed to construct an eco-driving algorithm that incorporates a battery degradation model in addition to minimizing energy consumption to optimize the vehicle's speed trajectory while transiting the control zone. A parametric study is conducted for various signal timings and distances between the two intersections. It is found that eco-driving can provide up to 49% in cost benefits over regular driving due to energy savings and improved battery life, which could boost consumers' interest in EVs. This study also considered different battery capacity decay rates based on battery chemistry. Although a higher decay rate affects the optimal speed trajectories only slightly, it amplifies the benefits of eco-driving on battery life. Two battery sizes were also studied to show that the larger battery is associated with a drastically increased lifetime, thus creating opportunities for electric vehicles in other applications such as vehicle-to-grid (V2G) integration. Field tests were also conducted using a simplified rule-based version of the eco-driving algorithm implemented as a phone app which issues audio speed recommendations to the driver. The field test results were promising and validated the results from simulations. The phone app implementation is convenient and could facilitate broader adoption and widespread use of eco-driving, which helps to improve transportation efficiency and protect the environment.
comment: 14 pages, 12 figures
A Microgrid Deployment Framework to Support Drayage Electrification
The electrification of heavy-duty commercial vehicles (HDCVs) is pivotal in reducing greenhouse gas emissions and urban air pollution; however, this transition poses significant challenges for the existing electric grid, which is not designed to meet the high electricity demands of HDCVs. This can lead to a less effective reduction in freight transportation's carbon intensity despite significant electrification efforts. Deploying renewable energy sources, such as photovoltaics, alongside energy storage solutions, is essential to address these challenges. This paper examines the current grid limitations and explores the critical role of microgrid deployment, integrating solar and battery energy storage systems, in supporting the electrification of HDCVs. We propose an integrated framework that is designed to enhance regional grid capacity and decrease carbon intensity by identifying viable sites where a microgrid can be deployed and providing estimates of the deployment cost. Furthermore, using this framework, we quantify the maximal impact of microgrid deployment in reducing CO2 emissions when we optimize the use of the available power. As a demonstration, we apply our framework to the region of the Port of Savannah, GA, USA.
comment: 59 pages, 6 figures
Optimal Control of Fractional Punishment in Optional Public Goods Game
Punishment is probably the most frequently used mechanism to increase cooperation in Public Goods Games (PGG); however, it is expensive. To address this problem, this paper introduces an optimal control problem that uses fractional punishment to promote cooperation. We present a series of computational experiments illustrating the effects of single and combined terms of the optimization cost function. In the findings, the optimal controller outperforms the use of constant fractional punishment and gives an insight into the period and size of the penalization to be implemented with respect to the defection in the game.
Min-Time Escape of a Dubins Car from a Polygon
A turn constrained vehicle is initially located inside a polygon region and desires to escape in minimum time. First, the method of characteristics is used to describe the time-optimal strategies for reaching a line of infinite length. Next, the approach is extended to polygons constructed of a series of line segments. Using this construction technique, the min-time path to reach each edge is obtained; the resulting minimum of the set of optimal trajectories is then selected for escaping the polygon.
comment: 7 Pages, 6 Figures, Submitted to IFAC ACC, DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited; AFRL-2024-5262. This work is funded in-part by AFOSR, LRIR 24RQCOR002
AI-Native Network Digital Twin for Intelligent Network Management in 6G
As a pivotal virtualization technology, network digital twin is expected to accurately reflect real-time status and abstract features in the on-going sixth generation (6G) networks. In this article, we propose an artificial intelligence (AI)-native network digital twin framework for 6G networks to enable the synergy of AI and network digital twin, thereby facilitating intelligent network management. In the proposed framework, AI models are utilized to establish network digital twin models to facilitate network status prediction, network pattern abstraction, and network management decision-making. Furthermore, potential solutions are proposed to enhance the performance of network digital twin. Finally, a case study is presented, followed by a discussion of open research issues that are essential for AI-native network digital twin in 6G networks.
comment: This article is submitted to IEEE Wireless Communications
Detection and suppression of epileptiform seizures via model-free control and derivatives in a noisy environment SC
Recent advances in control theory yield closed-loop neurostimulations for suppressing epileptiform seizures. These advances are illustrated by computer experiments which are easy to implement and to tune. The feedback synthesis is provided by an intelligent proportional-derivative (iPD) regulator associated to model-free control. This approach has already been successfully exploited in many concrete situations in engineering, since no precise computational modeling is needed. iPDs permit tracking a large variety of signals including high-amplitude epileptic activity. Those unpredictable pathological brain oscillations should be detected in order to avoid continuous stimulation, which might induce detrimental side effects. This is achieved by introducing a data mining method based on the maxima of the recorded signals. The real-time derivative estimation in a particularly noisy epileptiform environment is made possible due to a newly developed algebraic differentiator. The virtual patient is the Wendling model, i.e., a set of ordinary differential equations adapted from the Jansen-Rit neural mass model in order to generate epileptiform activity via appropriate values of excitation- and inhibition-related parameters. Several simulations, which lead to a large variety of possible scenarios, are discussed. They show the robustness of our control synthesis with respect to different virtual patients and external disturbances.
comment: 12th International Conference on Systems and Control (ICSC), Batna (Algeria), 3-5 November 2024
WiFi-CSI Sensing and Bearing Estimation in Multi-Robot Systems: An Open-Source Simulation Framework
Development and testing of multi-robot systems employing wireless signal-based sensing requires access to suitable hardware, such as channel monitoring WiFi transceivers, which can pose significant limitations. The WiFi Sensor for Robotics (WSR) toolbox, introduced by Jadhav et al. in 2022, provides a novel solution by using WiFi Channel State Information (CSI) to compute relative bearing between robots. The toolbox leverages the amplitude and phase of WiFi signals and creates virtual antenna arrays by exploiting the motion of mobile robots, eliminating the need for physical antenna arrays. However, the WSR toolbox's reliance on obsolescent WiFi transceiver hardware has limited its operability and accessibility, hindering broader application and development of relevant tools. We present an open-source simulation framework that replicates the WSR toolbox's capabilities using Gazebo and Matlab. By simulating WiFi-CSI data collection, our framework emulates the behavior of mobile robots equipped with the WSR toolbox, enabling precise bearing estimation without physical hardware. We validate the framework through experiments with both simulated and real Turtlebot3 robots, showing a close match between the obtained CSI data and the resulting bearing estimates. This work provides a virtual environment for developing and testing WiFi-CSI-based multi-robot localization without relying on physical hardware. All code and experimental setup information are publicly available at https://github.com/BrendanxP/CSI-Simulation-Framework
comment: 6+1 pages (text + references), 6 figures
Single versus Multi-Tone Wireless Power Transfer with Physically Large Array
Distributed beamforming is a key enabler to provide power wirelessly to a massive number of energy-neutral devices (ENDs). However, without prior information and with fully depleted ENDs, initially powering these devices efficiently is an open question. This work investigates and assesses the feasibility of harvesting sufficient energy to transmit a backscatter pilot signal from the END, which can then be used for coherent downlink transmission. We experimentally evaluated adaptive single-tone and multi-tone signals during initial charging. The results indicate that the response time for ENDs with unknown locations can extend to several tens of seconds. Notably, the adaptive single-tone excitation shows, among others, better performance at lower transmit power levels, providing a faster response. These findings underscore the potential of adaptive single-tone signals in optimizing power delivery to ENDs in future networks.
comment: 1st International Workshop on Energy Neutral and Sustainable IoT Devices and Infrastructure (EN-IoT 2024)
A Control Barrier Function Candidate for Limited Field of View Sensors
The problem of control based on vision measurements (bearings) has been amply studied in the literature; however, the problem of addressing the limits of the field of view of physical sensors has received relatively less attention (especially for agents with non-trivial dynamics). The technical challenge is that, as in most vision-based control approaches, a standard approach to the problem requires knowing the distance between cameras and observed features in the scene, which is not directly available. Instead, we present a solution based on a Control Barrier Function (CBF) approach that uses a splitting of the original differential constraint to effectively remove the dependence on the unknown measurement error. Compared to the current literature, our approach gives strong robustness guarantees against bounded distance estimation errors. We showcase the proposed solution with the numerical simulations of a double integrator and a quadrotor tracking a trajectory while keeping the corners of a rectangular gate in the camera field of view.
comment: 8 pages, conference paper
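As a rough illustration of the Control Barrier Function idea in the entry above, the following is a minimal sketch of a generic CBF safety filter. It is not the paper's field-of-view construction: it assumes a single-integrator agent (x_dot = u) that must stay inside a disc, with a hypothetical barrier h(x) = radius^2 - ||x||^2 and a hypothetical gain alpha. The filter returns the control closest to the nominal one that satisfies grad_h(x) . u + alpha * h(x) >= 0, using the closed-form projection onto that single half-space.

```python
def cbf_filter(x, u_nom, radius=1.0, alpha=1.0):
    """Minimally modify u_nom so the CBF condition holds (illustrative only).

    Assumed setup (not from the paper): single integrator x_dot = u,
    barrier h(x) = radius**2 - ||x||**2, which is positive inside the disc.
    """
    h = radius ** 2 - sum(xi * xi for xi in x)
    grad_h = [-2.0 * xi for xi in x]

    # CBF condition: grad_h . u + alpha * h >= 0
    slack = sum(g * u for g, u in zip(grad_h, u_nom)) + alpha * h
    norm_sq = sum(g * g for g in grad_h)

    # Nominal control already satisfies the condition (or gradient vanishes).
    if slack >= 0.0 or norm_sq == 0.0:
        return list(u_nom)

    # Project u_nom onto the boundary of the half-space {u : condition holds}.
    scale = slack / norm_sq
    return [u - scale * g for u, g in zip(u_nom, grad_h)]


if __name__ == "__main__":
    # Near the boundary and pushing outward: the command gets reduced.
    print(cbf_filter([0.9, 0.0], [1.0, 0.0]))
    # At the center: the nominal command passes through unchanged.
    print(cbf_filter([0.0, 0.0], [1.0, 0.0]))
```

With several constraints or input limits, the same condition would typically be posed as a small quadratic program rather than solved in closed form.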
Panopticus: Omnidirectional 3D Object Detection on Resource-constrained Edge Devices
3D object detection with omnidirectional views enables safety-critical applications such as mobile robot navigation. Such applications increasingly operate on resource-constrained edge devices, facilitating reliable processing without privacy concerns or network delays. To enable cost-effective deployment, cameras have been widely adopted as a low-cost alternative to LiDAR sensors. However, the compute-intensive workload to achieve high performance of camera-based solutions remains challenging due to the computational limitations of edge devices. In this paper, we present Panopticus, a carefully designed system for omnidirectional and camera-based 3D detection on edge devices. Panopticus employs an adaptive multi-branch detection scheme that accounts for spatial complexities. To optimize the accuracy within latency limits, Panopticus dynamically adjusts the model's architecture and operations based on available edge resources and spatial characteristics. We implemented Panopticus on three edge devices and conducted experiments across real-world environments based on the public self-driving dataset and our mobile 360° camera dataset. Experiment results showed that Panopticus improves accuracy by 62% on average given the strict latency objective of 33 ms. Also, Panopticus achieves a 2.1× latency reduction on average compared to baselines.
comment: Published at MobiCom 2024
Barycentric rational approximation for learning the index of a dynamical system from limited data
We consider the task of data-driven identification of dynamical systems, specifically for systems whose behavior at large frequencies is non-standard, as encoded by a non-trivial relative degree of the transfer function or, alternatively, a non-trivial index of a corresponding realization as a descriptor system. We develop novel surrogate modeling strategies that allow state-of-the-art rational approximation algorithms (e.g., AAA and vector fitting) to better handle data coming from such systems with non-trivial relative degree. Our contribution is twofold. On one hand, we describe a strategy to build rational surrogate models with prescribed relative degree, with the objective of mirroring the high-frequency behavior of the high-fidelity problem, when known. The surrogate model's desired degree is achieved through constraints on its barycentric coefficients, rather than through ad-hoc modifications of the rational form. On the other hand, we present a degree-identification routine that allows one to estimate the unknown relative degree of a system from low-frequency data. By identifying the degree of the system that generated the data, we can build a surrogate model that, in addition to matching the data well (at low frequencies), has enhanced extrapolation capabilities (at high frequencies). We showcase the effectiveness and robustness of the newly proposed method through a suite of numerical tests.
comment: 20 pages, 5 figures
An Analysis of Market-to-Market Coordination
The growing usage of renewable energy resources has introduced significant uncertainties in energy generation, increasing the challenges for Regional Transmission Operators (RTOs) in managing transmission congestion. To mitigate congestion that affects neighboring regions, RTOs employ a market-to-market (M2M) process through an iterative method, in which they exchange real-time security-constrained economic dispatch solutions and communicate requests for congestion relief. While this method provides economic benefits, it struggles with issues like power swings and time delays. To explore the full potential of M2M enhancements, in this paper, we first analyze the current M2M iterative method practice to better understand its efficacy and identify areas for improvement. Then, we explore enhancements and develop an ADMM method for the M2M coordination that optimizes congestion management. Specifically, our ADMM method can achieve a minimal cost that is the same as the cost obtained through a centralized model that optimizes multiple markets altogether. Our final case studies, across a comprehensive set of multi-area benchmark instances, demonstrate the superior performance of the proposed ADMM algorithm for the M2M process. As a by-product, we identify scenarios where the existing M2M process fails to provide solutions. Finally, the algorithm is implemented in an open-source package UnitCommitment.jl for easy access by a broader audience.
comment: 9 pages, 4 figures
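To make the ADMM idea in the entry above more concrete, here is a minimal sketch of consensus ADMM on a toy problem. It is not the paper's M2M formulation and does not use UnitCommitment.jl: it assumes two hypothetical regions that must agree on one shared scalar (think of a tie-line flow), each with an invented quadratic cost 0.5 * a_i * (x - c_i)^2. The point it illustrates is the one the abstract states, namely that the decentralized iterations can reach the same value as a centralized solve.

```python
def consensus_admm(a, c, rho=1.0, iterations=500):
    """Consensus ADMM for min sum_i 0.5*a[i]*(x - c[i])**2 (toy example).

    a, c : per-region cost parameters (hypothetical, for illustration).
    rho  : ADMM penalty parameter.
    Returns the agreed shared value z.
    """
    n = len(a)
    z = 0.0
    u = [0.0] * n  # scaled dual variables, one per region

    for _ in range(iterations):
        # Local step: each region minimizes its own cost plus a penalty
        # for deviating from the current consensus value.
        x = [(a[i] * c[i] + rho * (z - u[i])) / (a[i] + rho) for i in range(n)]
        # Coordination step: average the local proposals.
        z = sum(x[i] + u[i] for i in range(n)) / n
        # Dual step: accumulate each region's disagreement with consensus.
        u = [u[i] + x[i] - z for i in range(n)]
    return z


def centralized_optimum(a, c):
    """Closed-form minimizer of the same toy problem, for comparison."""
    return sum(ai * ci for ai, ci in zip(a, c)) / sum(a)


if __name__ == "__main__":
    a, c = [1.0, 3.0], [4.0, 0.0]
    print(consensus_admm(a, c), centralized_optimum(a, c))
```

In this toy setting each region only shares its proposal for the coupled variable, never its cost parameters, which is the structural reason ADMM-style schemes are attractive for coordination between separate operators.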
A Preventive-Corrective Scheme for Ensuring Power System Security During Active Wildfire Risks
The focus of this paper is on operating the electric power grid in a secure manner when wildfire risks are high. This is a challenging problem because of the uncertain ways in which the fires can impact the operation of the power system. To address this challenge, we propose a novel preventive-corrective coordinated decision-making scheme that quickly mitigates both static and dynamic insecurities given the risk of active wildfires in a region. The scheme utilizes a comprehensive contingency analysis tool for multi-asset outages that leverages: (i) a Feasibility Test algorithm which exhaustively desaturates overloaded cut-sets to prevent cascading line outages, and (ii) a data-driven transient stability analyzer which alleviates dynamic instabilities. This tool is then used to operate a coordinated unit commitment/optimal power flow model that is designed to adapt to varying risk levels associated with wildfires. Depending on the allowed risk, the model balances economical operation and grid robustness. The results obtained using the IEEE 118-bus system indicate that the proposed approach alleviates system vulnerabilities to wildfires while also minimizing operational cost.
comment: Submitted to the Open Access Journal of Power and Energy (OAJPE)
Aerial-based Crisis Management Center (ACMC)
Crisis management (CM) for critical infrastructures, natural disasters such as wildfires and hurricanes, terrorist actions, or civil unrest requires high speed communications and connectivity, and access to high performance computational resources to deliver timely dynamic responses to the crisis being managed by different first responders. CM systems should detect, recognize, and disseminate huge amounts of heterogeneous dynamic events that operate at different speeds and formats. Furthermore, the processing of crisis events and the development of real-time responses are major research challenges when the communications and computational resources needed by CM stakeholders are not available or severely degraded by the crisis. The main goal of the research presented in this paper is to utilize Unmanned Autonomous Systems (UAS) to provide an Aerial-based Crisis Management Center (ACMC) that will deliver the communications services and computational resources that are critically needed by first responders. In developing an ACMC architecture, we utilize a set of flexible Unmanned Aerial Systems (UAS) that can be dynamically composed to meet the communications and computational requirements of CM tasks. The ACMC services will be modeled as a deep neural network (DNN) mass transport approach to cover a distributed target in a decentralized manner. This is indeed a new decentralized coverage approach with time-varying communication weights. Furthermore, our analysis proves the stability and convergence of the proposed DNN-based mass transport for a team of UAS (e.g., quadcopters), where each quadcopter uses a feedback nonlinear control to independently attain the intended coverage trajectory in a decentralized manner.
Adaptive Invariant Extended Kalman Filter with Noise Covariance Tuning for Attitude Estimation
Attitude estimation is crucial in aerospace engineering, robotics, and virtual reality applications, but faces difficulties due to nonlinear system dynamics and sensor limitations. This paper addresses the challenge of attitude estimation using quaternion-based adaptive right invariant extended Kalman filtering (RI-EKF) that integrates data from inertial and magnetometer sensors. Our approach applies the expectation-maximization (EM) algorithm to estimate noise covariance, exploiting RI-EKF symmetry properties. We analyze the adaptive RI-EKF's stability, convergence, and accuracy, validating its performance through simulations and comparison with the left invariant EKF. Monte Carlo simulations validate the effectiveness of our noise covariance estimation technique across various window lengths.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Equality Constrained Diffusion for Direct Trajectory Optimization
The recent success of diffusion-based generative models in image and natural language processing has ignited interest in diffusion-based trajectory optimization for nonlinear control systems. Existing methods cannot, however, handle the nonlinear equality constraints necessary for direct trajectory optimization. As a result, diffusion-based trajectory optimizers are currently limited to shooting methods, where the nonlinear dynamics are enforced by forward rollouts. This precludes many of the benefits enjoyed by direct methods, including flexible state constraints, reduced numerical sensitivity, and easy initial guess specification. In this paper, we present a method for diffusion-based optimization with equality constraints. This allows us to perform direct trajectory optimization, enforcing dynamic feasibility with constraints rather than rollouts. To the best of our knowledge, this is the first diffusion-based optimization algorithm that supports the general nonlinear equality constraints required for direct trajectory optimization.
Latency Reduction in CloudVR: Cloud Prediction, Edge Correction
Current virtual reality (VR) headsets encounter a trade-off between high processing power and affordability. Consequently, offloading 3D rendering to remote servers helps reduce costs, battery usage, and headset weight. Maintaining network latency below 20ms is crucial to achieving this goal. Predicting future movement and prerendering are beneficial in meeting this tight latency bound. This paper proposes a method that utilizes the low-latency property of edge servers and the high resources available in cloud servers simultaneously to achieve cost-efficient, high-quality VR. In this method, head movement is predicted on the cloud server, and frames are rendered there and transmitted to the edge server. If the prediction error surpasses a threshold, the frame is re-rendered on the edge server. Results demonstrate that using this method, each edge server can efficiently serve up to 23 users concurrently, compared to a maximum of 5 users when rendering the frame entirely on the edge server. Furthermore, this paper shows that employing the Mean Absolute Error loss function and predicting acceleration rather than velocity significantly enhances prediction accuracy. Additionally, it is shown that normalizing individual data using its mean and standard deviation does not yield improvements in prediction accuracy. These findings provide insights into optimizing VR headset performance through cloud-edge collaboration.
comment: Virtual Reality, Edge Computing, Distributed Rendering, Prediction
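The cloud-prediction, edge-correction flow described in the entry above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Pose` type, the use of mean absolute error over yaw, pitch, and roll as the runtime error metric, and the threshold value are all hypothetical, since the abstract does not specify how the prediction error is measured at the edge.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    """Head orientation in degrees (hypothetical representation)."""
    yaw: float
    pitch: float
    roll: float


def mean_absolute_error(predicted: Pose, actual: Pose) -> float:
    """Average absolute per-axis difference between two poses."""
    diffs = (
        abs(predicted.yaw - actual.yaw),
        abs(predicted.pitch - actual.pitch),
        abs(predicted.roll - actual.roll),
    )
    return sum(diffs) / len(diffs)


def choose_frame_source(predicted: Pose, actual: Pose, threshold_deg: float) -> str:
    """Decide which frame to show the user.

    "cloud": the frame pre-rendered in the cloud for the predicted pose
             is close enough and is forwarded as-is.
    "edge":  the prediction error exceeds the threshold, so the edge
             server re-renders the frame for the actual pose.
    """
    error = mean_absolute_error(predicted, actual)
    return "edge" if error > threshold_deg else "cloud"


if __name__ == "__main__":
    print(choose_frame_source(Pose(10.0, 0.0, 0.0), Pose(10.5, 0.0, 0.0), 2.0))
    print(choose_frame_source(Pose(10.0, 0.0, 0.0), Pose(25.0, 0.0, 0.0), 2.0))
```

The fraction of frames that fall back to "edge" is what would determine how many users one edge server can serve, so the threshold trades visual accuracy against edge load.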
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment EMNLP 2024
Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax": a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness). However, existing alignment techniques are mostly unidirectional, leading to suboptimal trade-offs and poor flexibility over various objectives. To navigate this challenge, we argue for the importance of grounding LLMs with evident preferences. We introduce controllable preference optimization (CPO), which explicitly specifies preference scores for different objectives, thereby guiding the model to generate responses that meet the requirements. Our experimental analysis reveals that the aligned models can provide responses that match various preferences among the "3H" (helpfulness, honesty, harmlessness) desiderata. Furthermore, by introducing diverse data and alignment goals, we surpass baseline methods in aligning with single objectives, hence mitigating the impact of the alignment tax and achieving Pareto improvements in multi-objective alignment.
comment: EMNLP 2024 main conference
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Surrogate models are used to predict the behavior of complex energy systems that are too expensive to simulate with traditional numerical methods. Our work introduces the use of language descriptions, which we call "system captions" or SysCaps, to interface with such surrogates. We argue that interacting with surrogates through text, particularly natural language, makes these models more accessible for both experts and non-experts. We introduce a lightweight multimodal text and timeseries regression model and a training pipeline that uses large language models (LLMs) to synthesize high-quality captions from simulation metadata. Our experiments on two real-world simulators of buildings and wind farms show that our SysCaps-augmented surrogates have better accuracy on held-out systems than traditional methods while enjoying new generalization abilities, such as handling semantically related descriptions of the same test system. Additional experiments also highlight the potential of SysCaps to unlock language-driven design space exploration and to regularize training through prompt augmentation.
comment: 21 pages. Under review
Loss of Control Prevention of an Agile Aircraft: Dynamic Command Saturation Approach
Preventing loss of control in agile aircraft during extreme maneuvers is a concern because of the nonlinear aerodynamics and flight dynamics of the aircraft considered in this study. Within this context, the primary objective is to present an architectural framework and elucidate the methodology for its determination. This architecture enables agile maneuvering aircraft to execute more extreme maneuvers while avoiding departure from stable flight, surpassing the maneuverability capabilities of conventional state limiters. Hence, the notion of an incremental attainable moment set is introduced for an instantaneous controllability investigation using demanded control moment coefficients derived in the high-level controller, which is the incremental nonlinear dynamic inversion. In the event of detecting a violation of controllability boundaries, Lyapunov-based dynamic command saturation is employed to limit pilot commands, preventing the aircraft from initiating departure from stable flight. As a result, abrupt and excessive pilot inputs are dynamically softened in-flight, and likely departure tendencies are mitigated. Consequently, the superiority of the proposed method over conventional state limiters is demonstrated through flight simulations of agile and abrupt maneuvers, as well as Monte Carlo simulations that show an expansion of stable maneuverable volumes of up to 55%.
Safe and Stable Formation Control with Distributed Multi-Agents Using Adaptive Control and Control Barrier Functions
This manuscript considers the problem of ensuring stability and safety during formation control with distributed multi-agent systems in the presence of parametric uncertainty in the dynamics and limited communication. We propose an integrative approach that combines Control Barrier Functions, Adaptive Control, and connected graphs. A reference model is designed so as to ensure a safe and stable formation control strategy. This is combined with a provably correct adaptive control design that includes the use of a CBF-based safety filter that suitably generates safe reference commands. Numerical examples are provided to support the theoretical derivations.
comment: Under Review - American Control Conference 2025
Co-investment with Payoff Sharing Benefit Operators and Users in Network Design
Network-based complex systems are inherently interconnected, with the design and performance of subnetworks being interdependent. However, the decisions of self-interested operators may lead to suboptimal outcomes for users. In this paper, we consider the question of what cooperative mechanisms can benefit both operators and users simultaneously. We address this question in a game theoretical setting, integrating both non-cooperative and cooperative game theory. During the non-cooperative stage, subnetwork decision-makers strategically design their local networks. In the cooperative stage, the co-investment mechanism and the payoff-sharing mechanism are developed to enlarge collective benefits and fairly distribute them. A case study of the Sioux Falls network is conducted to demonstrate the efficiency of the proposed framework. The impact of this interactive network design on environmental sustainability, social welfare and economic efficiency is evaluated, along with an examination of scenarios involving regions with heterogeneous characteristics.
comment: 8 pages, 6 figures
Heterogeneous Multi-Agent Reinforcement Learning for Zero-Shot Scalable Collaboration
The emergence of multi-agent reinforcement learning (MARL) is significantly transforming various fields like autonomous vehicle networks. However, real-world multi-agent systems typically contain multiple roles, and the scale of these systems dynamically fluctuates. Consequently, in order to achieve zero-shot scalable collaboration, it is essential that strategies for different roles can be updated flexibly according to the scales, which is still a challenge for current MARL frameworks. To address this, we propose a novel MARL framework named Scalable and Heterogeneous Proximal Policy Optimization (SHPPO), integrating heterogeneity into parameter-shared PPO-based MARL networks. We first leverage a latent network to learn strategy patterns for each agent adaptively. Second, we introduce a heterogeneous layer to be inserted into decision-making networks, whose parameters are specifically generated by the learned latent variables. Our approach is scalable as all the parameters are shared except for the heterogeneous layer, and gains both inter-individual and temporal heterogeneity, allowing SHPPO to adapt effectively to varying scales. SHPPO exhibits superior performance in classic MARL environments like Starcraft Multi-Agent Challenge (SMAC) and Google Research Football (GRF), showcasing enhanced zero-shot scalability, and offering insights into the learned latent variables' impact on team performance by visualization.
Sensory Glove-Based Surgical Robot User Interface ICRA
Robotic surgery has reached a high level of maturity and has become an integral part of standard surgical care. However, existing surgeon consoles are bulky, take up valuable space in the operating room, and make surgical team coordination challenging, and their proprietary nature makes it difficult to take advantage of recent technological advances, especially in virtual and augmented reality. One potential area for further improvement is the integration of modern sensory gloves into robotic platforms, allowing surgeons to control robotic arms intuitively with their hand movements. We propose one such system that combines an HTC Vive tracker, a Manus Meta Prime 3 XR sensory glove, and SCOPEYE wireless smart glasses. The system controls one arm of a da Vinci surgical robot. In addition to moving the arm, the surgeon can use fingers to control the end-effector of the surgical instrument. Hand gestures are used to implement clutching and similar functions. In particular, we introduce clutching of the instrument orientation, a functionality unavailable in the da Vinci system. The vibrotactile elements of the glove are used to provide feedback to the user when gesture commands are invoked. A qualitative and quantitative evaluation has been conducted that compares the current device with the dVRK console. The system is shown to have excellent tracking accuracy, and the new interface allows surgeons to perform common surgical training tasks efficiently with minimal practice.
comment: 6 pages, 4 figures, 7 tables, submitted to International Conference on Robotics and Automation (ICRA) 2025
Closed-loop Diffusion Control of Complex Physical Systems
The control problems of complex physical systems have broad applications in science and engineering. Previous studies have shown that generative control methods based on diffusion models offer significant advantages for solving these problems. However, existing generative control approaches face challenges in both performance and efficiency when extended to the closed-loop setting, which is essential for effective control. In this paper, we propose an efficient Closed-Loop Diffusion method for Physical systems Control (CL-DiffPhyCon). By employing an asynchronous denoising framework for different physical time steps, CL-DiffPhyCon generates control signals conditioned on real-time feedback from the environment with significantly reduced computational cost during sampling. Additionally, the control process could be further accelerated by incorporating fast sampling techniques, such as DDIM. We evaluate CL-DiffPhyCon on two tasks: 1D Burgers' equation control and 2D incompressible fluid control. The results demonstrate that CL-DiffPhyCon achieves superior control performance with significant improvements in sampling efficiency.
Auction designs to increase incentive compatibility and reduce self-scheduling in electricity markets
The system operator's scheduling problem in electricity markets, called unit commitment, is a non-convex mixed-integer program. The optimal value function is non-convex, preventing the application of traditional marginal pricing theory to find prices that clear the market and incentivize market participants to follow the dispatch schedule. Units that perceive the opportunity to make a profit may be incentivized to self-commit (submitting an offer with zero fixed operating costs) or self-schedule their production (submitting an offer with zero total cost). We simulate bidder behavior to show that market power can be exercised by self-committing/scheduling. Agents can learn to increase their profits via a reinforcement learning algorithm without explicit knowledge of the costs or strategies of other agents. We investigate different non-convex pricing models over a multi-period commitment window simulating the day-ahead market and show that convex hull pricing can reduce producer incentives to deviate from the central dispatch decision. In a realistic test system with approximately 1000 generators, we find strategic bidding under the restricted convex model can increase total producer profits by 4.4% and decrease lost opportunity costs by 2/3. While the cost to consumers with convex hull pricing is higher at the competitive solution, the cost to consumers is higher with the restricted convex model after strategic bidding.
comment: Updated author affiliation
On the Sum Secrecy Rate Maximisation for Wireless Vehicular Networks
Wireless communications form the backbone of future vehicular networks, playing a critical role in applications ranging from traffic control to vehicular road safety. However, the dynamic structure of these networks creates security vulnerabilities, making security considerations an integral part of network design. We address these security concerns from a physical layer security aspect by investigating achievable secrecy rates in wireless vehicular networks. Specifically, we aim to maximize the sum secrecy rate from all vehicular pairs subject to bandwidth and power resource constraints. For the considered problem, we first propose a solution based on the successive convex approximation (SCA) method, which has not been applied in this context before. To further reduce the complexity of the SCA-based method, we also propose a low-complexity solution based on a fast iterative shrinkage-thresholding algorithm (FISTA). Our simulation results for SCA and FISTA show a trade-off between convergence and runtime. While the SCA method achieves better convergence, the FISTA-based approach is at least 300 times faster than the SCA method.
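For readers unfamiliar with FISTA, the low-complexity method named in the entry above, the following is a minimal sketch of the generic algorithm. It is not the paper's secrecy-rate formulation: it assumes a hypothetical composite objective f(x) + lam * ||x||_1 with a smooth f whose gradient and Lipschitz constant are supplied by the caller, and it uses the standard soft-thresholding proximal step with Nesterov-style momentum.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fista(grad_f, lipschitz, lam, x0, iterations=200):
    """Generic FISTA for min f(x) + lam*||x||_1 (illustrative sketch).

    grad_f    : callable returning the gradient of the smooth part f.
    lipschitz : Lipschitz constant of grad_f (sets the step size).
    """
    x = list(x0)
    y = list(x0)  # extrapolated point where the gradient is evaluated
    t = 1.0
    step = 1.0 / lipschitz

    for _ in range(iterations):
        g = grad_f(y)
        # Gradient step on f followed by the l1 proximal step.
        x_next = [soft_threshold(yi - step * gi, step * lam)
                  for yi, gi in zip(y, g)]
        # Momentum update that gives FISTA its accelerated rate.
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        momentum = (t - 1.0) / t_next
        y = [xn + momentum * (xn - xo) for xn, xo in zip(x_next, x)]
        x, t = x_next, t_next
    return x


if __name__ == "__main__":
    # Toy smooth part: f(x) = 0.5 * ||x - c||^2, so grad f(x) = x - c, L = 1.
    c = [3.0, 0.5, -2.0]
    print(fista(lambda x: [xi - ci for xi, ci in zip(x, c)], 1.0, 1.0, [0.0] * 3))
```

Each iteration needs only one gradient evaluation and an elementwise shrinkage, which is the kind of per-iteration cost that makes first-order methods fast relative to solving a convex subproblem at every step, as SCA does.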
Generalized Lyapunov conditions for k-contraction: analysis and feedback design
Recently, the concept of k-contraction has been introduced as a promising generalization of contraction for dynamical systems. However, the study of k-contraction properties has faced significant challenges due to the reliance on complex mathematical objects called matrix compounds. As a result, related control design methodologies have yet to appear in the literature. In this paper, we overcome existing limitations and propose new sufficient conditions for k-contraction which do not require computing matrix compounds. Notably, these conditions are also necessary in the linear time-invariant framework. Leveraging these findings, we propose a feedback design methodology for both the linear and the nonlinear scenarios which can be used to enforce k-contractivity properties on the closed-loop dynamics.
On Continuous Full-Order Integral-Terminal Sliding Mode Control with Unknown A Priori Bound on Uncertainty
This study aims at providing a solution to the problem of designing a continuous and finite-time control for a class of nonlinear systems in the presence of matched uncertainty with an unknown a priori bound. First, we propose a Full-Order Integral-Terminal Sliding Manifold (FOITSM) with a conventional (discontinuous) sliding mode to show that it provides the combined attributes of the nonsingular terminal and integral sliding mode algorithms. Secondly, an Adaptive Disturbance Observer (ADO) has been designed to alleviate the effect of the uncertainty acting on the system. On application of the ADO-based Full-Order Integral-Terminal Sliding Mode Control (FOITSMC), the chattering phenomenon in control input has been reduced substantially in the presence of conditionally known matched disturbances. Moreover, the adaptive gains of ADO are updated non-monotonically without over-bounding the acting disturbance, yet sustain the global boundedness of state trajectories within a specific bound.
comment: 26 pages, 5 figures
A Parallel-in-Time Newton's Method for Nonlinear Model Predictive Control
Model predictive control (MPC) is a powerful framework for optimal control of dynamical systems. However, MPC solvers suffer from a high computational burden that restricts their application to systems with low sampling frequency. This issue is further amplified in nonlinear and constrained systems that require nesting MPC solvers within iterative procedures. In this paper, we address these issues by developing parallel-in-time algorithms for constrained nonlinear optimization problems that take advantage of massively parallel hardware to achieve logarithmic computational time scaling over the planning horizon. We develop time-parallel second-order solvers based on interior point methods and the alternating direction method of multipliers, leveraging fast convergence and lower computational cost per iteration. The parallelization is based on a reformulation of the subproblems in terms of associative operations that can be parallelized using the associative scan algorithm. We validate our approach on numerical examples of nonlinear and constrained dynamical systems.
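The idea of rewriting a sequential recursion as an associative operation can be sketched on a much simpler problem than the paper's solvers: a scalar affine recurrence x_{t+1} = a_t x_t + b_t. Composition of affine maps is associative, so all prefixes can be computed in a logarithmic number of rounds. The names `compose`, `scan`, and `rollout` are hypothetical, and the recursive-doubling scheme is one possible scan, not necessarily the one the authors use.

```python
from fractions import Fraction
from typing import List, Tuple

Affine = Tuple[Fraction, Fraction]  # (a, b) represents the map x -> a*x + b


def compose(first: Affine, second: Affine) -> Affine:
    """Affine map equal to applying `first`, then `second`; this operation is associative."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def scan(maps: List[Affine]) -> List[Affine]:
    """Inclusive prefix scan by recursive doubling: about log2(T) rounds for T maps."""
    out = list(maps)
    step = 1
    while step < len(out):
        # Entries within one round do not depend on each other,
        # so parallel hardware could evaluate them simultaneously.
        out = [out[i] if i < step else compose(out[i - step], out[i])
               for i in range(len(out))]
        step *= 2
    return out


def rollout(maps: List[Affine], x0: Fraction) -> List[Fraction]:
    """Sequential reference: apply the maps one after another."""
    states = []
    x = x0
    for a, b in maps:
        x = a * x + b
        states.append(x)
    return states
```

Applying the k-th scanned map to the initial state gives the state after k+1 steps, so the scan output can be compared directly with `rollout`.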
CaΣoS: A nonlinear sum-of-squares optimization suite
We present CaΣoS, the first MATLAB software specifically designed for nonlinear sum-of-squares optimization. A symbolic polynomial algebra system allows users to formulate parametrized sum-of-squares optimization problems and facilitates their fast, repeated evaluations. To that end, we make use of CasADi's symbolic framework and realize concepts of monomial sparsity, linear operators (including duals), and functions between polynomials. CaΣoS currently provides interfaces to the conic solvers SeDuMi, Mosek, and SCS as well as methods to solve quasiconvex optimization problems (via bisection) and nonconvex optimization problems (via sequential convexification). Numerical examples for benchmark problems including region-of-attraction and reachable set estimation for nonlinear dynamic systems demonstrate significant improvements in computation time compared to existing toolboxes. CaΣoS is available open-source at https://github.com/ifr-acso/casos.
comment: Submitted to 2025 American Control Conference
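The bisection approach to quasiconvex problems mentioned in this abstract can be sketched generically: fix a level t, ask a feasibility oracle whether the level is attainable, and halve the bracket. The sketch below is in Python and does not use the CaΣoS API; the function name `bisect_min_feasible` and the toy oracle in the usage note are assumptions. In a sum-of-squares setting each oracle call would be a convex feasibility problem handed to a conic solver.

```python
from typing import Callable


def bisect_min_feasible(feasible: Callable[[float], bool],
                        lo: float, hi: float, tol: float = 1e-6) -> float:
    """Approximate the smallest t in [lo, hi] with feasible(t) True.

    Assumes monotonicity: if t is feasible, every larger t is feasible too.
    """
    if not feasible(hi):
        raise ValueError("the upper bound must be feasible")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid   # mid is attainable, so the optimum is at or below it
        else:
            lo = mid   # mid is not attainable, so the optimum is above it
    return hi
```

With the stand-in oracle `lambda t: t * t >= 2.0` on the bracket [0, 2], the returned value approximates the square root of 2 to within the tolerance.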
Robust Data-EnablEd Predictive Leading Cruise Control via Reachability Analysis
Data-driven predictive control promises model-free wave-dampening strategies for Connected and Autonomous Vehicles (CAVs) in mixed traffic flow. However, its performance relies on data quality, which suffers from unknown noise and disturbances. This paper introduces a Robust Data-EnablEd Predictive Leading Cruise Control (RDeeP-LCC) method based on reachability analysis, aiming to achieve safe and optimal CAV control under bounded process noise and external disturbances. Precisely, the matrix zonotope set technique and Willems' Fundamental Lemma are employed to derive the over-approximated system dynamics directly from data, and a data-driven feedback control technique is utilized to obtain an additional feedback input for stability. We decouple the mixed platoon into an error system and a nominal system, where the error system provides data-driven reachability sets for the enhanced safety constraints in the nominal system. Finally, a data-driven predictive control framework is formulated in a tube-based control manner for robustness guarantees. Nonlinear simulations with noise-corrupted data demonstrate that the proposed method outperforms baseline methods in mitigating traffic waves.
comment: 8 pages, 4 figures
Systems and Control (EESS)
Multi-Robot Trajectory Generation via Consensus ADMM: Convex vs. Non-Convex
C-ADMM is a well-known distributed optimization framework due to its guaranteed convergence in convex optimization problems. Recently, C-ADMM has been studied in robotics applications such as multi-vehicle target tracking and collaborative manipulation tasks. However, few works have investigated the performance of C-ADMM applied to non-convex problems in robotics applications due to a lack of theoretical guarantees. For this project, we aim to quantitatively explore and examine the convergence behavior of non-convex C-ADMM through the scope of distributed multi-robot trajectory planning. We propose a convex trajectory planning problem by leveraging C-ADMM and Buffered Voronoi Cells (BVCs) to get around the non-convex collision avoidance constraint and compare this convex C-ADMM algorithm to a non-convex C-ADMM baseline with non-convex collision avoidance constraints. We show that the convex C-ADMM algorithm requires 1000 fewer iterations to achieve convergence in a multi-robot waypoint navigation scenario. We also confirm that the non-convex C-ADMM baseline leads to sub-optimal solutions and violation of safety constraints in trajectory generation.
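For readers unfamiliar with consensus ADMM, the three update steps can be sketched on a convex toy problem in which each agent holds a scalar quadratic cost. This is not the paper's trajectory planner: the costs, the function name `consensus_admm`, and the penalty value are assumptions chosen so that every step has a closed form.

```python
from typing import List


def consensus_admm(a: List[float], rho: float = 1.0, iters: int = 100) -> float:
    """Agents with costs f_i(x) = 0.5*(x - a[i])**2 agree on the minimizer of their sum."""
    n = len(a)
    x = [0.0] * n   # local copies of the decision variable
    u = [0.0] * n   # scaled dual variables, one per agent
    z = 0.0         # shared consensus variable
    for _ in range(iters):
        # Local step: argmin_x f_i(x) + (rho/2)*(x - z + u_i)**2, in closed form.
        x = [(ai + rho * (z - ui)) / (1.0 + rho) for ai, ui in zip(a, u)]
        # Consensus step: average the local copies plus their duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with the consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z
```

For these quadratic costs the minimizer of the sum is the mean of the values in `a`, which gives a simple reference to check the iterates against.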
Effects of eco-driving on energy consumption and battery degradation for electric vehicles at signalized intersections
Eco-driving has been shown to reduce energy consumption for electric vehicles (EVs). Such strategies can also be implemented to both reduce energy consumption and improve battery lifetime. This study considers the eco-driving of a connected electric vehicle equipped with vehicle-to-infrastructure (V2I) communication passing through two signalized intersections. Dynamic programming is employed to construct an eco-driving algorithm that incorporates a battery degradation model in addition to minimizing energy consumption to optimize the vehicle's speed trajectory while transiting the control zone. A parametric study is conducted for various signal timings and distances between the two intersections. It is found that eco-driving can provide up to 49% in cost benefits over regular driving due to energy savings and improved battery life, which could boost consumers' interest in EVs. This study also considered different battery capacity decay rates based on battery chemistry. Although a higher decay rate affects the optimal speed trajectories only slightly, it amplifies the benefits of eco-driving on battery life. Two battery sizes were also studied to show that the larger battery is associated with a drastically increased lifetime, thus creating opportunities for electric vehicles in other applications such as vehicle-to-grid (V2G) integration. Field tests were also conducted using a simplified rule-based version of the eco-driving algorithm implemented as a phone app which issues audio speed recommendations to the driver. The field test results were promising and validated the results from simulations. The phone app implementation is convenient and could facilitate broader adoption and widespread use of eco-driving, which helps to improve transportation efficiency and protect the environment.
comment: 14 pages, 12 figures
A Microgrid Deployment Framework to Support Drayage Electrification
The electrification of heavy-duty commercial vehicles (HDCVs) is pivotal in reducing greenhouse gas emissions and urban air pollution; however, this transition poses significant challenges for the existing electric grid, which is not designed to meet the high electricity demands of HDCVs. This can lead to a less effective reduction in freight transportation's carbon intensity despite significant electrification efforts. Deploying renewable energy sources, such as photovoltaics, alongside energy storage solutions, is essential to address these challenges. This paper examines the current grid limitations and explores the critical role of microgrid deployment, integrating solar and battery energy storage systems, in supporting the electrification of HDCVs. We propose an integrated framework that is designed to enhance regional grid capacity and decrease carbon intensity by identifying viable sites where a microgrid can be deployed and providing estimates for the deployment cost. Furthermore, using this framework, we quantify the maximal impact of microgrid deployment in reducing CO2 emissions when we optimize the use of the available power. As a demonstration, we apply our framework to the region of the Port of Savannah, GA, USA.
comment: 59 pages, 6 figures
Optimal Control of Fractional Punishment in Optional Public Goods Game
Punishment is probably the most frequently used mechanism to increase cooperation in Public Goods Games (PGG); however, it is expensive. To address this problem, this paper introduces an optimal control problem that uses fractional punishment to promote cooperation. We present a series of computational experiments illustrating the effects of single and combined terms of the optimization cost function. In the findings, the optimal controller outperforms the use of constant fractional punishment and gives an insight into the period and size of the penalization to be implemented with respect to the defection in the game.
Min-Time Escape of a Dubins Car from a Polygon
A turn-constrained vehicle is initially located inside a polygonal region and seeks to escape in minimum time. First, the method of characteristics is used to describe the time-optimal strategies for reaching a line of infinite length. Next, the approach is extended to polygons constructed of a series of line segments. Using this construction technique, the min-time path to reach each edge is obtained; the fastest of these optimal trajectories is then selected for escaping the polygon.
comment: 7 Pages, 6 Figures, Submitted to IFAC ACC, DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited; AFRL-2024-5262. This work is funded in-part by AFOSR, LRIR 24RQCOR002
AI-Native Network Digital Twin for Intelligent Network Management in 6G
As a pivotal virtualization technology, network digital twin is expected to accurately reflect real-time status and abstract features in the on-going sixth generation (6G) networks. In this article, we propose an artificial intelligence (AI)-native network digital twin framework for 6G networks to enable the synergy of AI and network digital twin, thereby facilitating intelligent network management. In the proposed framework, AI models are utilized to establish network digital twin models to facilitate network status prediction, network pattern abstraction, and network management decision-making. Furthermore, potential solutions are proposed to enhance the performance of network digital twin. Finally, a case study is presented, followed by a discussion of open research issues that are essential for AI-native network digital twin in 6G networks.
comment: This article is submitted to IEEE Wireless Communications
Detection and suppression of epileptiform seizures via model-free control and derivatives in a noisy environment SC
Recent advances in control theory yield closed-loop neurostimulations for suppressing epileptiform seizures. These advances are illustrated by computer experiments which are easy to implement and to tune. The feedback synthesis is provided by an intelligent proportional-derivative (iPD) regulator associated with model-free control. This approach has already been successfully exploited in many concrete situations in engineering, since no precise computational modeling is needed. iPDs permit tracking a large variety of signals including high-amplitude epileptic activity. Those unpredictable pathological brain oscillations should be detected in order to avoid continuous stimulation, which might induce detrimental side effects. This is achieved by introducing a data mining method based on the maxima of the recorded signals. The real-time derivative estimation in a particularly noisy epileptiform environment is made possible by a newly developed algebraic differentiator. The virtual patient is the Wendling model, i.e., a set of ordinary differential equations adapted from the Jansen-Rit neural mass model in order to generate epileptiform activity via appropriate values of excitation- and inhibition-related parameters. Several simulations, which lead to a large variety of possible scenarios, are discussed. They show the robustness of our control synthesis with respect to different virtual patients and external disturbances.
comment: 12th International Conference on Systems and Control (ICSC), Batna (Algeria), 3-5 November 2024
WiFi-CSI Sensing and Bearing Estimation in Multi-Robot Systems: An Open-Source Simulation Framework
Development and testing of multi-robot systems employing wireless signal-based sensing requires access to suitable hardware, such as channel monitoring WiFi transceivers, which can pose significant limitations. The WiFi Sensor for Robotics (WSR) toolbox, introduced by Jadhav et al. in 2022, provides a novel solution by using WiFi Channel State Information (CSI) to compute relative bearing between robots. The toolbox leverages the amplitude and phase of WiFi signals and creates virtual antenna arrays by exploiting the motion of mobile robots, eliminating the need for physical antenna arrays. However, the WSR toolbox's reliance on obsolescent WiFi transceiver hardware has limited its operability and accessibility, hindering broader application and development of relevant tools. We present an open-source simulation framework that replicates the WSR toolbox's capabilities using Gazebo and Matlab. By simulating WiFi-CSI data collection, our framework emulates the behavior of mobile robots equipped with the WSR toolbox, enabling precise bearing estimation without physical hardware. We validate the framework through experiments with both simulated and real Turtlebot3 robots, showing a close match between the obtained CSI data and the resulting bearing estimates. This work provides a virtual environment for developing and testing WiFi-CSI-based multi-robot localization without relying on physical hardware. All code and experimental setup information are publicly available at https://github.com/BrendanxP/CSI-Simulation-Framework
comment: 6+1 pages (text + references), 6 figures
Single versus Multi-Tone Wireless Power Transfer with Physically Large Array
Distributed beamforming is a key enabler to provide power wirelessly to a massive number of energy-neutral devices (ENDs). However, without prior information and with fully depleted ENDs, initially powering these devices efficiently is an open question. This work investigates and assesses the feasibility of harvesting sufficient energy to transmit a backscatter pilot signal from the END, which can then be used for coherent downlink transmission. We experimentally evaluated adaptive single-tone and multi-tone signals during initial charging. The results indicate that the response time for ENDs with unknown locations can extend to several tens of seconds. Notably, the adaptive single-tone excitation shows, among others, better performance at lower transmit power levels, providing a faster response. These findings underscore the potential of adaptive single-tone signals in optimizing power delivery to ENDs in future networks.
comment: 1st International Workshop on Energy Neutral and Sustainable IoT Devices and Infrastructure (EN-IoT 2024)
A Control Barrier Function Candidate for Limited Field of View Sensors
The problem of control based on vision measurements (bearings) has been amply studied in the literature; however, the problem of addressing the limits of the field of view of physical sensors has received relatively less attention (especially for agents with non-trivial dynamics). The technical challenge is that, as in most vision-based control approaches, a standard approach to the problem requires knowing the distance between cameras and observed features in the scene, which is not directly available. Instead, we present a solution based on a Control Barrier Function (CBF) approach that uses a splitting of the original differential constraint to effectively remove the dependence on the unknown measurement error. Compared to the current literature, our approach gives strong robustness guarantees against bounded distance estimation errors. We showcase the proposed solution with the numerical simulations of a double integrator and a quadrotor tracking a trajectory while keeping the corners of a rectangular gate in the camera field of view.
comment: 8 pages, conference paper
Panopticus: Omnidirectional 3D Object Detection on Resource-constrained Edge Devices
3D object detection with omnidirectional views enables safety-critical applications such as mobile robot navigation. Such applications increasingly operate on resource-constrained edge devices, facilitating reliable processing without privacy concerns or network delays. To enable cost-effective deployment, cameras have been widely adopted as a low-cost alternative to LiDAR sensors. However, the compute-intensive workload to achieve high performance of camera-based solutions remains challenging due to the computational limitations of edge devices. In this paper, we present Panopticus, a carefully designed system for omnidirectional and camera-based 3D detection on edge devices. Panopticus employs an adaptive multi-branch detection scheme that accounts for spatial complexities. To optimize the accuracy within latency limits, Panopticus dynamically adjusts the model's architecture and operations based on available edge resources and spatial characteristics. We implemented Panopticus on three edge devices and conducted experiments across real-world environments based on the public self-driving dataset and our mobile 360° camera dataset. Experiment results showed that Panopticus improves accuracy by 62% on average given the strict latency objective of 33ms. Also, Panopticus achieves a 2.1× latency reduction on average compared to baselines.
comment: Published at MobiCom 2024
Barycentric rational approximation for learning the index of a dynamical system from limited data
We consider the task of data-driven identification of dynamical systems, specifically for systems whose behavior at large frequencies is non-standard, as encoded by a non-trivial relative degree of the transfer function or, alternatively, a non-trivial index of a corresponding realization as a descriptor system. We develop novel surrogate modeling strategies that allow state-of-the-art rational approximation algorithms (e.g., AAA and vector fitting) to better handle data coming from such systems with non-trivial relative degree. Our contribution is twofold. On one hand, we describe a strategy to build rational surrogate models with prescribed relative degree, with the objective of mirroring the high-frequency behavior of the high-fidelity problem, when known. The surrogate model's desired degree is achieved through constraints on its barycentric coefficients, rather than through ad-hoc modifications of the rational form. On the other hand, we present a degree-identification routine that allows one to estimate the unknown relative degree of a system from low-frequency data. By identifying the degree of the system that generated the data, we can build a surrogate model that, in addition to matching the data well (at low frequencies), has enhanced extrapolation capabilities (at high frequencies). We showcase the effectiveness and robustness of the newly proposed method through a suite of numerical tests.
comment: 20 pages, 5 figures
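To make the link between barycentric coefficients and high-frequency behavior concrete, the sketch below evaluates a rational function in barycentric form, r(z) = (sum_j w_j f_j / (z - z_j)) / (sum_j w_j / (z - z_j)). For large |z| the numerator behaves like (sum_j w_j f_j) / z and the denominator like (sum_j w_j) / z, so requiring sum_j w_j f_j = 0 while sum_j w_j is nonzero forces r to decay at least like 1/z. This particular constraint is an assumption used for illustration and may differ from the formulation in the paper; the name `bary_eval` is hypothetical.

```python
from fractions import Fraction
from typing import Sequence


def bary_eval(z: Fraction,
              nodes: Sequence[Fraction],
              values: Sequence[Fraction],
              weights: Sequence[Fraction]) -> Fraction:
    """Evaluate the barycentric rational form at a point z that is not a support node."""
    num = sum(w * f / (z - zj) for zj, f, w in zip(nodes, values, weights))
    den = sum(w / (z - zj) for zj, w in zip(nodes, weights))
    return num / den


# Toy data: samples of f(z) = 1/(z + 1) at the support nodes 0 and 1.
NODES = [Fraction(0), Fraction(1)]
VALUES = [Fraction(1), Fraction(1, 2)]
# Weights chosen so that sum_j w_j * f_j = 0 (decay at infinity) and sum_j w_j != 0.
WEIGHTS = [Fraction(1), Fraction(-2)]
```

With these weights the barycentric form reproduces 1/(z + 1) exactly, a function of relative degree one, which can be confirmed by evaluating `bary_eval` at a few points away from the nodes.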
An Analysis of Market-to-Market Coordination
The growing usage of renewable energy resources has introduced significant uncertainties in energy generation, increasing the challenges for Regional Transmission Operators (RTOs) in managing transmission congestion. To mitigate congestion that affects neighboring regions, RTOs employ a market-to-market (M2M) process through an iterative method, in which they exchange real-time security-constrained economic dispatch solutions and communicate requests for congestion relief. While this method provides economic benefits, it struggles with issues like power swings and time delays. To explore the full potential of M2M enhancements, in this paper, we first analyze the current M2M iterative method practice to better understand its efficacy and identify areas for improvement. Then, we explore enhancements and develop an ADMM method for the M2M coordination that optimizes congestion management. Specifically, our ADMM method can achieve a minimal cost that is the same as the cost obtained through a centralized model that optimizes multiple markets altogether. Our final case studies, across a comprehensive set of multi-area benchmark instances, demonstrate the superior performance of the proposed ADMM algorithm for the M2M process. As a by-product, we identify scenarios where the existing M2M process fails to provide solutions. Finally, the algorithm is implemented in an open-source package UnitCommitment.jl for easy access by a broader audience.
comment: 9 pages, 4 figures
A Preventive-Corrective Scheme for Ensuring Power System Security During Active Wildfire Risks
The focus of this paper is on operating the electric power grid in a secure manner when wildfire risks are high. This is a challenging problem because of the uncertain ways in which the fires can impact the operation of the power system. To address this challenge, we propose a novel preventive-corrective coordinated decision-making scheme that quickly mitigates both static and dynamic insecurities given the risk of active wildfires in a region. The scheme utilizes a comprehensive contingency analysis tool for multi-asset outages that leverages: (i) a Feasibility Test algorithm which exhaustively desaturates overloaded cut-sets to prevent cascading line outages, and (ii) a data-driven transient stability analyzer which alleviates dynamic instabilities. This tool is then used to operate a coordinated unit commitment/optimal power flow model that is designed to adapt to varying risk levels associated with wildfires. Depending on the allowed risk, the model balances economical operation and grid robustness. The results obtained using the IEEE 118-bus system indicate that the proposed approach alleviates system vulnerabilities to wildfires while also minimizing operational cost.
comment: Submitted to the Open Access Journal of Power and Energy (OAJPE)
Aerial-based Crisis Management Center (ACMC)
Crisis management (CM) for critical infrastructures, natural disasters such as wildfires and hurricanes, terrorist actions, or civil unrest requires high speed communications and connectivity, and access to high performance computational resources to deliver timely dynamic responses to the crisis being managed by different first responders. CM systems should detect, recognize, and disseminate huge amounts of heterogeneous dynamic events that operate at different speeds and formats. Furthermore, the processing of crisis events and the development of real-time responses are major research challenges when the communications and computational resources needed by CM stakeholders are not available or severely degraded by the crisis. The main goal of the research presented in this paper is to utilize Unmanned Autonomous Systems (UAS) to provide an Aerial-based Crisis Management Center (ACMC) that will provide the required communications services and the computational resources that are critically needed by first responders. In our approach to develop an ACMC architecture, we utilize a set of flexible Unmanned Aerial Systems (UAS) that can be dynamically composed to meet the communications and computational requirements of CM tasks. The ACMC services will be modeled as a deep neural network (DNN) mass transport approach to cover a distributed target in a decentralized manner. This is indeed a new decentralized coverage approach with time-varying communication weights. Furthermore, our analysis proves the stability and convergence of the proposed DNN-based mass transport for a team of UAS (e.g., quadcopters), where each quadcopter uses a feedback nonlinear control to independently attain the intended coverage trajectory in a decentralized manner.
Adaptive Invariant Extended Kalman Filter with Noise Covariance Tuning for Attitude Estimation
Attitude estimation is crucial in aerospace engineering, robotics, and virtual reality applications, but faces difficulties due to nonlinear system dynamics and sensor limitations. This paper addresses the challenge of attitude estimation using quaternion-based adaptive right invariant extended Kalman filtering (RI-EKF) that integrates data from inertial and magnetometer sensors. Our approach applies the expectation-maximization (EM) algorithm to estimate noise covariance, exploiting RI-EKF symmetry properties. We analyze the adaptive RI-EKF's stability, convergence, and accuracy, validating its performance through simulations and comparison with the left invariant EKF. Monte Carlo simulations validate the effectiveness of our noise covariance estimation technique across various window lengths.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Equality Constrained Diffusion for Direct Trajectory Optimization
The recent success of diffusion-based generative models in image and natural language processing has ignited interest in diffusion-based trajectory optimization for nonlinear control systems. Existing methods cannot, however, handle the nonlinear equality constraints necessary for direct trajectory optimization. As a result, diffusion-based trajectory optimizers are currently limited to shooting methods, where the nonlinear dynamics are enforced by forward rollouts. This precludes many of the benefits enjoyed by direct methods, including flexible state constraints, reduced numerical sensitivity, and easy initial guess specification. In this paper, we present a method for diffusion-based optimization with equality constraints. This allows us to perform direct trajectory optimization, enforcing dynamic feasibility with constraints rather than rollouts. To the best of our knowledge, this is the first diffusion-based optimization algorithm that supports the general nonlinear equality constraints required for direct trajectory optimization.
Latency Reduction in CloudVR: Cloud Prediction, Edge Correction
Current virtual reality (VR) headsets encounter a trade-off between high processing power and affordability. Consequently, offloading 3D rendering to remote servers helps reduce costs, battery usage, and headset weight. Maintaining network latency below 20ms is crucial to achieving this goal. Predicting future movement and prerendering are beneficial in meeting this tight latency bound. This paper proposes a method that utilizes the low-latency property of edge servers and the high resources available in cloud servers simultaneously to achieve cost-efficient, high-quality VR. In this method, head movement is predicted on the cloud server, and frames are rendered there and transmitted to the edge server. If the prediction error surpasses a threshold, the frame is re-rendered on the edge server. Results demonstrate that using this method, each edge server can efficiently serve up to 23 users concurrently, compared to a maximum of 5 users when rendering the frame entirely on the edge server. Furthermore, this paper shows that employing the Mean Absolute Error loss function and predicting acceleration rather than velocity significantly enhances prediction accuracy. Additionally, it is shown that normalizing individual data using its mean and standard deviation does not yield improvements in prediction accuracy. These findings provide insights into optimizing VR headset performance through cloud-edge collaboration.
comment: Virtual Reality, Edge Computing, Distributed Rendering, Prediction
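The cloud-prediction, edge-correction rule described in this abstract can be sketched as a per-frame decision: keep the cloud-rendered frame when the predicted head pose is close to the actual one, otherwise re-render on the edge server. The abstract does not specify the error metric or the threshold, so the `Pose` fields, the `prediction_error` metric, and the default of 2 degrees are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    yaw: float    # degrees
    pitch: float  # degrees


def prediction_error(predicted: Pose, actual: Pose) -> float:
    """Largest absolute angular deviation in degrees (hypothetical metric, ignores yaw wrap-around)."""
    return max(abs(predicted.yaw - actual.yaw), abs(predicted.pitch - actual.pitch))


def frame_source(predicted: Pose, actual: Pose, threshold_deg: float = 2.0) -> str:
    """Return "cloud" to reuse the pre-rendered frame, or "edge" to re-render it locally."""
    if prediction_error(predicted, actual) > threshold_deg:
        return "edge"
    return "cloud"
```

A lower threshold sends more frames to the edge server for re-rendering, so the threshold trades visual accuracy against the number of users one edge server can serve.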
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment EMNLP 2024
Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax", a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness). However, existing alignment techniques are mostly unidirectional, leading to suboptimal trade-offs and poor flexibility over various objectives. To navigate this challenge, we argue for the importance of grounding LLMs with evident preferences. We introduce controllable preference optimization (CPO), which explicitly specifies preference scores for different objectives, thereby guiding the model to generate responses that meet the requirements. Our experimental analysis reveals that the aligned models can provide responses that match various preferences among the "3H" (helpfulness, honesty, harmlessness) desiderata. Furthermore, by introducing diverse data and alignment goals, we surpass baseline methods in aligning with single objectives, hence mitigating the impact of the alignment tax and achieving Pareto improvements in multi-objective alignment.
comment: EMNLP 2024 main conference
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Surrogate models are used to predict the behavior of complex energy systems that are too expensive to simulate with traditional numerical methods. Our work introduces the use of language descriptions, which we call "system captions" or SysCaps, to interface with such surrogates. We argue that interacting with surrogates through text, particularly natural language, makes these models more accessible for both experts and non-experts. We introduce a lightweight multimodal text and timeseries regression model and a training pipeline that uses large language models (LLMs) to synthesize high-quality captions from simulation metadata. Our experiments on two real-world simulators of buildings and wind farms show that our SysCaps-augmented surrogates have better accuracy on held-out systems than traditional methods while enjoying new generalization abilities, such as handling semantically related descriptions of the same test system. Additional experiments also highlight the potential of SysCaps to unlock language-driven design space exploration and to regularize training through prompt augmentation.
comment: 21 pages. Under review
Loss of Control Prevention of an Agile Aircraft: Dynamic Command Saturation Approach
Preventing loss of control in agile aircraft during extreme maneuvers is a concern because of the aircraft's nonlinear aerodynamics and flight dynamics. Within this context, the primary objective is to present an architectural framework and elucidate the methodology for its determination. This architecture enables agile maneuvering aircraft to execute more extreme maneuvers while avoiding departure from stable flight, surpassing the maneuverability capabilities of conventional state limiters. Hence, the notion of an incremental attainable moment set is introduced for an instantaneous controllability investigation using demanded control moment coefficients derived in the high-level controller, which is the incremental nonlinear dynamic inversion. In the event of detecting a violation of controllability boundaries, Lyapunov-based dynamic command saturation is employed to limit pilot commands, preventing the aircraft from initiating departure from stable flight. As a result, abrupt and excessive pilot inputs are dynamically softened in-flight, and potential departure tendencies are mitigated. Consequently, the superiority of the proposed method over conventional state limiters is proven through the flight simulations of agile and abrupt maneuvers, as well as Monte Carlo simulations that demonstrate the expansion of stable maneuverable volumes by up to 55%.
Safe and Stable Formation Control with Distributed Multi-Agents Using Adaptive Control and Control Barrier Functions
This manuscript considers the problem of ensuring stability and safety during formation control with distributed multi-agent systems in the presence of parametric uncertainty in the dynamics and limited communication. We propose an integrative approach that combines Control Barrier Functions, Adaptive Control, and connected graphs. A reference model is designed so as to ensure a safe and stable formation control strategy. This is combined with a provably correct adaptive control design that includes the use of a CBF-based safety filter that suitably generates safe reference commands. Numerical examples are provided to support the theoretical derivations.
comment: Under Review - American Control Conference 2025
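As a rough illustration of what a CBF-based safety filter does, the sketch below treats a scalar single integrator x' = u with safe set h(x) = x_max - x >= 0. This toy system, the names `cbf_filter` and `simulate`, and all parameter values are assumptions made for illustration; they are not the multi-agent adaptive design of the manuscript. For this system the usual CBF quadratic program (stay as close as possible to the reference command subject to h' + alpha*h >= 0) has the closed form u = min(u_ref, alpha*(x_max - x)).

```python
def cbf_filter(x, u_ref, x_max, alpha):
    """Closed-form CBF filter for x' = u with h(x) = x_max - x.

    The condition h' + alpha * h >= 0 reduces to u <= alpha * (x_max - x),
    so the command closest to u_ref that satisfies it is a simple min.
    """
    return min(u_ref, alpha * (x_max - x))


def simulate(x0, u_ref, x_max, alpha=1.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of the filtered single integrator (hypothetical setup)."""
    x = x0
    traj = [x]
    for _ in range(steps):
        u = cbf_filter(x, u_ref, x_max, alpha)
        x += dt * u
        traj.append(x)
    return traj


if __name__ == "__main__":
    traj = simulate(x0=0.0, u_ref=2.0, x_max=1.0)
    print(f"final state: {traj[-1]:.5f}, max state: {max(traj):.5f}")
```

With `alpha * dt <= 1`, the discrete-time state in this toy case approaches the boundary x_max without crossing it, while commands that are already safe pass through unchanged.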
Co-investment with Payoff Sharing Benefit Operators and Users in Network Design
Network-based complex systems are inherently interconnected, with the design and performance of subnetworks being interdependent. However, the decisions of self-interested operators may lead to suboptimal outcomes for users. In this paper, we consider the question of what cooperative mechanisms can benefit both operators and users simultaneously. We address this question in a game theoretical setting, integrating both non-cooperative and cooperative game theory. During the non-cooperative stage, subnetwork decision-makers strategically design their local networks. In the cooperative stage, the co-investment mechanism and the payoff-sharing mechanism are developed to enlarge collective benefits and fairly distribute them. A case study of the Sioux Falls network is conducted to demonstrate the efficiency of the proposed framework. The impact of this interactive network design on environmental sustainability, social welfare and economic efficiency is evaluated, along with an examination of scenarios involving regions with heterogeneous characteristics.
comment: 8 pages, 6 figures
Heterogeneous Multi-Agent Reinforcement Learning for Zero-Shot Scalable Collaboration
The emergence of multi-agent reinforcement learning (MARL) is significantly transforming various fields like autonomous vehicle networks. However, real-world multi-agent systems typically contain multiple roles, and the scale of these systems dynamically fluctuates. Consequently, in order to achieve zero-shot scalable collaboration, it is essential that strategies for different roles can be updated flexibly according to the scales, which is still a challenge for current MARL frameworks. To address this, we propose a novel MARL framework named Scalable and Heterogeneous Proximal Policy Optimization (SHPPO), integrating heterogeneity into parameter-shared PPO-based MARL networks. We first leverage a latent network to learn strategy patterns for each agent adaptively. Second, we introduce a heterogeneous layer to be inserted into decision-making networks, whose parameters are specifically generated by the learned latent variables. Our approach is scalable as all the parameters are shared except for the heterogeneous layer, and gains both inter-individual and temporal heterogeneity, allowing SHPPO to adapt effectively to varying scales. SHPPO exhibits superior performance in classic MARL environments like Starcraft Multi-Agent Challenge (SMAC) and Google Research Football (GRF), showcasing enhanced zero-shot scalability, and offering insights into the learned latent variables' impact on team performance by visualization.
Sensory Glove-Based Surgical Robot User Interface ICRA
Robotic surgery has reached a high level of maturity and has become an integral part of standard surgical care. However, existing surgeon consoles are bulky, take up valuable space in the operating room, make surgical team coordination challenging, and their proprietary nature makes it difficult to take advantage of recent technological advances, especially in virtual and augmented reality. One potential area for further improvement is the integration of modern sensory gloves into robotic platforms, allowing surgeons to control robotic arms intuitively with their hand movements. We propose one such system that combines an HTC Vive tracker, a Manus Meta Prime 3 XR sensory glove, and SCOPEYE wireless smart glasses. The system controls one arm of a da Vinci surgical robot. In addition to moving the arm, the surgeon can use fingers to control the end-effector of the surgical instrument. Hand gestures are used to implement clutching and similar functions. In particular, we introduce clutching of the instrument orientation, a functionality unavailable in the da Vinci system. The vibrotactile elements of the glove are used to provide feedback to the user when gesture commands are invoked. A qualitative and quantitative evaluation has been conducted that compares the current device with the dVRK console. The system is shown to have excellent tracking accuracy, and the new interface allows surgeons to efficiently perform common surgical training tasks with minimal practice.
comment: 6 pages, 4 figures, 7 tables, submitted to International Conference on Robotics and Automation (ICRA) 2025
Closed-loop Diffusion Control of Complex Physical Systems
The control problems of complex physical systems have broad applications in science and engineering. Previous studies have shown that generative control methods based on diffusion models offer significant advantages for solving these problems. However, existing generative control approaches face challenges in both performance and efficiency when extended to the closed-loop setting, which is essential for effective control. In this paper, we propose an efficient Closed-Loop Diffusion method for Physical systems Control (CL-DiffPhyCon). By employing an asynchronous denoising framework for different physical time steps, CL-DiffPhyCon generates control signals conditioned on real-time feedback from the environment with significantly reduced computational cost during sampling. Additionally, the control process could be further accelerated by incorporating fast sampling techniques, such as DDIM. We evaluate CL-DiffPhyCon on two tasks: 1D Burgers' equation control and 2D incompressible fluid control. The results demonstrate that CL-DiffPhyCon achieves superior control performance with significant improvements in sampling efficiency.
Auction designs to increase incentive compatibility and reduce self-scheduling in electricity markets
The system operator's scheduling problem in electricity markets, called unit commitment, is a non-convex mixed-integer program. The optimal value function is non-convex, preventing the application of traditional marginal pricing theory to find prices that clear the market and incentivize market participants to follow the dispatch schedule. Units that perceive the opportunity to make a profit may be incentivized to self-commit (submitting an offer with zero fixed operating costs) or self-schedule their production (submitting an offer with zero total cost). We simulate bidder behavior to show that market power can be exercised by self-committing/scheduling. Agents can learn to increase their profits via a reinforcement learning algorithm without explicit knowledge of the costs or strategies of other agents. We investigate different non-convex pricing models over a multi-period commitment window simulating the day-ahead market and show that convex hull pricing can reduce producer incentives to deviate from the central dispatch decision. In a realistic test system with approximately 1000 generators, we find strategic bidding under the restricted convex model can increase total producer profits by 4.4% and decrease lost opportunity costs by 2/3. While the cost to consumers with convex hull pricing is higher at the competitive solution, the cost to consumers is higher with the restricted convex model after strategic bidding.
comment: Updated author affiliation
On the Sum Secrecy Rate Maximisation for Wireless Vehicular Networks
Wireless communications form the backbone of future vehicular networks, playing a critical role in applications ranging from traffic control to vehicular road safety. However, the dynamic structure of these networks creates security vulnerabilities, making security considerations an integral part of network design. We address these security concerns from a physical layer security aspect by investigating achievable secrecy rates in wireless vehicular networks. Specifically, we aim to maximize the sum secrecy rate from all vehicular pairs subject to bandwidth and power resource constraints. For the considered problem, we first propose a solution based on the successive convex approximation (SCA) method, which has not been applied in this context before. To further reduce the complexity of the SCA-based method, we also propose a low-complexity solution based on a fast iterative shrinkage-thresholding algorithm (FISTA). Our simulation results for SCA and FISTA show a trade-off between convergence and runtime. While the SCA method achieves better convergence, the FISTA-based approach is at least 300 times faster than the SCA method.
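For readers unfamiliar with FISTA, the following minimal sketch shows the generic algorithm on a small l1-regularized least-squares problem, minimizing 0.5*||Ax - b||^2 + lam*||x||_1. The test problem, the function names, and the step-size choice are assumptions for illustration only; the secrecy-rate objective and constraints of the paper are not reproduced here. The step size is assumed to be at most 1/L, where L is the largest eigenvalue of A^T A.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def grad(A, b, x):
    """Gradient of 0.5 * ||A x - b||^2, i.e. A^T (A x - b)."""
    r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]


def fista(A, b, lam, step, iters=200):
    """FISTA: proximal gradient step at an extrapolated point y, plus momentum."""
    n = len(A[0])
    x = [0.0] * n
    y = x[:]
    t = 1.0
    for _ in range(iters):
        g = grad(A, b, y)
        x_new = soft_threshold([yi - step * gi for yi, gi in zip(y, g)], step * lam)
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Momentum: extrapolate along the direction of the last update.
        y = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0]]
    b = [3.0, -0.5]
    print(fista(A, b, lam=1.0, step=1.0))
```

Each iteration costs only a gradient evaluation and an elementwise shrinkage, which is the kind of cheap per-iteration work that makes FISTA-type methods attractive as low-complexity alternatives.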
Generalized Lyapunov conditions for k-contraction: analysis and feedback design
Recently, the concept of k-contraction has been introduced as a promising generalization of contraction for dynamical systems. However, the study of k-contraction properties has faced significant challenges due to the reliance on complex mathematical objects called matrix compounds. As a result, related control design methodologies have yet to appear in the literature. In this paper, we overcome existing limitations and propose new sufficient conditions for k-contraction which do not require computing matrix compounds. Notably, these conditions are also necessary in the linear time-invariant framework. Leveraging these findings, we propose a feedback design methodology for both the linear and the nonlinear scenarios which can be used to enforce k-contractivity properties on the closed-loop dynamics.
On Continuous Full-Order Integral-Terminal Sliding Mode Control with Unknown A Priori Bound on Uncertainty
This study aims at providing a solution to the problem of designing a continuous and finite-time control for a class of nonlinear systems in the presence of matched uncertainty with an unknown a priori bound. First, we propose a Full-Order Integral-Terminal Sliding Manifold (FOITSM) with a conventional (discontinuous) sliding mode to show that it provides the combined attributes of the nonsingular terminal and integral sliding mode algorithms. Secondly, an Adaptive Disturbance Observer (ADO) has been designed to alleviate the effect of the uncertainty acting on the system. On application of the ADO-based Full-Order Integral-Terminal Sliding Mode Control (FOITSMC), the chattering phenomenon in control input has been reduced substantially in the presence of conditionally known matched disturbances. Moreover, the adaptive gains of ADO are updated non-monotonically without over-bounding the acting disturbance, yet sustain the global boundedness of state trajectories within a specific bound.
comment: 26 pages, 5 figures
A Parallel-in-Time Newton's Method for Nonlinear Model Predictive Control
Model predictive control (MPC) is a powerful framework for optimal control of dynamical systems. However, MPC solvers suffer from a high computational burden that restricts their application to systems with low sampling frequency. This issue is further amplified in nonlinear and constrained systems that require nesting MPC solvers within iterative procedures. In this paper, we address these issues by developing parallel-in-time algorithms for constrained nonlinear optimization problems that take advantage of massively parallel hardware to achieve logarithmic computational time scaling over the planning horizon. We develop time-parallel second-order solvers based on interior point methods and the alternating direction method of multipliers, leveraging fast convergence and lower computational cost per iteration. The parallelization is based on a reformulation of the subproblems in terms of associative operations that can be parallelized using the associative scan algorithm. We validate our approach on numerical examples of nonlinear and constrained dynamical systems.
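The idea of replacing a sequential recursion by an associative scan can be sketched on a toy scalar recurrence x_{k+1} = a_k x_k + b_k. Each step is an affine map, composition of affine maps is associative, and so all prefixes can be formed in about log2(T) rounds. The recurrence, the Hillis-Steele scan variant, and the function names are assumptions chosen for illustration; the operators used by the paper's interior point and ADMM solvers are not reproduced here, and this pure-Python version only simulates the rounds sequentially.

```python
def combine(first, second):
    """Compose two affine maps x -> a*x + b: apply `first`, then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def associative_scan(elems):
    """Inclusive Hillis-Steele scan. Returns all prefixes and the round count."""
    out = list(elems)
    n = len(out)
    shift, rounds = 1, 0
    while shift < n:
        # All indices in one round are independent of each other, so
        # parallel hardware could update them simultaneously.
        out = [out[i] if i < shift else combine(out[i - shift], out[i])
               for i in range(n)]
        shift *= 2
        rounds += 1
    return out, rounds


def rollout_scan(x0, steps):
    """States x_1..x_T obtained from the prefix compositions."""
    prefixes, rounds = associative_scan(steps)
    return [a * x0 + b for a, b in prefixes], rounds


def rollout_sequential(x0, steps):
    """Reference implementation: plain step-by-step recursion."""
    xs, x = [], x0
    for a, b in steps:
        x = a * x + b
        xs.append(x)
    return xs


if __name__ == "__main__":
    steps = [(2.0, 1.0), (0.5, 0.0), (1.0, -1.0), (3.0, 2.0)]
    print(rollout_scan(1.0, steps))
    print(rollout_sequential(1.0, steps))
```

For a horizon of T steps the scan needs ceil(log2(T)) rounds, which is where the logarithmic time scaling comes from when each round is executed in parallel.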
CaΣoS: A nonlinear sum-of-squares optimization suite
We present CaΣoS, the first MATLAB software specifically designed for nonlinear sum-of-squares optimization. A symbolic polynomial algebra system allows users to formulate parametrized sum-of-squares optimization problems and facilitates their fast, repeated evaluations. To that end, we make use of CasADi's symbolic framework and realize concepts of monomial sparsity, linear operators (including duals), and functions between polynomials. CaΣoS currently provides interfaces to the conic solvers SeDuMi, Mosek, and SCS as well as methods to solve quasiconvex optimization problems (via bisection) and nonconvex optimization problems (via sequential convexification). Numerical examples for benchmark problems including region-of-attraction and reachable set estimation for nonlinear dynamic systems demonstrate significant improvements in computation time compared to existing toolboxes. CaΣoS is available open-source at https://github.com/ifr-acso/casos.
comment: Submitted to 2025 American Control Conference
Robust Data-EnablEd Predictive Leading Cruise Control via Reachability Analysis
Data-driven predictive control promises model-free wave-dampening strategies for Connected and Autonomous Vehicles (CAVs) in mixed traffic flow. However, its performance relies on data quality, which suffers from unknown noise and disturbances. This paper introduces a Robust Data-EnablEd Predictive Leading Cruise Control (RDeeP-LCC) method based on reachability analysis, aiming to achieve safe and optimal CAV control under bounded process noise and external disturbances. Precisely, the matrix zonotope set technique and Willems' Fundamental Lemma are employed to derive the over-approximated system dynamics directly from data, and a data-driven feedback control technique is utilized to obtain an additional feedback input for stability. We decouple the mixed platoon into an error system and a nominal system, where the error system provides data-driven reachability sets for the enhanced safety constraints in the nominal system. Finally, a data-driven predictive control framework is formulated in a tube-based control manner for robustness guarantees. Nonlinear simulations with noise-corrupted data demonstrate that the proposed method outperforms baseline methods in mitigating traffic waves.
comment: 8 pages, 4 figures
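Data-driven methods built on Willems' Fundamental Lemma typically start by stacking recorded trajectories into a Hankel matrix, whose columns are all length-L windows of the data. The sketch below builds such a matrix for a scalar signal; the function name `hankel` and the sample data are hypothetical, and the matrix zonotope and reachability steps of RDeeP-LCC are not shown.

```python
def hankel(signal, depth):
    """Depth-`depth` Hankel matrix of a scalar signal, as a list of rows.

    Entry (i, j) is signal[i + j], so column j holds the window
    signal[j : j + depth].
    """
    cols = len(signal) - depth + 1
    if cols < 1:
        raise ValueError("signal is shorter than the requested depth")
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]


if __name__ == "__main__":
    for row in hankel([1, 2, 3, 4, 5], depth=3):
        print(row)
```

Under a persistency-of-excitation condition on the input data, the lemma states that every trajectory of length L of a controllable linear time-invariant system is a linear combination of such columns, which is what allows predictions to be made directly from data.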
Robotics
Learning to Build by Building Your Own Instructions
Structural understanding of complex visual objects is an important unsolved component of artificial intelligence. To study this, we develop a new technique for the recently proposed Break-and-Make problem in LTRON where an agent must learn to build a previously unseen LEGO assembly using a single interactive session to gather information about its components and their structure. We attack this problem by building an agent that is able to make its own visual instruction book. By disassembling an unseen assembly and periodically saving images of it, the agent is able to create a set of instructions so that it has the information necessary to rebuild it. These instructions form an explicit memory that allows the model to reason about the assembly process one step at a time, avoiding the need for long-term implicit memory. This in turn allows us to train on much larger LEGO assemblies than has been possible in the past. To demonstrate the power of this model, we release a new dataset of procedurally built LEGO vehicles that contain an average of 31 bricks each and require over one hundred steps to disassemble and reassemble. We train these models using online imitation learning which allows the model to learn from its own mistakes. Finally, we also provide some small improvements to LTRON and the Break-and-Make problem that simplify the learning environment and improve usability.
M2P2: A Multi-Modal Passive Perception Dataset for Off-Road Mobility in Extreme Low-Light Conditions
Long-duration, off-road, autonomous missions require robots to continuously perceive their surroundings regardless of the ambient lighting conditions. Most existing autonomy systems heavily rely on active sensing, e.g., LiDAR, RADAR, and Time-of-Flight sensors, or use (stereo) visible light imaging sensors, e.g., color cameras, to perceive environment geometry and semantics. In scenarios where fully passive perception is required and lighting conditions are degraded to an extent that visible light cameras fail to perceive, most downstream mobility tasks such as obstacle avoidance become impossible. To address such a challenge, this paper presents a Multi-Modal Passive Perception dataset, M2P2, to enable off-road mobility in low-light to no-light conditions. We design a multi-modal sensor suite including thermal, event, and stereo RGB cameras, GPS, two Inertial Measurement Units (IMUs), as well as a high-resolution LiDAR for ground truth, with a novel multi-sensor calibration procedure that can efficiently transform multi-modal perceptual streams into a common coordinate system. Our 10-hour, 32 km dataset also includes mobility data such as robot odometry and actions and covers well-lit, low-light, and no-light conditions, along with paved, on-trail, and off-trail terrain. Our results demonstrate that off-road mobility is possible through only passive perception in extreme low-light conditions using end-to-end learning and classical planning. The project website can be found at https://cs.gmu.edu/~xiao/Research/M2P2/
Exploring How Non-Prehensile Manipulation Expands Capability in Robots Experiencing Multi-Joint Failure
This work explores non-prehensile manipulation (NPM) and whole-body interaction as strategies for enabling robotic manipulators to conduct manipulation tasks despite experiencing locked multi-joint (LMJ) failures. LMJs are critical system faults where two or more joints become inoperable; they impose constraints on the robot's configuration and control spaces, consequently limiting the capability and reach of a prehensile-only approach. This approach involves three components: i) modeling the failure-constrained workspace of the robot, ii) generating a kinodynamic map of NPM actions within this workspace, and iii) a manipulation action planner that uses a sim-in-the-loop approach to select the best actions to take from the kinodynamic map. The experimental evaluation shows that our approach can increase the failure-constrained reachable area in LMJ cases by 79%. Further, it demonstrates the ability to complete real-world manipulation with up to 88.9% success when the end-effector is unusable and up to 100% success when it is usable.
comment: To be published in the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems
RoTip: A Finger-Shaped Tactile Sensor with Active Rotation
In recent years, advancements in optical tactile sensor technology have primarily centred on enhancing sensing precision and expanding the range of sensing modalities. To meet the requirements for more skilful manipulation, there should be a movement towards making tactile sensors more dynamic. In this paper, we introduce RoTip, a novel vision-based tactile sensor that is uniquely designed with an independently controlled joint and the capability to sense contact over its entire surface. The rotational capability of the sensor is particularly crucial for manipulating everyday objects, especially thin and flexible ones, as it enables the sensor to mobilize while in contact with the object's surface. The manipulation experiments demonstrate the ability of our proposed RoTip to manipulate rigid and flexible objects, and the full-finger tactile feedback and active rotation capabilities have the potential to explore more complex and precise manipulation tasks.
Two-Finger Soft Gripper Force Modulation via Kinesthetic Feedback
We investigate a method to modulate contact forces between the soft fingers of a two-finger gripper and an object, without relying on tactile sensors. This work is a follow-up to our previous results on contact detection. Here, our hypothesis is that once the contact between a finger and an object is detected, a controller that keeps a desired difference between the finger bending measurement and its bending at the moment of contact is sufficient to maintain and modulate the contact force. This approach can be simultaneously applied to both fingers while getting in contact with a single object. We successfully tested the hypothesis, and characterized the contact and peak pull-out force magnitude vs. the desired difference expressed by a multiplicative factor. All experiments were performed on a real physical device.
An Approach to Elicit Human-Understandable Robot Expressions to Support Human-Robot Interaction
Understanding the intentions of robots is essential for natural and seamless human-robot collaboration. Ensuring that robots have means for non-verbal communication is a basis for intuitive and implicit interaction. For this, we contribute an approach to elicit and design human-understandable robot expressions. We outline the approach in the context of non-humanoid robots. We paired human mimicking and enactment with research from gesture elicitation in two phases: first, to elicit expressions, and second, to ensure they are understandable. We present an example application through two studies (N=16 and N=260) of our approach to elicit expressions for a simple 6-DoF robotic arm. We show that it enabled us to design robot expressions that signal curiosity and interest in getting attention. Our main contribution is an approach to generate and validate understandable expressions for robots, enabling more natural human-robot interaction.
Effective self-righting strategies for elongate multi-legged robots
Centipede-like robots offer an effective and robust solution to navigation over complex terrain with minimal sensing. However, when climbing over obstacles, such multi-legged robots often elevate their center-of-mass into unstable configurations, where even moderate terrain uncertainty can cause tipping over. Robust mechanisms for such elongate multi-legged robots to self-right remain unstudied. Here, we developed a comparative biological and robophysical approach to investigate self-righting strategies. We first released S. polymorpha upside down from a 10 cm height and recorded their self-righting behaviors using top and side view high-speed cameras. Using kinematic analysis, we hypothesize that these behaviors can be prescribed by two traveling waves superimposed in the body lateral and vertical planes, respectively. We tested our hypothesis on an elongate robot with static (non-actuated) limbs, and we successfully reconstructed these self-righting behaviors. We further evaluated how wave parameters affect self-righting effectiveness. We identified two key wave parameters: the spatial frequency, which characterizes the sequence of body-rolling, and the wave amplitude, which characterizes body curvature. By empirically obtaining a behavior diagram of spatial frequency and amplitude, we identify effective and versatile self-righting strategies for general elongate multi-legged robots, which greatly enhances these robots' mobility and robustness in practical applications such as agricultural terrain inspection and search-and-rescue.
Divide et Impera: Learning impedance families for peg-in-hole assembly
This paper addresses robotic peg-in-hole assembly using the framework of Elementary Dynamic Actions (EDA). Inspired by motor primitives in neuromotor control research, the method leverages three primitives: submovements, oscillations, and mechanical impedances (e.g., stiffness and damping), combined via a Norton equivalent network model. By focusing on impedance parameterization, we explore the adaptability of EDA in contact-rich tasks. Experimental results, conducted on a real robot setup with four different peg types, demonstrated a range of successful impedance parameters, challenging conventional methods that seek optimal parameters. We analyze our data in a lower-dimensional solution space. Clustering analysis shows the possibility to identify different individual strategies for each single peg, as well as common strategies across all pegs. A neural network model, trained on the experimental data, accurately predicted successful impedance parameters across all pegs. The practical utility of this work is enhanced by a success-predictor model and the public availability of all code and CAD files. These findings highlight the flexibility and robustness of EDA; show multiple equally-successful strategies for contact-rich manipulation; and offer valuable insights and tools for robotic assembly programming.
comment: 18 pages, 11 figures
Steering Elongate Multi-legged Robots By Modulating Body Undulation Waves
Centipedes exhibit great maneuverability in diverse environments due to their many legs and body-driven control. By leveraging similar morphologies, their robotic counterparts also demonstrate effective terrestrial locomotion. However, the success of these multi-legged robots is largely limited to forward locomotion; steering is substantially less studied, in part due to the challenges in coordinating their many body joints. Furthermore, steering behavior is complex and can include different combinations of desired rotational/translational displacement. In this paper, we explore steering strategies in multi-legged robots based on tools derived from geometric mechanics (GM). We characterize the steering motion in the plane by the rotation angle, the steering radius, and the heading direction angle. We identify an effective turning strategy by superimposing two traveling waves in the lateral body undulation and further explore variations of the "turning wave" to enable a broad spectrum of steering behaviors. By combining an amplitude modulation and a phase modulation, we develop a control strategy for steering behaviors that enables steering with a range of rotation angles (from 0° to 20°) and steering radius (from 0.28 to 0.38 body length) while keeping the heading direction angle close to 0. Lastly, we test our control framework on an elongate multi-legged robot model to verify the effectiveness of our proposed strategy. Our work demonstrates the generality of the two-wave template for effective steering of multi-legged elongate robots.
Addition of a peristaltic wave improves multi-legged locomotion performance on complex terrains
Characterized by their elongate bodies and relatively simple legs, multi-legged robots have the potential to locomote through complex terrains for applications such as search-and-rescue and terrain inspection. Prior work has developed effective and reliable locomotion strategies for multi-legged robots by propagating the two waves of lateral body undulation and leg stepping, which we will refer to as the two-wave template. However, these robots have limited capability to climb over obstacles with sizes comparable to their heights. We hypothesize that such limitations stem from the two-wave template that we used to prescribe the multi-legged locomotion. Seeking effective alternative waves for obstacle-climbing, we designed a five-segment robot with static (non-actuated) legs, where each cable-driven joint has a rotational degree-of-freedom (DoF) in the sagittal plane (vertical wave) and a linear DoF (peristaltic wave). We tested robot locomotion performance on a flat terrain and a rugose terrain. While the benefit of peristalsis on flat-ground locomotion is marginal, the inclusion of a peristaltic wave substantially improves the locomotion performance in rugose terrains: it not only enables obstacle-climbing capabilities with obstacles having a similar height as the robot, but it also significantly improves the traversing capabilities of the robot in such terrains. Our results demonstrate an alternative actuation mechanism for multi-legged robots, paving the way towards all-terrain multi-legged robots.
Safe Autonomy for Uncrewed Surface Vehicles Using Adaptive Control and Reachability Analysis
Marine robots must maintain precise control and ensure safety during tasks like ocean monitoring, even when encountering unpredictable disturbances that affect performance. Designing algorithms for uncrewed surface vehicles (USVs) requires accounting for these disturbances to control the vehicle and ensure it avoids obstacles. While adaptive control has addressed USV control challenges, real-world applications are limited, and certifying USV safety amidst unexpected disturbances remains difficult. To tackle control issues, we employ a model reference adaptive controller (MRAC) to stabilize the USV along a desired trajectory. For safety certification, we developed a reachability module with a moving horizon estimator (MHE) to estimate disturbances affecting the USV. This estimate is propagated through a forward reachable set calculation, predicting future states and enabling real-time safety certification. We tested our safe autonomy pipeline on a Clearpath Heron USV in the Charles River, near MIT. Our experiments demonstrated that the USV's MRAC controller and reachability module could adapt to disturbances like thruster failures and drag forces. The MRAC controller outperformed a PID baseline, showing a 45%-81% reduction in RMSE position error. Additionally, the reachability module provided real-time safety certification, ensuring the USV's safety. We further validated our pipeline's effectiveness in underway replenishment and canal scenarios, simulating relevant marine tasks.
comment: 35 pages, 23 figures, 6 tables
Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks ICRA 2025
Mastering complex sequential tasks continues to pose a significant challenge in robotics. While there has been progress in learning long-horizon manipulation tasks, most existing approaches lack rigorous mathematical guarantees for ensuring reliable and successful execution. In this paper, we extend previous work on learning long-horizon tasks and stable policies, focusing on improving task success rates while reducing the amount of training data needed. Our approach introduces a novel method that (1) segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals, and (2) learns globally stable dynamical system policies to guide the robot to each subgoal, even in the face of sensory noise and random disturbances. We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms. Code is available at https://github.com/Alestaubin/stable-imitation-policy-with-waypoints
comment: 7 pages, submitted to ICRA 2025
Dynamic Bipedal Loco-manipulation using Oracle Guided Multi-mode Policies with Mode-transition Preference
Loco-manipulation calls for effective whole-body control and contact-rich interactions with the object and the environment. Existing learning-based control frameworks rely on task-specific engineered rewards, training a set of low-level skill policies and explicitly switching between them with a high-level policy or FSM, leading to quasi-static and fragile transitions between skills. In contrast, for solving highly dynamic tasks such as soccer, the robot should run towards the ball, decelerating into an optimal approach configuration to seamlessly switch to dribbling and eventually score a goal - a continuum of smooth motion. To this end, we propose to learn a single Oracle Guided Multi-mode Policy (OGMP) for mastering all the required modes and transition maneuvers to solve uni-object bipedal loco-manipulation tasks. Specifically, we design a multi-mode oracle as a closed loop state-reference generator, viewing it as a hybrid automaton with continuous reference generating dynamics and discrete mode jumps. Given such an oracle, we then train an OGMP through bounded exploration around the generated reference. Furthermore, to enforce the policy to learn the desired sequence of mode transitions, we present a novel task-agnostic mode-switching preference reward that enhances performance. The proposed approach results in successful dynamic loco-manipulation in omnidirectional soccer and box-moving tasks with a 16-DoF bipedal robot HECTOR. Supplementary video results are available at https://www.youtube.com/watch?v=gfDaRqobheg
comment: 7 pages, 6 figures
Risk-Averse Planning and Plan Assessment for Marine Robots
Autonomous Underwater Vehicles (AUVs) need to operate for days without human intervention and thus must be able to do efficient and reliable task planning. Unfortunately, efficient task planning requires deliberately abstract domain models (for scalability reasons), which in practice leads to plans that might be unreliable or underperforming. An optimal abstract plan may turn out suboptimal or unreliable during physical execution. To overcome this, we introduce a method that first generates a selection of diverse high-level plans and then assesses them in a low-level simulation to select the optimal and most reliable candidate. We evaluate the method using a realistic underwater robot simulation, estimating the risk metrics for different scenarios, demonstrating feasibility and effectiveness of the approach.
comment: 6 pages, 6 figures, IEEE International Conference on Intelligent Robots and Systems 2024
Diffusion-Informed Probabilistic Contact Search for Multi-Finger Manipulation
Planning contact-rich interactions for multi-finger manipulation is challenging due to the high-dimensionality and hybrid nature of dynamics. Recent advances in data-driven methods have shown promise, but are sensitive to the quality of training data. Combining learning with classical methods like trajectory optimization and search adds additional structure to the problem and domain knowledge in the form of constraints, which can lead to outperforming the data on which models are trained. We present Diffusion-Informed Probabilistic Contact Search (DIPS), which uses an A* search to plan a sequence of contact modes informed by a diffusion model. We train the diffusion model on a dataset of demonstrations consisting of contact modes and trajectories generated by a trajectory optimizer given those modes. In addition, we use a particle filter-inspired method to reason about variability in diffusion sampling arising from model error, estimating likelihoods of trajectories using a learned discriminator. We show that our method outperforms ablations that do not reason about variability and can plan contact sequences that outperform those found in training data across multiple tasks. We evaluate on simulated tabletop card sliding and screwdriver turning tasks, as well as the screwdriver task in hardware to show that our combined learning and planning approach transfers to the real world.
Adaptive Motion Generation Using Uncertainty-Driven Foresight Prediction
Environmental uncertainty has long been difficult to handle when performing real-world robot tasks, because it produces unexpected observations that cannot be covered by manual scripting. Learning-based robot control methods are a promising approach for generating flexible motions in unknown situations, but they still tend to suffer under uncertainty due to their deterministic nature. In order to adaptively perform the target task under such conditions, the robot control model must be able to accurately understand the possible uncertainty and to exploratively derive the optimal action that minimizes such uncertainty. This paper extends an existing predictive-learning-based robot control method, which employs foresight prediction using dynamic internal simulation. The foresight module refines the model's hidden states by sampling multiple possible futures and replacing the hidden states with the one that led to the lower future uncertainty. The adaptiveness of the model was evaluated on a door opening task. The door can be opened by pushing, pulling, or sliding, but the robot cannot visually distinguish which way and is required to adapt on the fly. The results showed that the proposed model adaptively diverged its motion through interaction with the door, whereas conventional methods failed to stably diverge. The models were analyzed using Lyapunov exponents of the RNN hidden states, which reflect the possible divergence at each time step during task execution. The results indicated that the foresight module biased the model to consider future consequences, which leads to embedding uncertainties in the policy of the robot controller rather than in the resultant observation. This is beneficial for implementing adaptive behaviors, as it induces the derivation of diverse motions during exploration.
Under Pressure: Altimeter-Aided ICP for 3D Maps Consistency ICRA25
We propose a novel method to enhance the accuracy of the Iterative Closest Point (ICP) algorithm by integrating altitude constraints from a barometric pressure sensor. While ICP is widely used in mobile robotics for Simultaneous Localization and Mapping (SLAM), it is susceptible to drift, especially in underconstrained environments such as vertical shafts. To address this issue, we propose to augment ICP with altimeter measurements, reliably constraining drifts along the gravity vector. To demonstrate the potential of altimetry in SLAM, we offer an analysis of calibration procedures and noise sensitivity of various pressure sensors, improving measurements to centimeter-level accuracy. Leveraging this accuracy, we propose a novel ICP formulation that integrates altitude measurements along the gravity vector, thus simplifying the optimization problem to 3 Degrees of Freedom (DOF). Experimental results from real-world deployments demonstrate that our method reduces vertical drift by 84% and improves overall localization accuracy compared to state-of-the-art methods in non-planar environments.
comment: 6 pages + references, 5 figures, submitted to ICRA25
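As a rough illustration of how fixing the vertical component shrinks a point-cloud alignment problem, the sketch below solves one alignment step for matched point pairs. It is not the authors' formulation: it assumes, for illustration only, that both scans are already gravity-aligned (roll and pitch known), that the remaining unknowns are yaw and horizontal translation, and that the vertical offset `dz_altimeter` is supplied directly by the pressure sensor. The function name and arguments are hypothetical.

```python
import math

def align_3dof(src, dst, dz_altimeter):
    """Closed-form yaw + x/y alignment of matched points; z comes from the altimeter.

    src, dst: equal-length lists of matched (x, y, z) points, assumed gravity-aligned.
    dz_altimeter: vertical offset between scans, assumed given by the pressure sensor.
    Returns (yaw, tx, ty, tz) such that dst ~ Rz(yaw) * src + (tx, ty, tz).
    """
    n = len(src)
    # Horizontal centroids of both point sets
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    # Accumulate cross and dot products of the centered horizontal coordinates
    num = den = 0.0
    for (sx, sy, _), (dx, dy, _) in zip(src, dst):
        ax, ay = sx - csx, sy - csy
        bx, by = dx - cdx, dy - cdy
        num += ax * by - ay * bx
        den += ax * bx + ay * by
    yaw = math.atan2(num, den)  # least-squares rotation about the gravity axis
    c, s = math.cos(yaw), math.sin(yaw)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    # The vertical translation is not estimated from the points at all
    return yaw, tx, ty, dz_altimeter
```

In a full ICP loop this step would alternate with nearest-neighbor matching; because the z component never enters the geometric optimization, it cannot drift with poorly constrained geometry.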
Collaborative motion planning for multi-manipulator systems through Reinforcement Learning and Dynamic Movement Primitives
Robotic tasks often require multiple manipulators to enhance task efficiency and speed, but this increases complexity in terms of collaboration, collision avoidance, and the expanded state-action space. To address these challenges, we propose a multi-level approach combining Reinforcement Learning (RL) and Dynamic Movement Primitives (DMP) to generate adaptive, real-time trajectories for new tasks in dynamic environments using a demonstration library. This method ensures collision-free trajectory generation and efficient collaborative motion planning. We validate the approach through experiments in the PyBullet simulation environment with UR5e robotic manipulators.
comment: 6 pages, 6 figures, conference submission
Optimizing Drug Delivery in Smart Pharmacies: A Novel Framework of Multi-Stage Grasping Network Combined with Adaptive Robotics Mechanism
Robot-based smart pharmacies are essential for modern healthcare systems, enabling efficient drug delivery. However, a critical challenge exists in the robotic handling of drugs with varying shapes and overlapping positions, which previous studies have not adequately addressed. To enhance the robotic arm's ability to grasp chaotic, overlapping, and variously shaped drugs, this paper proposes a novel framework combining a multi-stage grasping network with an adaptive robotics mechanism. The framework first preprocesses images using an improved Super-Resolution Convolutional Neural Network (SRCNN) algorithm, and then employs the proposed YOLOv5+E-A-SPPFCSPC+BIFPNC (YOLO-EASB) instance segmentation algorithm for precise drug segmentation. The most suitable drugs for grasping are determined by assessing the completeness of the segmentation masks. These segmented drugs are then processed by our improved Adaptive Feature Fusion and Grasp-Aware Network (IAFFGA-Net) with the optimized loss function, which ensures accurate picking actions even in complex environments. To control the robot grasping, a time-optimal robotic arm trajectory planning algorithm that combines an improved ant colony algorithm with 3-5-3 interpolation was developed, further improving efficiency while ensuring smooth trajectories. Finally, this system was implemented and validated within an adaptive collaborative robot setup, which dynamically adjusts to different production environments and task requirements. Experimental results demonstrate the superiority of our multi-stage grasping network in optimizing smart pharmacy operations, while also showcasing its remarkable adaptability and effectiveness in practical applications.
Radar Meets Vision: Robustifying Monocular Metric Depth Prediction for Mobile Robotics ICRA 2025
Mobile robots require accurate and robust depth measurements to understand and interact with the environment. While existing sensing modalities address this problem to some extent, recent research on monocular depth estimation has leveraged the information richness, yet low cost and simplicity of monocular cameras. These works have shown significant generalization capabilities, mainly in automotive and indoor settings. However, robots often operate in environments with limited scale cues, self-similar appearances, and low texture. In this work, we encode measurements from a low-cost mmWave radar into the input space of a state-of-the-art monocular depth estimation model. Despite the radar's extreme point cloud sparsity, our method demonstrates generalization and robustness across industrial and outdoor experiments. Our approach reduces the absolute relative error of depth predictions by 9-64% across a range of unseen, real-world validation datasets. Importantly, we maintain consistency of all performance metrics across all experiments and scene depths where current vision-only approaches fail. We further address the present deficit of training data in mobile robotics environments by introducing a novel methodology for synthesizing rendered, realistic learning datasets based on photogrammetric data that simulate the radar sensor observations for training. Our code, datasets, and pre-trained networks are made available at https://github.com/ethz-asl/radarmeetsvision.
comment: Submitted to ICRA 2025
A Low-Cost, High-Speed, and Robust Bin Picking System for Factory Automation Enabled by a Non-Stop, Multi-View, and Active Vision Scheme
Bin picking systems in factory automation usually face robustness issues caused by sparse and noisy 3D data of metallic objects. Utilizing multiple views, especially with a one-shot 3D sensor and a "sensor on hand" configuration, is gaining popularity due to its effectiveness, flexibility, and low cost. However, moving the 3D sensor to acquire multiple views for 3D fusion, joint optimization, or active vision suffers from low-speed issues. That is because sensing is treated as a module decoupled from motion tasks and is not intentionally designed for a bin picking system. To address these problems, we designed a bin picking system that tightly couples a multi-view, active vision scheme with motion tasks in a "sensor on hand" configuration. It not only speeds up the system by parallelizing the high-speed sensing scheme with the robot place action but also decides the next sensing path to maintain the continuity of the whole picking process. Unlike others focusing only on sensing evaluation, we also evaluated our design through picking experiments on 5 different types of objects without human intervention. Our experiments show the whole sensing scheme can be finished within 1.682 seconds (maximum) on CPU and the average picking completion rate is over 97.75%. Due to the parallelization with robot motion, the sensing scheme accounts for only 0.635 seconds in takt time on average.
E-MPC: Edge-assisted Model Predictive Control
Model predictive control (MPC) has become the de facto standard action space for local planning and learning-based control in many continuous robotic control tasks, including autonomous driving. MPC solves a long-horizon cost optimization as a series of short-horizon optimizations based on a global planner-supplied reference path. The primary challenge in MPC, however, is that the computational budget for re-planning has a hard limit, which frequently inhibits exact optimization. Modern edge networks provide low-latency communication and heterogeneous properties that can be especially beneficial in this situation. We propose a novel framework for edge-assisted MPC (E-MPC) for path planning that exploits the heterogeneity of edge networks in three important ways: 1) varying computational capacity, 2) localized sensor information, and 3) localized observation histories. Theoretical analysis and extensive simulations are undertaken to demonstrate quantitatively the benefits of E-MPC in various scenarios, including maps, channel dynamics, and availability and density of edge nodes. The results confirm that E-MPC has the potential to reduce costs by a greater percentage than standard MPC does.
Multimodal Coherent Explanation Generation of Robot Failures
The explainability of a robot's actions is crucial to its acceptance in social spaces. Explaining why a robot fails to complete a given task is particularly important for non-expert users to be aware of the robot's capabilities and limitations. So far, research on explaining robot failures has only considered generating textual explanations, even though several studies have shown the benefits of multimodal ones. However, a simple combination of multiple modalities may lead to semantic incoherence between the information across different modalities - a problem that is not well-studied. An incoherent multimodal explanation can be difficult to understand, and it may even become inconsistent with what the robot and the human observe and how they perform reasoning with the observations. Such inconsistencies may lead to wrong conclusions about the robot's capabilities. In this paper, we introduce an approach to generate coherent multimodal explanations by checking the logical coherence of explanations from different modalities, followed by refinements as required. We propose a classification approach for coherence assessment, where we evaluate if an explanation logically follows another. Our experiments suggest that fine-tuning a neural network that was pre-trained to recognize textual entailment performs well for coherence assessment of multimodal explanations. Code & data: https://pradippramanick.github.io/coherent-explain/.
LASMP: Language Aided Subset Sampling Based Motion Planner
This paper presents the Language Aided Subset Sampling Based Motion Planner (LASMP), a system that helps mobile robots plan their movements by using natural language instructions. LASMP uses a modified version of the Rapidly Exploring Random Tree (RRT) method, which is guided by user-provided commands processed through a language model (RoBERTa). The system improves efficiency by focusing on specific areas of the robot's workspace based on these instructions, making it faster and less resource-intensive. Compared to traditional RRT methods, LASMP reduces the number of nodes needed by 55% and cuts random sample queries by 80%, while still generating safe, collision-free paths. Tested in both simulated and real-world environments, LASMP has shown better performance in handling complex indoor scenarios. The results highlight the potential of combining language processing with motion planning to make robot navigation more efficient.
comment: 8 pages, 9 figures
Can We Remove the Ground? Obstacle-aware Point Cloud Compression for Remote Object Detection ICRA 2025
Efficient point cloud (PC) compression is crucial for streaming applications, such as augmented reality and cooperative perception. Classic PC compression techniques encode all the points in a frame. Tailoring compression towards perception tasks at the receiver side, we ask the question, "Can we remove the ground points during transmission without sacrificing the detection performance?" Our study reveals a strong dependency on the ground from state-of-the-art (SOTA) 3D object detection models, especially on those points below and around the object. In this work, we propose a lightweight obstacle-aware Pillar-based Ground Removal (PGR) algorithm. PGR filters out ground points that do not provide context to object recognition, significantly improving compression ratio without sacrificing the receiver side perception performance. Not using heavy object detection or semantic segmentation models, PGR is lightweight, highly parallelizable, and effective. Our evaluations on KITTI and Waymo Open Dataset show that SOTA detection models work equally well with PGR removing 20-30% of the points, at a speed of 86 FPS.
comment: 7 Pages; submitted to ICRA 2025
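The idea of dropping only ground points that carry no context for nearby objects can be illustrated with a toy pillar filter. This is a hedged sketch and not the PGR algorithm itself: the obstacle test (height spread inside a pillar), the 8-neighborhood used as "context", and the names `remove_isolated_ground`, `pillar_size`, and `height_thresh` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def remove_isolated_ground(points, pillar_size=1.0, height_thresh=0.3):
    """Keep points in obstacle pillars and their neighbors; drop the rest.

    points: iterable of (x, y, z) tuples.
    A pillar counts as an obstacle pillar when the z spread of its points
    exceeds height_thresh (an assumed, simplistic criterion).
    """
    # Bin points into vertical pillars on a regular x/y grid
    pillars = defaultdict(list)
    for p in points:
        key = (math.floor(p[0] / pillar_size), math.floor(p[1] / pillar_size))
        pillars[key].append(p)
    # Pillars with a large height spread are treated as containing an obstacle
    obstacle = {
        k for k, pts in pillars.items()
        if max(q[2] for q in pts) - min(q[2] for q in pts) > height_thresh
    }
    # Keep each obstacle pillar plus its 8 neighbors (ground below and around objects)
    keep = set()
    for i, j in obstacle:
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                keep.add((i + di, j + dj))
    return [p for k, pts in pillars.items() if k in keep for p in pts]
```

Each pillar is processed independently of the others, which is the property that makes this kind of filter easy to parallelize.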
Obstacle-Avoidant Leader Following with a Quadruped Robot
Personal mobile robotic assistants are expected to find wide applications in industry and healthcare. For example, people with limited mobility can benefit from robots helping with daily tasks, or construction workers can have robots perform precision monitoring tasks on-site. However, manually steering a robot while in motion requires significant concentration from the operator, especially in tight or crowded spaces. This reduces walking speed, and the constant need for vigilance increases fatigue and, thus, the risk of accidents. This work presents a virtual leash with which a robot can naturally follow an operator. We use a sensor fusion based on a custom-built RF transponder, RGB cameras, and a LiDAR. In addition, we customize a local avoidance planner for legged platforms, which enables us to navigate dynamic and narrow environments. We successfully validate on the ANYmal platform the robustness and performance of our entire pipeline in real-world experiments.
Design and Identification of Keypoint Patches in Unstructured Environments
Reliable perception of targets is crucial for the stable operation of autonomous robots. A widely preferred method is keypoint identification in an image, as it allows direct mapping from raw images to 2D coordinates, facilitating integration with other algorithms like localization and path planning. In this study, we closely examine the design and identification of keypoint patches in cluttered environments, where factors such as blur and shadows can hinder detection. We propose four simple yet distinct designs that account for variations in scale, rotation, and camera projection using a limited number of pixels. Additionally, we customize the Superpoint network to ensure robust detection under various types of image degradation. The effectiveness of our approach is demonstrated through real-world video tests, highlighting potential for vision-based autonomous systems.
comment: 12 pages, 8 figures, 7 tables
Human-Robot Collaborative Minimum Time Search through Sub-priors in Ant Colony Optimization
Human-Robot Collaboration (HRC) has evolved into a highly promising field owing to the latest breakthroughs in Artificial Intelligence (AI) and Human-Robot Interaction (HRI), among other reasons. This growth increases the need to design multi-agent algorithms that can also manage human preferences. This paper presents an extension of the Ant Colony Optimization (ACO) meta-heuristic to solve the Minimum Time Search (MTS) task, in the case where humans and robots perform an object searching task together. The proposed model consists of two main blocks. The first one is a convolutional neural network (CNN) that provides the prior probabilities about where an object may be from a segmented image. The second one is the Sub-prior MTS-ACO algorithm (SP-MTS-ACO), which takes as inputs the prior probabilities and the particular search preferences of the agents in different sub-priors to generate search plans for all agents. The model has been tested in real experiments for the joint search of an object through a Vizanti web-based visualization on a tablet computer. The designed interface allows communication between a human and our humanoid robot named IVO. The obtained results show an improvement in the search perception of the users without loss of efficiency.
A five-bar mechanism to assist finger flexion-extension movement: system implementation
The lack of specialized personnel and assistive technology to assist in rehabilitation therapies is one of the challenges facing the health sector today, and it is projected to increase. For researchers and engineers, it represents an opportunity to innovate and develop devices that improve and optimize rehabilitation services for the benefit of society. Among the different types of injuries, hand injuries occur most frequently. These injuries require a rehabilitation process in order for the hand to regain its functionality. This article presents the fabrication and instrumentation of an end-effector prototype, based on a five-bar configuration, for finger rehabilitation that executes a natural flexion-extension movement. The dimensions were obtained through gradient-method optimization and evaluated in Matlab. Experimental tests were carried out to demonstrate the prototype's functionality and the effectiveness of a five-bar mechanism acting in a vertical plane, where gravity influences the mechanism's performance. Position control using fifth-order polynomials with via points was implemented in the joint space. The design of the end-effector was also evaluated by performing a theoretical comparison, calculated as a function of a real flexion-extension trajectory of the fingers and the angle of rotation obtained through an IMU. As a result, controlling the two degrees of freedom of the mechanism at several points of the trajectory ensures the end-effector trajectory and therefore the fingers' range of motion, which supports full patient recovery.
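For readers unfamiliar with fifth-order polynomial joint trajectories, the following minimal sketch evaluates one segment between two joint positions. It assumes zero velocity and acceleration at both ends of the segment, which is the standard rest-to-rest quintic and may differ from the boundary conditions the authors use at their via points; `quintic_segment` and its parameters are hypothetical names.

```python
def quintic_segment(q0, qf, duration, t):
    """Joint position at time t on a rest-to-rest fifth-order polynomial.

    Assumes zero velocity and acceleration at t = 0 and t = duration.
    Times outside [0, duration] are clamped to the segment ends.
    """
    tau = min(max(t / duration, 0.0), 1.0)  # normalized time in [0, 1]
    # Quintic blend: s(0) = 0, s(1) = 1, with zero first and second derivatives at both ends
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return q0 + (qf - q0) * s
```

A trajectory through several via points can be approximated by chaining such segments, one per pair of consecutive via points and per joint, at the cost of stopping at each via point.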
Design and construction of a wireless robot that simulates head movements in cone beam computed tomography imaging
One of the major challenges in the science of maxillofacial radiology imaging is the various artifacts created in images taken by cone beam computed tomography (CBCT) imaging systems. Among these artifacts, motion artifact, which is created by the patient, has adverse effects on image quality. In this paper, according to the conditions and limitations of the CBCT imaging room, the goal is the design and development of a cable-driven parallel robot to create repeatable movements of a dry skull inside a CBCT scanner for studying motion artifacts and building up reference datasets with motion artifacts. The proposed robot allows a dry skull to execute motions, which were selected on the basis of clinical evidence, with 3 degrees of freedom during imaging in a manner synchronous with the radiation beam. The kinematic model of the robot is presented to investigate and describe the correlation between the amount of motion and the pulse width applied to DC motors. This robot can be controlled by the user through a smartphone or laptop wirelessly via a Wi-Fi connection. Using wireless communication protects the user from harmful radiation during robot driving and functioning. The results show that the designed robot has a reproducibility above 95% in performing various movements.
Learning Adaptive Hydrodynamic Models Using Neural ODEs in Complex Conditions
Reinforcement learning-based quadruped robots excel across various terrains but still lack the ability to swim in water due to the complex underwater environment. This paper presents the development and evaluation of a data-driven hydrodynamic model for amphibious quadruped robots, aiming to enhance their adaptive capabilities in complex and dynamic underwater environments. The proposed model leverages Neural Ordinary Differential Equations (ODEs) combined with attention mechanisms to accurately process and interpret real-time sensor data. The model enables the quadruped robots to understand and predict complex environmental patterns, facilitating robust decision-making strategies. We harness real-time sensor data, capturing various environmental and internal state parameters to train and evaluate our model. A significant focus of our evaluation involves testing the quadruped robot's performance across different hydrodynamic conditions and assessing its capabilities at varying speeds and fluid dynamic conditions. The outcomes suggest that the model can effectively learn and adapt to varying conditions, enabling the prediction of force states and enhancing autonomous robotic behaviors in various practical scenarios.
comment: 8 pages, 7 figures
RobotGraffiti: An AR tool for semi-automated construction of workcell models to optimize robot deployment IROS 2024
Improving robot deployment is a central step towards speeding up robot-based automation in manufacturing. A main challenge in robot deployment is how to best place the robot within the workcell. To tackle this challenge, we combine two knowledge sources: robotic knowledge of the system and workcell context awareness of the user, and intersect them with an Augmented Reality interface. RobotGraffiti is a unique tool that empowers the user in robot deployment tasks. One simply takes a 3D scan of the workcell with their mobile device, adds contextual data points that otherwise would be difficult to infer from the system, and receives a robot base position that satisfies the automation task. The proposed approach is an alternative to expensive and time-consuming digital twins, with a fast and easy-to-use tool that focuses on selected workcell features needed to run the placement optimization algorithm. The main contributions of this paper are the novel user interface for robot base placement data collection and a study comparing the traditional offline simulation with our proposed method. We showcase the method with a robot base placement solution and obtain up to 16 times reduction in time.
comment: Accepted in IROS 2024
Fast Hip Joint Moment Estimation with A General Moment Feature Generation Method
The hip joint moment during walking is a crucial basis for hip exoskeleton control. Compared to generating assistive torque profiles based on gait estimation, estimating hip joint moment directly using hip joint angles offers advantages such as simplified sensing and adaptability to variable walking speeds. Existing methods that directly estimate moment from hip joint angles are mainly used for offline biomechanical estimation. However, they suffer from long computation time and lack of personalization, rendering them unsuitable for personalized control of hip exoskeletons. To address these challenges, this paper proposes a fast hip joint moment estimation method based on generalized moment features (GMF). The method first employs a GMF generator to learn a feature representation of joint moment, namely the proposed GMF, which is independent of individual differences. Subsequently, a GRU-based neural network with fast computational performance is trained to learn the mapping from the joint kinematics to the GMF. Finally, the predicted GMF is decoded into the joint moment with a GMF decoder. The joint estimation model is trained and tested on a dataset comprising 20 subjects under 28 walking speed conditions. Results show that the proposed method achieves a root mean square error of 0.1180 $\pm$ 0.0021 Nm/kg for subjects in the test dataset, and the computation time per estimation using the employed GRU-based estimator is 1.3420 $\pm$ 0.0031 ms, significantly faster than mainstream neural network architectures, while maintaining comparable network accuracy. These promising results demonstrate that the proposed method enhances the accuracy and computational speed of joint moment estimation neural networks, with potential for guiding exoskeleton control.
Task Success Prediction for Open-Vocabulary Manipulation Based on Multi-Level Aligned Representations
In this study, we consider the problem of predicting task success for open-vocabulary manipulation by a manipulator, based on instruction sentences and egocentric images before and after manipulation. Conventional approaches, including multimodal large language models (MLLMs), often fail to appropriately understand detailed characteristics of objects and/or subtle changes in the position of objects. We propose Contrastive $\lambda$-Repformer, which predicts task success for table-top manipulation tasks by aligning images with instruction sentences. Our method integrates the following three key types of features into a multi-level aligned representation: features that preserve local image information; features aligned with natural language; and features structured through natural language. This allows the model to focus on important changes by looking at the differences in the representation between two images. We evaluate Contrastive $\lambda$-Repformer on a dataset based on a large-scale standard dataset, the RT-1 dataset, and on a physical robot platform. The results show that our approach outperformed existing approaches including MLLMs. Our best model achieved an improvement of 8.66 points in accuracy compared to the representative MLLM-based model.
comment: Accepted for presentation at CoRL2024
Deceptive Risks in LLM-enhanced Robots
This case study investigates a critical glitch in the integration of Large Language Models (LLMs) into social robots. LLMs, including ChatGPT, were found to falsely claim to have reminder functionalities, such as setting notifications for medication intake. We tested commercially available care software, which integrated ChatGPT, running on the Pepper robot and consistently reproduced this deceptive pattern. Not only did the system falsely claim the ability to set reminders, but it also proactively suggested managing medication schedules. The persistence of this issue presents a significant risk in healthcare settings, where system reliability is paramount. This case highlights the ethical and safety concerns surrounding the deployment of LLM-integrated robots in healthcare, emphasizing the urgent need for regulatory oversight to prevent potentially harmful consequences for vulnerable populations.
ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI
Simulation has enabled unprecedented compute-scalable approaches to robot learning. However, many existing simulation frameworks typically support a narrow range of scenes/tasks and lack features critical for scaling generalizable robotics and sim2real. We introduce and open source ManiSkill3, the fastest state-visual GPU parallelized robotics simulator with contact-rich physics targeting generalizable manipulation. ManiSkill3 supports GPU parallelization of many aspects including simulation+rendering, heterogeneous simulation, pointclouds/voxels visual input, and more. Simulation with rendering on ManiSkill3 can run 10-1000x faster with 2-3x less GPU memory usage than other platforms, achieving up to 30,000+ FPS in benchmarked environments due to minimal Python/PyTorch overhead in the system, simulation on the GPU, and the use of the SAPIEN parallel rendering system. Tasks that used to take hours to train can now take minutes. We further provide the most comprehensive range of GPU parallelized environments/tasks spanning 12 distinct domains including but not limited to mobile manipulation for tasks such as drawing, humanoids, and dextrous manipulation in realistic scenes designed by artists or real-world digital twins. In addition, millions of demonstration frames are provided from motion planning, RL, and teleoperation. ManiSkill3 also provides a comprehensive set of baselines that span popular RL and learning-from-demonstrations algorithms.
comment: Project website: http://maniskill.ai/
Find Everything: A General Vision Language Model Approach to Multi-Object Search ICRA2025
The Multi-Object Search (MOS) problem involves navigating to a sequence of locations to maximize the likelihood of finding target objects while minimizing travel costs. In this paper, we introduce a novel approach to the MOS problem, called Finder, which leverages vision language models (VLMs) to locate multiple objects across diverse environments. Specifically, our approach introduces multi-channel score maps to track and reason about multiple objects simultaneously during navigation, along with a score fusion technique that combines scene-level and object-level semantic correlations. Experiments in both simulated and real-world settings showed that Finder outperforms existing methods using deep reinforcement learning and VLMs. Ablation and scalability studies further validated our design choices and robustness with increasing numbers of target objects, respectively. Website: https://find-all-my-things.github.io/
comment: 12 pages, 6 figures, submitted to ICRA2025
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
Robotic manipulation in open-world settings requires not only task execution but also the ability to detect and learn from failures. While recent advances in vision-language models (VLMs) and large language models (LLMs) have improved robots' spatial reasoning and problem-solving abilities, they still struggle with failure recognition, limiting their real-world applicability. We introduce AHA, an open-source VLM designed to detect and reason about failures in robotic manipulation using natural language. By framing failure detection as a free-form reasoning task, AHA identifies failures and provides detailed, adaptable explanations across different robots, tasks, and environments. We fine-tuned AHA using FailGen, a scalable framework that generates the first large-scale dataset of robotic failure trajectories, the AHA dataset. FailGen achieves this by procedurally perturbing successful demonstrations from simulation. Despite being trained solely on the AHA dataset, AHA generalizes effectively to real-world failure datasets, robotic systems, and unseen tasks. It surpasses the second-best model (GPT-4o in-context learning) by 10.3% and exceeds the average performance of six compared models including five state-of-the-art VLMs by 35.3% across multiple metrics and datasets. We integrate AHA into three manipulation frameworks that utilize LLMs/VLMs for reinforcement learning, task and motion planning, and zero-shot trajectory generation. AHA's failure feedback enhances these policies' performances by refining dense reward functions, optimizing task planning, and improving sub-task verification, boosting task success rates by an average of 21.4% across all three tasks compared to GPT-4 models.
comment: Appendix and details can be found in project website: https://aha-vlm.github.io/
AARK: An Open Toolkit for Autonomous Racing Research
Autonomous racing demands safe control of vehicles at their physical limits for extended periods of time, providing insights into advanced vehicle safety systems which increasingly rely on intervention provided by vehicle autonomy. Participation in this field carries with it a high barrier to entry. Physical platforms and their associated sensor suites require large capital outlays before any demonstrable progress can be made. Simulators allow researchers to develop soft autonomous systems without purchasing a platform. However, currently available simulators lack visual and dynamic fidelity, can still be expensive to buy, lack customisation, and are difficult to use. AARK provides three packages: ACI, ACDG, and ACMPC. These packages enable research into autonomous control systems in the demanding environment of racing, to bring more people into the field and improve reproducibility: ACI provides researchers with a computer vision-friendly interface to Assetto Corsa for convenient comparison and evaluation of autonomous control solutions; ACDG enables generation of depth, normal and semantic segmentation data for training computer vision models to use in perception systems; and ACMPC gives newcomers to the field a modular, full-stack autonomous control solution, capable of controlling vehicles, to build from. AARK aims to unify and democratise research into a field critical to providing safer roads and trusted autonomous systems.
comment: 7 pages, 5 figures
A Digital Twin Framework for Physical-Virtual Integration in V2X-Enabled Connected Vehicle Corridors
Transportation Cyber-Physical Systems (T-CPS) are critical in improving traffic safety, reliability, and sustainability by integrating computing, communication, and control in transportation systems. The connected vehicle corridor is at the forefront of this transformation, where Cellular Vehicle-to-Everything (C-V2X) technology facilitates real-time data exchange between infrastructure, vehicles, and road users. However, challenges remain in processing and synchronizing the vast V2X data from vehicles and roadside units, particularly when ensuring scalability, data integrity, and operational resilience. This paper presents a digital twin framework for T-CPS, developed from a real-world connected vehicle corridor to address these challenges. By leveraging C-V2X technology and real-time data from infrastructure, vehicles, and road users, the digital twin accurately replicates vehicle behaviors, signal phases, and traffic patterns within the CARLA simulation environment. This framework demonstrates high fidelity between physical and digital systems and ensures robust synchronization of vehicle trajectories and signal phases through extensive experiments. Moreover, the digital twin's scalable and redundant architecture enhances data integrity, making it capable of supporting future large-scale C-V2X deployments. The digital twin is a vital tool in T-CPS, enabling real-time traffic monitoring, prediction, and optimization to enhance the reliability and safety of transportation systems.
Data Augmentation for 3DMM-based Arousal-Valence Prediction for HRI
Humans use multiple communication channels to interact with each other. For instance, body gestures or facial expressions are commonly used to convey an intent. The use of such non-verbal cues has motivated the development of prediction models. One such approach is predicting arousal and valence (AV) from facial expressions. However, making these models accurate for human-robot interaction (HRI) settings is challenging as it requires handling multiple subjects, challenging conditions, and a wide range of facial expressions. In this paper, we propose a data augmentation (DA) technique to improve the performance of AV predictors using 3D morphable models (3DMM). We then utilize this approach in an HRI setting with a mediator robot and a group of three humans. Our augmentation method creates synthetic sequences for underrepresented values in the AV space of the SEWA dataset, which is the most comprehensive dataset with continuous AV labels. Results show that using our DA method improves the accuracy and robustness of AV prediction in real-time applications. The accuracy of our models on the SEWA dataset is 0.793 for arousal and valence.
RRT-CBF Based Motion Planning
Control barrier functions (CBFs) have recently been widely explored for enforcing safety-critical constraints on nonlinear systems. Many researchers have incorporated control barrier functions into path planning algorithms to find a safe path, but these methods involve high computational complexity or unidirectional randomness, which increases run-time. When safety constraints are satisfied, search efficiency and search space are sacrificed. This paper combines a novel motion planning approach based on the rapidly-exploring random tree (RRT) algorithm with model predictive control (MPC) to enforce the CBF with dynamically updating constraints. The resulting safety-critical trajectory keeps robots from colliding with static and dynamic circular obstacles, as well as with other moving robots, while accounting for model uncertainty. In addition, this paper is the first to apply CBF-RRT to a robot arm model as a nonlinear system.
comment: 20 pages, 25 figures
Bayesian Intention for Enhanced Human Robot Collaboration
Predicting human intent is challenging yet essential to achieving seamless Human-Robot Collaboration (HRC). Many existing approaches fail to fully exploit the inherent relationships between objects, tasks, and the human model. Current methods for predicting human intent, such as Gaussian Mixture Models (GMMs) and Conditional Random Fields (CRFs), often lack interpretability due to their failure to account for causal relationships between variables. To address these challenges, we developed a novel Bayesian Intention (BI) framework to predict human intent within a multi-modality information framework in HRC scenarios. This framework captures the complexity of intent prediction by modeling the correlations between human behavior conventions and scene data. Our framework leverages these inferred intent predictions to optimize the robot's response in real time, enabling smoother and more intuitive collaboration. We demonstrate the effectiveness of our approach through an HRC task involving a UR5 robot, highlighting BI's capability for real-time human intent prediction and collision avoidance using a unique dataset we created. Our evaluations show that the multi-modality BI model predicts human intent within 2.69 ms, with a 36% increase in precision, a 60% increase in F1 Score, and an 85% increase in accuracy compared to its best baseline method. The results underscore BI's potential to advance real-time human intent prediction and collision avoidance, making a significant contribution to the field of HRC.
On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability
Recent advancements in Large Language Models (LLMs) have showcased their ability to perform complex reasoning tasks, but their effectiveness in planning remains underexplored. In this study, we evaluate the planning capabilities of OpenAI's o1 models across a variety of benchmark tasks, focusing on three key aspects: feasibility, optimality, and generalizability. Through empirical evaluations on constraint-heavy tasks (e.g., $\textit{Barman}$, $\textit{Tyreworld}$) and spatially complex environments (e.g., $\textit{Termes}$, $\textit{Floortile}$), we highlight o1-preview's strengths in self-evaluation and constraint-following, while also identifying bottlenecks in decision-making and memory management, particularly in tasks requiring robust spatial reasoning. Our results reveal that o1-preview outperforms GPT-4 in adhering to task constraints and managing state transitions in structured environments. However, the model often generates suboptimal solutions with redundant actions and struggles to generalize effectively in spatially complex tasks. This pilot study provides foundational insights into the planning limitations of LLMs, offering key directions for future research on improving memory management, decision-making, and generalization in LLM-based planning.
comment: Updated link to code repository
iWalker: Imperative Visual Planning for Walking Humanoid Robot
Humanoid robots, with the potential to perform a broad range of tasks in environments designed for humans, have been deemed a crucial basis for general AI agents. In planning and control, although traditional models and task-specific methods have been extensively studied over the past few decades, they are inadequate for achieving the flexibility and versatility needed for general autonomy. Learning approaches, especially reinforcement learning, are powerful and popular nowadays, but they are inherently "blind" during training, relying heavily on trials in simulation without proper guidance from physical principles or underlying dynamics. In response, we propose a novel end-to-end pipeline that seamlessly integrates perception, planning, and model-based control for humanoid robot walking. We refer to our method as iWalker, which is driven by imperative learning (IL), a self-supervising neuro-symbolic learning framework. This enables the robot to learn from arbitrary unlabeled data, significantly improving its adaptability and generalization capabilities. In experiments, iWalker demonstrates effectiveness in both simulated and real-world environments, representing a significant advancement toward versatile and autonomous humanoid robots.
Learn With Imagination: Safe Set Guided State-wise Constrained Policy Optimization
Deep reinforcement learning (RL) excels in various control tasks, yet the absence of safety guarantees hampers its real-world applicability. In particular, exploration during learning usually results in safety violations, while the RL agent learns from those mistakes. On the other hand, safe control techniques ensure persistent safety satisfaction but demand strong priors on system dynamics, which are usually hard to obtain in practice. To address these problems, we present Safe Set Guided State-wise Constrained Policy Optimization (S-3PO), a pioneering algorithm generating state-wise safe optimal policies with zero training violations, i.e., learning without mistakes. S-3PO first employs a safety-oriented monitor with black-box dynamics to ensure safe exploration. It then enforces an "imaginary" cost for the RL agent to converge to optimal behaviors within safety constraints. S-3PO outperforms existing methods in high-dimensional robotics tasks, managing state-wise constraints with zero training violations. This innovation marks a significant stride towards real-world safe RL deployment.
Redefining Data Pairing for Motion Retargeting Leveraging a Human Body Prior IROS 2024
We propose MR HuBo (Motion Retargeting leveraging a HUman BOdy prior), a cost-effective and convenient method to collect high-quality upper body paired pose data, which is essential for data-driven motion retargeting methods. Unlike existing approaches, which collect pose data by converting human MoCap poses into robot poses, our method goes in reverse. We first sample diverse random robot poses, and then convert them into human poses. However, since random robot poses can result in extreme and infeasible human poses, we propose an additional technique to filter out extreme poses by exploiting a human body prior trained from a large amount of human pose data. Our data collection method can be used for any humanoid robot, provided one designs or optimizes the system's hyperparameters, which include a size scale factor and the joint angle ranges for sampling. In addition to this data collection method, we also present a two-stage motion retargeting neural network that can be trained via supervised learning on a large amount of paired data. Compared to other learning-based methods trained via unsupervised learning, we found that our deep neural network trained with ample high-quality paired data achieved notable performance. Our experiments also show that our data filtering method yields better retargeting results than training the model with raw and noisy data. Our code and video results are available at https://sites.google.com/view/mr-hubo/
comment: 8 pages, 5 Figures, Accepted at IROS 2024
Camera Height Doesn't Change: Unsupervised Training for Metric Monocular Road-Scene Depth Estimation ECCV 2024
In this paper, we introduce a novel training method for making any monocular depth network learn absolute scale and estimate metric road-scene depth just from regular training data, i.e., driving videos. We refer to this training framework as FUMET. The key idea is to leverage cars found on the road as sources of scale supervision and to incorporate them in network training robustly. FUMET detects and estimates the sizes of cars in a frame and aggregates scale information extracted from them into an estimate of the camera height whose consistency across the entire video sequence is enforced as scale supervision. This realizes robust unsupervised training of any, otherwise scale-oblivious, monocular depth network so that they become not only scale-aware but also metric-accurate without the need for auxiliary sensors and extra supervision. Extensive experiments on the KITTI and the Cityscapes datasets show the effectiveness of FUMET, which achieves state-of-the-art accuracy. We also show that FUMET enables training on mixed datasets of different camera heights, which leads to larger-scale training and better generalization. Metric depth reconstruction is essential in any road-scene visual modeling, and FUMET democratizes its deployment by establishing the means to convert any model into a metric depth estimator.
comment: ECCV 2024. Project page: https://vision.ist.i.kyoto-u.ac.jp/research/fumet/
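The scale-supervision idea in the FUMET abstract above can be illustrated with a small numerical sketch. This is not the authors' implementation: the function names, the use of a median for aggregation, and the L1 consistency penalty are all assumptions chosen for illustration. It only shows how car-size ratios can rescale an unscaled camera height, and how deviation from a sequence-wide height can act as a training signal.

```python
import numpy as np

def metric_camera_height(unscaled_cam_height, predicted_car_sizes, prior_car_sizes):
    """Rescale a camera height measured in the network's arbitrary depth units.

    Hypothetical helper: each detected car gives one scale ratio
    (prior metric size / size measured in the predicted depth map);
    the median ratio is used as a robust per-frame scale.
    """
    ratios = np.asarray(prior_car_sizes, dtype=float) / np.asarray(predicted_car_sizes, dtype=float)
    return unscaled_cam_height * np.median(ratios)

def height_consistency_loss(per_frame_heights):
    """Mean absolute deviation of per-frame heights from the sequence median.

    Assumption: the camera is rigidly mounted, so its metric height
    should be constant over the whole video sequence.
    """
    h = np.asarray(per_frame_heights, dtype=float)
    return np.abs(h - np.median(h)).mean()
```

In a real training loop the loss would be computed on differentiable tensors; NumPy is used here only to keep the arithmetic visible.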
Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning
Autonomous robots are being employed in several mapping and data collection tasks due to their efficiency and low labor costs. In these tasks, the robots are required to map targets-of-interest in an unknown environment while constrained to a given resource budget such as path length or mission time. This is a challenging problem as each robot not only has to detect and avoid collisions with static obstacles in the environment but also has to model other robots' trajectories to avoid inter-robot collisions. We propose a novel deep reinforcement learning approach for multi-robot informative path planning to map targets-of-interest in an unknown 3D environment. A key aspect of our approach is an augmented graph that models other robots' trajectories to enable planning for communication and inter-robot collision avoidance. We train our decentralized reinforcement learning policy via the centralized training and decentralized execution paradigm. Once trained, our policy is also scalable to varying numbers of robots and does not require re-training. Our approach outperforms other state-of-the-art multi-robot target mapping approaches by 33.75% in terms of the number of discovered targets-of-interest. We open-source our code and model at: https://github.com/AccGen99/marl_ipp
Short vs. Long-term Coordination of Drones: When Distributed Optimization Meets Deep Reinforcement Learning
Swarms of autonomous interactive drones can provide compelling sensing capabilities in Smart City applications, such as traffic monitoring. This paper focuses on the task assignment problem for large-scale spatio-temporal sensing by a drone swarm. However, existing approaches have distinct challenges: distributed evolutionary optimization, such as collective learning, lacks long-term adaptability in dynamic environments, while deep reinforcement learning (DRL) struggles to scale effectively due to the curse of dimensionality. Therefore, this paper proposes a novel synergetic optimization approach by integrating long-term DRL and short-term collective learning. Through this approach, each drone independently and proactively determines its flying direction and recharging location using DRL, while evolving its navigation and sensing policies through collective learning based on a structured tree communication model. Extensive experiments with datasets generated from realistic urban mobility demonstrate an outstanding performance of the proposed solution in complex scenarios. New insights show that this approach provides a win-win synthesis of short-term and long-term strategies for drone-based traffic monitoring, with short-term methods addressing training complexity and energy management, while long-term methods preserve high sensing performance.
Observe Then Act: Asynchronous Active Vision-Action Model for Robotic Manipulation
In real-world scenarios, many robotic manipulation tasks are hindered by occlusions and limited fields of view, posing significant challenges for passive observation-based models that rely on fixed or wrist-mounted cameras. In this paper, we investigate the problem of robotic manipulation under limited visual observation and propose a task-driven asynchronous active vision-action model. Our model serially connects a camera Next-Best-View (NBV) policy with a gripper Next-Best-Pose (NBP) policy, and trains them in a sensor-motor coordination framework using few-shot reinforcement learning. This approach allows the agent to adjust a third-person camera to actively observe the environment based on the task goal, and subsequently infer the appropriate manipulation actions. We trained and evaluated our model on 8 viewpoint-constrained tasks in RLBench. The results demonstrate that our model consistently outperforms baseline algorithms, showcasing its effectiveness in handling visual constraints in manipulation tasks.
HortiBot: An Adaptive Multi-Arm System for Robotic Horticulture of Sweet Peppers IROS
Horticultural tasks such as pruning and selective harvesting are labor intensive and horticultural staff are hard to find. Automating these tasks is challenging due to the semi-structured greenhouse workspaces, changing environmental conditions such as lighting, dense plant growth with many occlusions, and the need for gentle manipulation of non-rigid plant organs. In this work, we present the three-armed system HortiBot, with two arms for manipulation and a third arm as an articulated head for active perception using stereo cameras. Its perception system detects not only peppers, but also peduncles and stems in real time, and performs online data association to build a world model of pepper plants. Collision-aware online trajectory generation allows all three arms to safely track their respective targets for observation, grasping, and cutting. We integrated perception and manipulation to perform selective harvesting of peppers and evaluated the system in lab experiments. Using active perception coupled with end-effector force torque sensing for compliant manipulation, HortiBot achieves high success rates in our indoor pepper plant mock-up.
comment: Accepted for International Conference on Intelligent Robots and Systems (IROS) 2024. C. Lenz and R. Menon contributed equally
HOLA-Drone: Hypergraphic Open-ended Learning for Zero-Shot Multi-Drone Cooperative Pursuit
Zero-shot coordination (ZSC) is a significant challenge in multi-agent collaboration, aiming to develop agents that can coordinate with partners they have not encountered before. Recent cutting-edge ZSC methods have primarily focused on two-player video games such as OverCooked!2 and Hanabi. In this paper, we extend the scope of ZSC research to the multi-drone cooperative pursuit scenario, exploring how to construct a drone agent capable of coordinating with multiple unseen partners to capture multiple evaders. We propose a novel Hypergraphic Open-ended Learning Algorithm (HOLA-Drone) that continuously adapts the learning objective based on our hypergraphic-form game modeling, aiming to improve cooperative abilities with multiple unknown drone teammates. To empirically verify the effectiveness of HOLA-Drone, we build two different unseen drone teammate pools to evaluate its performance in coordination with various unseen partners. The experimental results demonstrate that HOLA-Drone outperforms the baseline methods in coordination with unseen drone teammates. Furthermore, real-world experiments validate the feasibility of HOLA-Drone in physical systems. Videos can be found on the project homepage: https://sites.google.com/view/hola-drone
comment: 10 pages
Whale Detection Enhancement through Synthetic Satellite Images
With a number of marine populations in rapid decline, collecting and analyzing data about marine populations has become increasingly important to develop effective conservation policies for a wide range of marine animals, including whales. Modern computer vision algorithms allow us to detect whales in images in a wide range of domains, further speeding up and enhancing the monitoring process. However, these algorithms rely heavily on large training datasets, which are challenging and time-consuming to collect, particularly in marine or aquatic environments. Recent advances in AI, however, have made it possible to synthetically create datasets for training machine learning algorithms, thus enabling new solutions that were not possible before. In this work, we present the SeaDroneSim2 benchmark suite, which addresses this challenge by generating synthetic aerial and satellite image datasets to improve the detection of whales and reduce the effort required for training data collection. We show that we can achieve a 15% performance boost on whale detection, compared to using the real data alone for training, by augmenting 10% real data. We open source both the code of the simulation platform SeaDroneSim2 and the dataset generated through it.
PointNetPGAP-SLC: A 3D LiDAR-based Place Recognition Approach with Segment-level Consistency Training for Mobile Robots in Horticulture
3D LiDAR-based place recognition remains largely underexplored in horticultural environments, which present unique challenges due to their semi-permeable nature to laser beams. This characteristic often results in highly similar LiDAR scans from adjacent rows, leading to descriptor ambiguity and, consequently, compromised retrieval performance. In this work, we address the challenges of 3D LiDAR place recognition in horticultural environments, particularly focusing on inter-row ambiguity by introducing three key contributions: (i) a novel model, PointNetPGAP, which combines the outputs of two statistically-inspired aggregators into a single descriptor; (ii) a Segment-Level Consistency (SLC) model, used exclusively during training to enhance descriptor robustness; and (iii) the HORTO-3DLM dataset, comprising LiDAR sequences from orchards and strawberry fields. Experimental evaluations conducted on the HORTO-3DLM and KITTI Odometry datasets demonstrate that PointNetPGAP outperforms state-of-the-art models, including OverlapTransformer and PointNetVLAD, particularly when the SLC model is applied. These results underscore the model's superiority, especially in horticultural environments, by significantly improving retrieval performance in segments with higher ambiguity.
comment: This preprint has been accepted for publication in IEEE Robotics and Automation Letters
FlightBench: Benchmarking Learning-based Methods for Ego-vision-based Quadrotors Navigation
Ego-vision-based navigation in cluttered environments is crucial for mobile systems, particularly agile quadrotors. While learning-based methods have shown promise recently, head-to-head comparisons with cutting-edge optimization-based approaches are scarce, leaving open the question of where and to what extent they truly excel. In this paper, we introduce FlightBench, the first comprehensive benchmark that implements various learning-based methods for ego-vision-based navigation and evaluates them against mainstream optimization-based baselines using a broad set of performance metrics. Additionally, we develop a suite of criteria to assess scenario difficulty and design test cases that span different levels of difficulty based on these criteria. Our results show that while learning-based methods excel in high-speed flight and faster inference, they struggle with challenging scenarios like sharp corners or view occlusion. Analytical experiments validate the correlation between our difficulty criteria and flight performance. We hope this benchmark and these criteria will drive future advancements in learning-based navigation for ego-vision quadrotors. The source code and documentation are available at https://github.com/thu-uav/FlightBench.
comment: The first three authors contribute equally
Kinodynamic Motion Planning for a Team of Multirotors Transporting a Cable-Suspended Payload in Cluttered Environments IROS
We propose a motion planner for cable-driven payload transportation using multiple unmanned aerial vehicles (UAVs) in an environment cluttered with obstacles. Our planner is kinodynamic, i.e., it considers the full dynamics model of the transporting system including actuation constraints. Due to the high dimensionality of the planning problem, we use a hierarchical approach where we first solve the geometric motion planning using a sampling-based method with a novel sampler, followed by constrained trajectory optimization that considers the full dynamics of the system. Both planning stages consider inter-robot and robot/obstacle collisions. We demonstrate in a software-in-the-loop simulation and real flight experiments that there is a significant benefit in kinodynamic motion planning for such payload transport systems with respect to payload tracking error and energy consumption compared to the standard methods of planning for the payload alone. Notably, we observe a significantly higher success rate in scenarios where the team formation changes are needed to move through tight spaces.
comment: Accepted by IROS, 2024
Multi-Agent Obstacle Avoidance using Velocity Obstacles and Control Barrier Functions
Velocity Obstacles (VO) methods form a paradigm for collision avoidance strategies among moving obstacles and agents. While VO methods perform well in simple multi-agent environments, they don't guarantee safety and can show overly conservative behavior in common situations. In this paper, we propose to combine a VO-strategy for guidance with a CBF-approach for safety, which overcomes the overly conservative behavior of VOs and formally guarantees safety. We validate our method in a baseline comparison study, using 2nd order integrator and car-like dynamics. Results support that our method outperforms the baselines w.r.t. path smoothness, collision avoidance, and success rates.
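The guidance-plus-safety split described in the abstract above can be sketched for the simplest case. This is an illustrative example only, not the paper's controller: it assumes a first-order integrator (the paper uses 2nd order integrator and car-like dynamics), a single static circular obstacle, and a barrier h(x) = ||x - x_o||^2 - r^2. The nominal velocity `u_nom` stands in for whatever a VO-based guidance layer would propose, and the single-constraint quadratic program is solved in closed form.

```python
import numpy as np

def cbf_filter(x, u_nom, obstacle, radius, alpha=1.0):
    """Minimally modify u_nom so that dh/dt + alpha * h >= 0 holds.

    Assumes dynamics x_dot = u and that x is not exactly at the
    obstacle centre (otherwise the gradient below is zero).
    """
    d = x - obstacle
    h = d @ d - radius**2           # barrier value: h >= 0 means safe
    a = 2.0 * d                     # gradient of h with respect to x
    slack = a @ u_nom + alpha * h   # CBF condition evaluated at u_nom
    if slack >= 0.0:
        return u_nom                # guidance command is already safe
    # Closed-form projection of u_nom onto the half-space a.u + alpha*h >= 0.
    return u_nom - slack * a / (a @ a)

x = np.array([2.0, 0.0])
u_safe = cbf_filter(x, np.array([-5.0, 0.0]), np.array([0.0, 0.0]), 1.0)
```

With several obstacles or input limits the projection no longer has this closed form, and a QP solver would be needed instead.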
PUMA: Deep Metric Imitation Learning for Stable Motion Primitives
Imitation Learning (IL) is a powerful technique for intuitive robotic programming. However, ensuring the reliability of learned behaviors remains a challenge. In the context of reaching motions, a robot should consistently reach its goal, regardless of its initial conditions. To meet this requirement, IL methods often employ specialized function approximators that guarantee this property by construction. Although effective, these approaches come with a set of limitations: 1) they are unable to fully exploit the capabilities of modern Deep Neural Network (DNN) architectures, 2) some are restricted in the family of motions they can model, resulting in suboptimal IL capabilities, and 3) they require explicit extensions to account for the geometry of motions that consider orientations. To address these challenges, we introduce a novel stability loss function, drawing inspiration from the triplet loss used in the deep metric learning literature. This loss does not constrain the DNN's architecture and enables learning policies that yield accurate results. Furthermore, it is not restricted to a specific state space geometry; therefore, it can easily incorporate the geometry of the robot's state space. We provide a proof of the stability properties induced by this loss and empirically validate our method in various settings. These settings include Euclidean and non-Euclidean state spaces, as well as first-order and second-order motions, both in simulation and with real robots. More details about the experimental results can be found in: https://youtu.be/ZWKLGntCI6w.
comment: 21 pages, 15 figures, 4 tables
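A triplet-style stability loss of the kind the PUMA abstract describes can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact formulation: the goal plays the role of the anchor, the later state the positive, and the earlier state the negative, so the loss penalises any step along a rollout that fails to move closer to the goal by at least a margin. The function name, the Euclidean distance, and the margin value are all hypothetical.

```python
import numpy as np

def stability_loss(latent_traj, latent_goal, margin=1e-3):
    """Hinge penalty on rollout steps that do not approach the goal.

    latent_traj: array of shape (T, d), states along one rollout.
    latent_goal: array of shape (d,), the goal in the same space.
    """
    dist = np.linalg.norm(latent_traj - latent_goal, axis=1)  # shape (T,)
    # Triplet form: d(goal, later state) + margin <= d(goal, earlier state).
    violation = margin + dist[1:] - dist[:-1]
    return np.maximum(violation, 0.0).mean()

goal = np.zeros(2)
converging = np.array([[3.0, 0.0], [2.0, 0.0], [1.0, 0.0]])
diverging = np.array([[1.0, 0.0], [2.0, 0.0]])
```

Because the penalty depends only on a distance function, a non-Euclidean state space could be handled by swapping in the appropriate metric, which is the flexibility the abstract points to.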
Velocity Driven Vision: Asynchronous Sensor Fusion Birds Eye View Models for Autonomous Vehicles
Fusing different sensor modalities can be a difficult task, particularly if they are asynchronous. Asynchronisation may arise due to long processing times or improper synchronisation during calibration, and there must exist a way to still utilise this previous information for the purpose of safe driving and object detection in ego-vehicle/multi-agent trajectory prediction. Difficulties arise from the fact that the sensor modalities have captured information at different times and also at different positions in space. Therefore, they are neither spatially nor temporally aligned. This paper investigates the challenge of radar and LiDAR sensors being asynchronous relative to the camera sensors, for various time latencies. The spatial alignment is resolved before lifting into BEV space by transforming the radar/LiDAR point clouds into the new ego frame coordinate system. Only after this can we concatenate the radar/LiDAR point cloud and lifted camera features. Temporal alignment is remedied for radar data only: we implement a novel method of inferring the future radar point positions using the velocity information. Our approach to resolving the issue of sensor asynchrony yields promising results. We demonstrate that velocity information can drastically improve IoU for asynchronous datasets: for a time latency of 360 milliseconds (ms), IoU improves from 49.54 to 53.63. Additionally, for a time latency of 550 ms, the camera+radar (C+R) model outperforms the camera+LiDAR (C+L) model by 0.18 IoU. This is an advancement in utilising the often-neglected radar sensor modality, which is less favoured than LiDAR for autonomous driving purposes.
comment: This paper is a preprint of a paper submitted to the 26th Irish Machine Vision and Image Processing Conference (IMVIP 2024). If accepted, the copy of record will be available at IET Digital Library
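The temporal alignment step described in the abstract can be pictured with a minimal sketch. It assumes a constant-velocity model and 2D points already expressed in the ego frame; the `RadarPoint` type and the `extrapolate` function are hypothetical names for illustration, and the abstract does not state that the paper's method takes exactly this form.

```python
from dataclasses import dataclass


@dataclass
class RadarPoint:
    x: float   # position in metres, ego frame
    y: float
    vx: float  # velocity in m/s (assumed already compensated for ego motion)
    vy: float


def extrapolate(points, latency_s):
    """Shift each radar point along its velocity by the sensor latency.

    Assumption: constant velocity over the latency interval.
    """
    return [
        RadarPoint(p.x + p.vx * latency_s, p.y + p.vy * latency_s, p.vx, p.vy)
        for p in points
    ]


# Two stale radar returns, moved forward by a 360 ms latency.
stale = [RadarPoint(10.0, 2.0, -5.0, 0.0), RadarPoint(20.0, -1.0, 0.0, 1.0)]
aligned = extrapolate(stale, latency_s=0.36)
```

A constant-velocity shift of this kind is only applicable to sensors that report per-point velocity, which is consistent with the abstract applying temporal alignment to radar data only.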
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Building general-purpose robots that operate seamlessly in any environment, with any object, and utilizing various skills to complete diverse tasks has been a long-standing goal in Artificial Intelligence. However, as a community, we have been constraining most robotic systems by designing them for specific tasks, training them on specific datasets, and deploying them within specific environments. These systems require extensively-labeled data and task-specific models. When deployed in real-world scenarios, such systems face several generalization issues and struggle to remain robust to distribution shifts. Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how these existing foundation models from NLP and CV can be applied to the field of general-purpose robotics, and also exploring (ii) what a robotics-specific foundation model would look like. We begin by providing a generalized formulation of how foundation models are used in robotics, and the fundamental barriers to making generalist robots universally applicable. Next, we establish a taxonomy to discuss current work exploring ways to leverage existing foundation models for robotics and develop ones catered to robotics. Finally, we discuss key challenges and promising future directions in using foundation models for enabling general-purpose robotic systems. We encourage readers to view our living GitHub repository of resources, including papers reviewed in this survey, as well as related projects and repositories for developing foundation models for robotics.
DROP: Dexterous Reorientation via Online Planning ICRA 2025
Achieving human-like dexterity is a longstanding challenge in robotics, in part due to the complexity of planning and control for contact-rich systems. In reinforcement learning (RL), one popular approach has been to use massively-parallelized, domain-randomized simulations to learn a policy offline over a vast array of contact conditions, allowing robust sim-to-real transfer. Inspired by recent advances in real-time parallel simulation, this work considers instead the viability of online planning methods for contact-rich manipulation by studying the well-known in-hand cube reorientation task. We propose a simple architecture that employs a sampling-based predictive controller and vision-based pose estimator to search for contact-rich control actions online. We conduct thorough experiments to assess the real-world performance of our method, architectural design choices, and key factors for robustness, demonstrating that our simple sampling-based approach achieves performance comparable to prior RL-based works. Supplemental material: https://caltech-amber.github.io/drop.
comment: Extended version. Submitted to ICRA 2025
Get It For Free: Radar Segmentation without Expert Labels and Its Application in Odometry and Localization
This paper presents a novel weakly supervised semantic segmentation method for radar segmentation, where the existing LiDAR semantic segmentation models are employed to generate semantic labels, which then serve as supervision signals for training a radar semantic segmentation model. The obtained radar semantic segmentation model outperforms LiDAR-based models, providing more consistent and robust segmentation under all-weather conditions, particularly in the snow, rain and fog. To mitigate potential errors in LiDAR semantic labels, we design a dedicated refinement scheme that corrects erroneous labels based on structural features and distribution patterns. The semantic information generated by our radar segmentation model is used in two downstream tasks, achieving significant performance improvements. In large-scale radar-based localization using OpenStreetMap, it leads to localization error reduction by 20.55% over prior methods. For the odometry task, it improves translation accuracy by 16.4% compared to the second-best method, securing the first place in the radar odometry competition at the Radar in Robotics workshop of ICRA 2024, Japan.
comment: After further review, I have identified a significant error in the paper that affects the validity of the results and conclusions presented. Unfortunately, this error cannot be adequately addressed through a simple revision, and I believe it is in the best interest of the academic community to withdraw the paper
Human-Robot Co-Transportation with Human Uncertainty-Aware MPC and Pose Optimization
This paper proposes a new control algorithm for human-robot co-transportation based on a robot manipulator equipped with a mobile base and a robotic arm. The primary focus is to adapt to human uncertainties through the robot's whole-body kinematics and pose optimization. We introduce an augmented Model Predictive Control (MPC) formulation that explicitly models human uncertainties and contains more variables than regular MPC in order to optimize the pose of the robotic arm. The core of our methodology involves a two-step iterative design: At each planning horizon, we select the best pose of the robotic arm (joint angle combination) from a candidate set, aiming to achieve the lowest estimated control cost. This selection is based on solving an uncertainty-aware Discrete Algebraic Riccati Equation (DARE), which also informs the optimal control inputs for both the mobile base and the robotic arm. To validate the effectiveness of the proposed approach, we provide theoretical derivation for the uncertainty-aware DARE and perform simulated and hardware experiments using a Fetch robot under varying conditions, including different trajectories and noise levels. The results reveal that our proposed approach outperforms baseline algorithms.
comment: 8 pages, 6 figures
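The idea of picking a pose from a candidate set by comparing Riccati-based cost estimates can be illustrated with a toy sketch. It uses the standard (not uncertainty-aware) scalar DARE, and it assumes, purely for illustration, that each candidate pose changes only the input gain `b` of a scalar system `x+ = a*x + b*u`. The names `solve_scalar_dare`, `select_pose`, and the pose labels are hypothetical, and this is not the paper's formulation.

```python
def solve_scalar_dare(a, b, q, r, iters=500):
    """Fixed-point iteration for the scalar DARE
    p = q + a^2 p - (a b p)^2 / (r + b^2 p)."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def select_pose(candidates, a, q, r, x0):
    """Return the candidate whose estimated LQR cost p * x0^2 is lowest.

    `candidates` maps a pose label to the input gain b it induces
    (an assumption made for this toy example).
    """
    costs = {
        pose: solve_scalar_dare(a, b, q, r) * x0 * x0
        for pose, b in candidates.items()
    }
    best = min(costs, key=costs.get)
    return best, costs


candidates = {"arm_extended": 0.5, "arm_tucked": 1.0}
best_pose, costs = select_pose(candidates, a=1.0, q=1.0, r=1.0, x0=1.0)
```

With `a = q = r = 1`, the DARE for `b = 1` reduces to `p^2 - p - 1 = 0`, so the cost is the golden ratio, which makes the toy case easy to check by hand.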
OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity
3D semantic occupancy prediction networks have demonstrated remarkable capabilities in reconstructing the geometric and semantic structure of 3D scenes, providing crucial information for robot navigation and autonomous driving systems. However, due to their large overhead from dense network structure designs, existing networks face challenges balancing accuracy and latency. In this paper, we introduce OccRWKV, an efficient semantic occupancy network inspired by Receptance Weighted Key Value (RWKV). OccRWKV separates semantics, occupancy prediction, and feature fusion into distinct branches, each incorporating Sem-RWKV and Geo-RWKV blocks. These blocks are designed to capture long-range dependencies, enabling the network to learn domain-specific representation (i.e., semantics and geometry), which enhances prediction accuracy. Leveraging the sparse nature of real-world 3D occupancy, we reduce computational overhead by projecting features into the bird's-eye view (BEV) space and propose a BEV-RWKV block for efficient feature enhancement and fusion. This enables real-time inference at 22.2 FPS without compromising performance. Experiments demonstrate that OccRWKV outperforms the state-of-the-art methods on the SemanticKITTI dataset, achieving a mIoU of 25.1 while being 20 times faster than the best baseline, Co-Occ, making it suitable for real-time deployment on robots to enhance autonomous navigation efficiency. Code and video are available on our project page: https://jmwang0117.github.io/OccRWKV/.
Approximate Sequential Optimization for Informative Path Planning
We consider the problem of finding an informative path through a graph, given initial and terminal nodes and a maximum path length. We assume that a linear, noise-corrupted measurement is taken at each node of an underlying unknown vector that we wish to estimate. The informativeness is measured by the reduction in uncertainty in our estimate, evaluated using several metrics. We present a convex relaxation for this informative path planning problem, which we can readily solve to obtain a bound on the possible performance. We develop an approximate sequential method where the path is constructed segment by segment through dynamic programming. This involves solving an orienteering problem, with the node reward acting as a surrogate for informativeness, taking the first step, and then repeating the process. The method scales to very large problem instances and achieves performance not too far from the bound produced by the convex relaxation. We also demonstrate our method's ability to handle adaptive objectives, multimodal sensing, and multi-agent variations of the informative path planning problem.
Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation
There is no limit to how much a robot might explore and learn, but all of that knowledge needs to be searchable and actionable. Within language research, retrieval augmented generation (RAG) has become the workhorse of large-scale non-parametric knowledge; however, existing techniques do not directly transfer to the embodied domain, where data is multimodal and highly correlated, and perception requires abstraction. To address these challenges, we introduce Embodied-RAG, a framework that enhances the foundational model of an embodied agent with a non-parametric memory system capable of autonomously constructing hierarchical knowledge for both navigation and language generation. Embodied-RAG handles a full range of spatial and semantic resolutions across diverse environments and query types, whether for a specific object or a holistic description of ambiance. At its core, Embodied-RAG's memory is structured as a semantic forest, storing language descriptions at varying levels of detail. This hierarchical organization allows the system to efficiently generate context-sensitive outputs across different robotic platforms. We demonstrate that Embodied-RAG effectively bridges RAG to the robotics domain, successfully handling over 200 explanation and navigation queries across 19 environments, highlighting its promise as a general-purpose non-parametric system for embodied agents.
comment: Web: https://quanting-xie.github.io/Embodied-RAG-web/
Reasoning about the Unseen for Efficient Outdoor Object Navigation
Robots should exist anywhere humans do: indoors, outdoors, and even unmapped environments. In contrast, recent advancements in Object Goal Navigation (OGN) have focused on navigating indoor environments by leveraging spatial and semantic cues that do not generalize outdoors. While these contributions provide valuable insights into indoor scenarios, the broader spectrum of real-world robotic applications often extends to outdoor settings. As we transition to the vast and complex terrains of outdoor environments, new challenges emerge. Unlike the structured layouts found indoors, outdoor environments lack clear spatial delineations and are riddled with inherent semantic ambiguities. Despite this, humans navigate with ease because we can reason about the unseen. We introduce a new task OUTDOOR, a new mechanism for Large Language Models (LLMs) to accurately hallucinate possible futures, and a new computationally aware success metric for pushing research forward in this more complex domain. Additionally, we show impressive results on both a simulated drone and physical quadruped in outdoor environments. Our agent has no premapping and our formalism outperforms naive LLM-based approaches.
comment: 6 pages, 7 figures
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions ICRA 2025
We address the challenge of safe control in decentralized multi-agent robotic settings, where agents use uncertain black-box models to predict other agents' trajectories. We use the recently proposed conformal decision theory to adapt the restrictiveness of control barrier functions-based safety constraints based on observed prediction errors. We use these constraints to synthesize controllers that balance between the objectives of safety and task accomplishment, despite the prediction errors. We provide an upper bound on the average over time of the value of a monotonic function of the difference between the safety constraint based on the predicted trajectories and the constraint based on the ground truth ones. We validate our theory through experimental results showing the performance of our controllers when navigating a robot in the multi-agent scenes in the Stanford Drone Dataset.
comment: 6 pages, 1 figure, submitted for ICRA 2025
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Learning from Demonstration (LfD) is a useful paradigm for training policies that solve tasks involving complex motions, such as those encountered in robotic manipulation. In practice, the successful application of LfD requires overcoming error accumulation during policy execution, i.e. the problem of drift due to errors compounding over time and the consequent out-of-distribution behaviours. Existing works seek to address this problem through scaling data collection, correcting policy errors with a human-in-the-loop, temporally ensembling policy predictions or through learning a dynamical system model with convergence guarantees. In this work, we propose and validate an alternative approach to overcoming this issue. Inspired by reservoir computing, we develop a recurrent neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties for modelling temporal dynamics. We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting Dataset. Through empirical experiments we demonstrate that incorporating our layer into existing neural network architectures addresses the issue of compounding errors in LfD. Furthermore, we perform a comparative evaluation against existing approaches including a temporal ensemble of policy predictions and an Echo State Network (ESN) implementation. We find that our approach yields greater policy precision and robustness on the handwriting task while also generalising to multiple dynamics regimes and maintaining competitive latency scores.
comment: 21 pages, 9 figures
PROSPECT: Precision Robot Spectroscopy Exploration and Characterization Tool IROS 2024
Near Infrared (NIR) spectroscopy is widely used in industrial quality control and automation to test the purity and grade of items. In this research, we propose a novel sensorized end effector and acquisition strategy to capture spectral signatures from objects and register them with a 3D point cloud. Our methodology first takes a 3D scan of an object generated by a time-of-flight depth camera and decomposes the object into a series of planned viewpoints covering the surface. We generate motion plans for a robot manipulator and end-effector to visit these viewpoints while maintaining a fixed distance and surface normal. This process is enabled by the spherical motion of the end-effector and ensures maximal spectral signal quality. By continuously acquiring surface reflectance values as the end-effector scans the target object, the autonomous system develops a four-dimensional model of the target object: position in an $R^3$ coordinate frame, and a reflectance vector denoting the associated spectral signature. We demonstrate this system in building spectral-spatial object profiles of increasingly complex geometries. We show the proposed system and spectral acquisition planning produce more consistent spectral signals than naive point scanning strategies. Our work represents a significant step towards high-resolution spectral-spatial sensor fusion for automated quality assessment.
comment: Presented at IROS 2024
Multiagent Systems
Multi-Agent Obstacle Avoidance using Velocity Obstacles and Control Barrier Functions
Velocity Obstacles (VO) methods form a paradigm for collision avoidance strategies among moving obstacles and agents. While VO methods perform well in simple multi-agent environments, they don't guarantee safety and can show overly conservative behavior in common situations. In this paper, we propose to combine a VO-strategy for guidance with a CBF-approach for safety, which overcomes the overly conservative behavior of VOs and formally guarantees safety. We validate our method in a baseline comparison study, using 2nd order integrator and car-like dynamics. Results support that our method outperforms the baselines w.r.t. path smoothness, collision avoidance, and success rates.
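The pairing of a guidance velocity with a CBF safety layer can be sketched for the simplest case. The example below assumes a 2D single integrator (`x' = u`), one circular obstacle, and the barrier `h(x) = ||x - c||^2 - r^2`; with a single constraint, the usual CBF quadratic program has a closed-form projection. The function name `cbf_filter` and the idea that `u_nom` comes from a VO-style planner are illustrative assumptions; the paper itself uses 2nd order integrator and car-like dynamics.

```python
def cbf_filter(x, u_nom, center, radius, alpha=1.0):
    """Minimally modify u_nom so that dh/dt + alpha * h >= 0.

    h(x) = ||x - center||^2 - radius^2 for a single integrator x' = u,
    so the constraint is linear in u: a . u + b >= 0.
    """
    dx = (x[0] - center[0], x[1] - center[1])
    h = dx[0] ** 2 + dx[1] ** 2 - radius ** 2
    a = (2.0 * dx[0], 2.0 * dx[1])          # gradient of h
    b = alpha * h
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] + b
    aa = a[0] ** 2 + a[1] ** 2
    if slack >= 0.0 or aa == 0.0:
        return u_nom                         # nominal command is already safe
    scale = -slack / aa                      # project onto the constraint boundary
    return (u_nom[0] + scale * a[0], u_nom[1] + scale * a[1])


# Heading straight at an obstacle: the filter slows the approach.
u_safe = cbf_filter(x=(0.0, 0.0), u_nom=(2.0, 0.0), center=(2.0, 0.0), radius=1.0)
# Moving tangentially: the nominal command passes through unchanged.
u_free = cbf_filter(x=(0.0, 0.0), u_nom=(0.0, 1.0), center=(2.0, 0.0), radius=1.0)
```

Because the filter only intervenes when the constraint is violated, the guidance layer keeps control of the motion whenever its command is already safe.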
Incentive-Compatible Vertiport Reservation in Advanced Air Mobility: An Auction-Based Approach
Advanced air mobility (AAM) is expected to become a multibillion-dollar industry in the near future. Market-based mechanisms are touted to be an integral part of AAM operations, which comprise heterogeneous operators with private valuations. In this work, we study the problem of designing a mechanism to coordinate the movement of electric vertical take-off and landing (eVTOL) aircraft, operated by multiple operators each having heterogeneous valuations associated with their fleet, between vertiports, while enforcing the arrival, departure, and parking constraints at vertiports. Particularly, we propose an incentive-compatible and individually rational vertiport reservation mechanism that maximizes a social welfare metric, which encapsulates the objective of maximizing the overall valuations of all operators while minimizing the congestion at vertiports. Additionally, we improve the computational tractability of designing the reservation mechanism by proposing a mixed binary linear programming approach that leverages the network flow structure.
comment: 23 pages, 2 figures, 2 tables
Iteration of Thought: Leveraging Inner Dialogue for Autonomous Large Language Model Reasoning
Iterative human engagement is a common and effective means of leveraging the advanced language processing power of large language models (LLMs). Using well-structured prompts in a conversational manner, human users can effectively influence an LLM to develop more thoughtful and accurate responses. Motivated by this insight, we propose the Iteration of Thought (IoT) framework for enhancing LLM responses by generating "thought"-provoking prompts vis-à-vis an input query and the current iteration of an LLM's response. Unlike static or semi-static approaches, e.g. Chain of Thought (CoT) or Tree of Thoughts (ToT), IoT adapts its reasoning path dynamically, based on evolving context, and without generating alternate explorative thoughts which are ultimately discarded. The three components of the IoT framework are (1) an Inner Dialogue Agent (IDA) responsible for generating instructive, context-specific prompts; (2) an LLM Agent (LLMA) that processes these prompts to refine its responses; and (3) an iterative prompting loop that implements a conversation between the former two components. We introduce two variants of our framework: Autonomous Iteration of Thought (AIoT), where an LLM decides when to stop iterating, and Guided Iteration of Thought (GIoT), which always forces a fixed number of iterations. We investigate the performance of IoT across various datasets, spanning complex reasoning tasks from the GPQA dataset, explorative problem-solving in Game of 24, puzzle solving in Mini Crosswords, and multi-hop question answering from the HotpotQA dataset. Our results show that IoT represents a viable paradigm for autonomous response refinement in LLMs, showcasing significant improvements over CoT and thereby enabling more adaptive and efficient reasoning systems that minimize human intervention.
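The three components listed in the abstract suggest a simple control flow, sketched below with stub callables in place of real models. The function signatures (`ida(query, response)`, `llma(query, prompt)`, `should_stop(query, response)`) are assumptions made for illustration and are not taken from the paper; passing a stop predicate loosely mirrors the autonomous variant, while omitting it mirrors the guided variant with a fixed iteration count.

```python
def iteration_of_thought(query, ida, llma, max_iters, should_stop=None):
    """Alternate between a prompt generator (ida) and a responder (llma).

    Returns the final response and the trace of all intermediate responses.
    """
    response = llma(query, None)             # initial answer, no guidance yet
    trace = [response]
    for _ in range(max_iters):
        if should_stop is not None and should_stop(query, response):
            break                            # autonomous variant: stop early
        prompt = ida(query, response)        # context-specific guiding prompt
        response = llma(query, prompt)       # refined answer
        trace.append(response)
    return response, trace


# Deterministic stand-ins for the two agents (not real LLM calls).
def toy_ida(query, response):
    return f"revise draft of length {len(response)}"


def toy_llma(query, prompt):
    return query if prompt is None else f"{query} | {prompt}"


guided_answer, guided_trace = iteration_of_thought("2+2?", toy_ida, toy_llma, max_iters=3)
auto_answer, auto_trace = iteration_of_thought(
    "2+2?", toy_ida, toy_llma, max_iters=3,
    should_stop=lambda q, r: "|" in r,
)
```

Keeping both variants behind one loop makes the difference between them a single argument, which is convenient when comparing stopping rules.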
Short vs. Long-term Coordination of Drones: When Distributed Optimization Meets Deep Reinforcement Learning
Swarms of autonomous interactive drones can provide compelling sensing capabilities in Smart City applications, such as traffic monitoring. This paper focuses on the task assignment problem for large-scale spatio-temporal sensing by a drone swarm. However, existing approaches have distinct challenges: distributed evolutionary optimization, such as collective learning, lacks long-term adaptability in dynamic environments, while deep reinforcement learning (DRL) struggles to scale effectively due to the curse of dimensionality. Therefore, this paper proposes a novel synergetic optimization approach by integrating long-term DRL and short-term collective learning. Through this approach, each drone independently and proactively determines its flying direction and recharging location using DRL, while evolving its navigation and sensing policies through collective learning based on a structured tree communication model. Extensive experiments with datasets generated from realistic urban mobility demonstrate an outstanding performance of the proposed solution in complex scenarios. New insights show that this approach provides a win-win synthesis of short-term and long-term strategies for drone-based traffic monitoring, with short-term methods addressing training complexity and energy management, and long-term methods preserving high sensing performance.
Kinodynamic Motion Planning for a Team of Multirotors Transporting a Cable-Suspended Payload in Cluttered Environments IROS
We propose a motion planner for cable-driven payload transportation using multiple unmanned aerial vehicles (UAVs) in an environment cluttered with obstacles. Our planner is kinodynamic, i.e., it considers the full dynamics model of the transporting system including actuation constraints. Due to the high dimensionality of the planning problem, we use a hierarchical approach where we first solve the geometric motion planning using a sampling-based method with a novel sampler, followed by constrained trajectory optimization that considers the full dynamics of the system. Both planning stages consider inter-robot and robot/obstacle collisions. We demonstrate in a software-in-the-loop simulation and real flight experiments that there is a significant benefit in kinodynamic motion planning for such payload transport systems with respect to payload tracking error and energy consumption compared to the standard methods of planning for the payload alone. Notably, we observe a significantly higher success rate in scenarios where team formation changes are needed to move through tight spaces.
comment: Accepted by IROS, 2024
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions ICRA 2025
We address the challenge of safe control in decentralized multi-agent robotic settings, where agents use uncertain black-box models to predict other agents' trajectories. We use the recently proposed conformal decision theory to adapt the restrictiveness of control barrier functions-based safety constraints based on observed prediction errors. We use these constraints to synthesize controllers that balance between the objectives of safety and task accomplishment, despite the prediction errors. We provide an upper bound on the average over time of the value of a monotonic function of the difference between the safety constraint based on the predicted trajectories and the constraint based on the ground truth ones. We validate our theory through experimental results showing the performance of our controllers when navigating a robot in the multi-agent scenes in the Stanford Drone Dataset.
comment: 6 pages, 1 figure, submitted for ICRA 2025
Systems and Control (CS)
Sparse Actuation for LPV Systems with Full-State Feedback in $\mathcal{H}_2/\mathcal{H}_\infty$ Framework
This paper addresses the sparse actuation problem for nonlinear systems represented in the Linear Parameter-Varying (LPV) form. We propose a convex optimization framework that concurrently determines actuator magnitude limits and the state-feedback law that guarantees a user-specified closed-loop performance in the $\mathcal{H}_2/\mathcal{H}_\infty$ sense. We also demonstrate that sparse actuation is achieved when the actuator magnitude-limits are minimized in the $l_1$ sense. This is the first paper that addresses this problem for LPV systems. The formulation is demonstrated in a vibration control problem for a flexible wing.
comment: Submitted to American Control Conference 2025
Generative AI Application for Building Industry
This paper investigates the transformative potential of generative AI technologies, particularly large language models (LLMs), within the building industry. By leveraging these advanced AI tools, the study explores their application across key areas such as energy code compliance, building design optimization, and workforce training. The research highlights how LLMs can automate labor-intensive processes, significantly improving efficiency, accuracy, and safety in building practices. The paper also addresses the challenges associated with interpreting complex visual and textual data in architectural plans and regulatory codes, proposing innovative solutions to enhance AI-driven compliance checking and design processes. Additionally, the study considers the broader implications of AI integration, including the development of AI-powered tools for comprehensive code compliance across various regulatory domains and the potential for AI to revolutionize workforce training through realistic simulations. This paper provides a comprehensive analysis of the current capabilities of generative AI in the building industry while outlining future directions for research and development, aiming to pave the way for smarter, more sustainable, and responsive construction practices.
comment: 28 pages, 11 figures, 4 tables
Development of a Platform to Enable Real Time, Non-disruptive Testing and Early Fault Detection of Critical High Voltage Transformers and Switchgears in High Speed-rail
Partial discharge (PD) incidents can occur in critical components of high-speed rail electric systems, such as transformers and switchgears, due to localized insulation defects that cannot withstand electric stress, leading to potential flashovers. These incidents can escalate over time, resulting in breakdowns, downtime, and safety risks. Fortunately, PD activities emit radio frequency (RF) signals, allowing for the development of a hardware platform for real-time, non-invasive PD detection and monitoring. The system uses an RF antenna and high-speed data acquisition to scan signals across a configurable frequency range (100MHz to 3GHz), utilizing intermediate frequency modulation and sliding frequency windows for detailed analysis. When signals exceed a threshold, the system records the events, capturing both raw signal data and spectrum snapshots. Real-time data is streamed to a cloud server, offering remote access through a dedicated smartphone application, enabling maintenance teams to monitor and respond promptly. Laboratory testing has confirmed the system's ability to accurately capture RF signals and provide real-time PD monitoring, enhancing the reliability and safety of high-speed rail infrastructure.
Uncertainty Modelling and Robust Observer Synthesis using the Koopman Operator
This paper proposes a robust nonlinear observer synthesis method for a population of systems modelled using the Koopman operator. The Koopman operator allows nonlinear systems to be rewritten as infinite-dimensional linear systems. A finite-dimensional approximation of the Koopman operator can be identified directly from data, yielding an approximately linear model of a nonlinear system. The proposed observer synthesis method is made possible by this linearity that in turn allows uncertainty within a population of Koopman models to be quantified in the frequency domain. Using this uncertainty model, linear robust control techniques are used to synthesize robust nonlinear Koopman observers. A population of several dozen motor drives is used to experimentally demonstrate the proposed method. Manufacturing variation is characterized in the frequency domain, and a robust Koopman observer is synthesized using mixed $\mathcal{H}_2$-$\mathcal{H}_\infty$ optimal control.
comment: 16 pages, 15 figures
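The statement that a finite-dimensional Koopman approximation can be identified directly from data can be illustrated with a least-squares fit over lifted snapshots. The toy system, its parameters, and the dictionary of observables `[x1, x2, x1^2]` below are assumptions chosen so that the lifted dynamics are exactly linear; they are unrelated to the motor-drive experiments in the paper.

```python
import numpy as np

A, B, C = 0.9, 0.5, 1.0  # toy system parameters (assumed for illustration)


def step(x):
    """Nonlinear toy dynamics: x1+ = A*x1, x2+ = B*x2 + C*x1^2."""
    return np.array([A * x[0], B * x[1] + C * x[0] ** 2])


def lift(x):
    """Hypothetical dictionary of observables: [x1, x2, x1^2]."""
    return np.array([x[0], x[1], x[0] ** 2])


def fit_koopman(states):
    """Least-squares fit of K such that lift(x[k+1]) ~ K @ lift(x[k])."""
    psi = np.stack([lift(x) for x in states[:-1]], axis=1)       # (3, N-1)
    psi_next = np.stack([lift(x) for x in states[1:]], axis=1)   # (3, N-1)
    return psi_next @ np.linalg.pinv(psi)


# Simulate one trajectory and identify the lifted linear model from it.
states = [np.array([1.0, 1.0])]
for _ in range(30):
    states.append(step(states[-1]))
K = fit_koopman(states)
```

For this system the fitted matrix should match `[[A, 0, 0], [0, B, C], [0, 0, A**2]]`, since each observable's next value is a linear combination of the current observables. In general the dictionary is not closed under the dynamics and the fit is only approximate.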
A Unified Approach for Optimal Cruise Airspeed with Variable Cost Index for Fuel-powered and All-electric Aircraft
This paper proposes for the first time a unified optimal approach to solve a direct operating cost (DOC) minimization problem where the cost index (CI) is time-varying. More specifically, the coefficient CI is modeled as a time-varying parameter commanded by Air Traffic Control (ATC). The proposed unified approach relies on the solution of an optimal control problem both for fuel-powered and all-electric aircraft. Furthermore, this paper demonstrates how a variable CI affects the solution of the optimization problem as it presents the equations that allow the computation of optimal constant cruise airspeed and flight time in response to step changes in the CI value. The proposed methodology is validated by a simulated flight scenario. In this scenario the inputs from the ATC are received during flight and the aircraft is required to adjust its optimal airspeed, flight time, and total energy consumption to comply with the operational restrictions imposed by the ATC. The optimal values of airspeed, flight time and energy consumption are computed for both a fuel-powered and an all-electric aircraft, thus enabling applications of the proposed approach to future air mobility all-electric vehicles.
comment: 9 pages, 9 figures
Safe Autonomy for Uncrewed Surface Vehicles Using Adaptive Control and Reachability Analysis
Marine robots must maintain precise control and ensure safety during tasks like ocean monitoring, even when encountering unpredictable disturbances that affect performance. Designing algorithms for uncrewed surface vehicles (USVs) requires accounting for these disturbances to control the vehicle and ensure it avoids obstacles. While adaptive control has addressed USV control challenges, real-world applications are limited, and certifying USV safety amidst unexpected disturbances remains difficult. To tackle control issues, we employ a model reference adaptive controller (MRAC) to stabilize the USV along a desired trajectory. For safety certification, we developed a reachability module with a moving horizon estimator (MHE) to estimate disturbances affecting the USV. This estimate is propagated through a forward reachable set calculation, predicting future states and enabling real-time safety certification. We tested our safe autonomy pipeline on a Clearpath Heron USV in the Charles River, near MIT. Our experiments demonstrated that the USV's MRAC controller and reachability module could adapt to disturbances like thruster failures and drag forces. The MRAC controller outperformed a PID baseline, showing a 45%-81% reduction in RMSE position error. Additionally, the reachability module provided real-time safety certification, ensuring the USV's safety. We further validated our pipeline's effectiveness in underway replenishment and canal scenarios, simulating relevant marine tasks.
comment: 35 pages, 23 figures, 6 tables
Learning Chaotic Dynamics with Embedded Dissipativity
Chaotic dynamics, commonly seen in weather systems and fluid turbulence, are characterized by their sensitivity to initial conditions, which makes accurate prediction challenging. Despite their sensitivity to initial perturbations, many chaotic systems exhibit dissipative behaviors and ergodicity. Therefore, various approaches have recently been proposed to develop data-driven models preserving invariant statistics over long horizons. Although these methods have shown empirical success in reducing instances of unbounded trajectory generation, many of the models are still prone to generating unbounded trajectories, leading to invalid statistics evaluation. In this paper, we propose a novel neural network architecture that simultaneously learns a dissipative dynamics emulator that is guaranteed to generate bounded trajectories and an energy-like function that governs the dissipative behavior. More specifically, by leveraging control-theoretic ideas, we derive algebraic conditions based on the learned energy-like function that ensure asymptotic convergence to an invariant level set. Using these algebraic conditions, our proposed model enforces dissipativity through a ReLU projection layer, which provides formal trajectory boundedness guarantees. Furthermore, the invariant level set provides an outer estimate for the strange attractor, which is known to be very difficult to characterize due to its complex geometry. We demonstrate the capability of our model in producing bounded long-horizon trajectory forecasts and characterizing the attractor for chaotic dynamical systems including Lorenz 96 and a truncated Kuramoto-Sivashinsky equation.
Outage-Constrained Sum Secrecy Rate Maximization for STAR-RIS with Energy-Harvesting Eavesdroppers
This article proposes a novel strategy for enhancing secure wireless communication through the use of a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) in a multiple-input single-output system. In the presence of energy-harvesting eavesdroppers, the study aims to maximize the secrecy rate while adhering to strict energy harvesting constraints. By dynamically manipulating the wireless environment with the STAR-RIS, the research examines the balance between harvested energy and secrecy rate under two key protocols: energy splitting and mode selection. The study addresses both imperfect and perfect channel state information (CSI) and formulates a complex non-convex optimization problem, which is solved using a penalty concave convex procedure combined with an alternating optimization algorithm. The method optimizes beamforming and STAR-RIS transmission and reflection coefficients to achieve an optimal balance between secure communication and energy harvesting constraints. Numerical simulations show that the proposed approach is effective, even with imperfect CSI, and outperforms conventional RIS methods in terms of robust security and energy performance.
comment: 8 pages, 6 figures
Improved Sample Complexity of Imitation Learning for Barrier Model Predictive Control
Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller. However, constructing such smoothed expert controllers for arbitrary systems remains challenging, especially in the presence of input and state constraints. As our primary contribution, we show how such a smoothed expert can be designed for a general class of systems using a log-barrier-based relaxation of a standard Model Predictive Control (MPC) optimization problem. Improving upon our previous work, we show that barrier MPC achieves theoretically optimal error-to-smoothness tradeoff along some direction. At the core of this theoretical guarantee on smoothness is an improved lower bound we prove on the optimality gap of the analytic center associated with a convex Lipschitz function, which we believe could be of independent interest. We validate our theoretical findings via experiments, demonstrating the merits of our smoothing approach over randomized smoothing.
comment: 36 pages, 3 figures. This work extends our previous result in arXiv:2306.01914, which has been accepted for publication in CDC 2024. An earlier version of this manuscript was submitted as part of DP's Master's thesis
Fast and Reliable $N-k$ Contingency Screening with Input-Convex Neural Networks
Power system operators must ensure that dispatch decisions remain feasible in case of grid outages or contingencies to prevent cascading failures and ensure reliable operation. However, checking the feasibility of all $N - k$ contingencies -- every possible simultaneous failure of $k$ grid components -- is computationally intractable for even small $k$, requiring system operators to resort to heuristic screening methods. Because of the increase in uncertainty and changes in system behaviors, heuristic lists might not include all relevant contingencies, generating false negatives in which unsafe scenarios are misclassified as safe. In this work, we propose to use input-convex neural networks (ICNNs) for contingency screening. We show that ICNN reliability can be determined by solving a convex optimization problem, and by scaling model weights using this problem as a differentiable optimization layer during training, we can learn an ICNN classifier that is both data-driven and has provably guaranteed reliability. Namely, our method can ensure a zero false negative rate. We empirically validate this methodology in a case study on the IEEE 39-bus test network, observing that it yields substantial (10-20x) speedups while having excellent classification accuracy.
comment: 11 pages, 4 figures
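As a rough illustration of what makes a network input-convex, the sketch below builds a one-hidden-layer ICNN in plain Python. It shows the generic construction only (ReLU of an affine map, combined with non-negative weights plus an affine skip term), not the paper's classifier, its weight scaling, or its reliability certificate. The names `icnn_forward` and `make_params`, the layer sizes, and the random weights are assumptions.

```python
import random


def relu(v):
    return v if v > 0.0 else 0.0


def icnn_forward(x, W0, b0, wz, wx, b):
    """Scalar output that is convex in x.

    hidden_j = relu(W0[j] . x + b0[j])                 (convex in x)
    output   = sum_j max(wz[j], 0) * hidden_j + wx . x + b
    Clamping wz at zero keeps the combination of convex terms convex.
    """
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + bj)
              for row, bj in zip(W0, b0)]
    convex_part = sum(max(w, 0.0) * h for w, h in zip(wz, hidden))
    affine_part = sum(w * xi for w, xi in zip(wx, x)) + b
    return convex_part + affine_part


def make_params(n_in, n_hidden, seed=0):
    """Random illustrative parameters; a real model would be trained."""
    rng = random.Random(seed)
    W0 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b0 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    wz = [rng.uniform(-1, 1) for _ in range(n_hidden)]  # clamped in the forward pass
    wx = [rng.uniform(-1, 1) for _ in range(n_in)]
    b = rng.uniform(-1, 1)
    return W0, b0, wz, wx, b


params = make_params(n_in=3, n_hidden=8, seed=1)
score = icnn_forward([0.2, -0.4, 1.0], *params)
```

Because the output is convex in the input, questions such as "what is the worst-case score over a convex region" become convex optimization problems, which is the property the abstract relies on.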
Koopman Spectral Analysis from Noisy Measurements based on Bayesian Learning and Kalman Smoothing
Koopman spectral analysis plays a crucial role in understanding and modeling nonlinear dynamical systems as it reveals key system behaviors and long-term dynamics. However, the presence of measurement noise poses a significant challenge to accurately extracting spectral properties. In this work, we propose a robust method for identifying the Koopman operator and extracting its spectral characteristics in noisy environments. To address the impact of noise, our approach tackles an identification problem that accounts for both systematic errors from finite-dimensional approximations and measurement noise in the data. By incorporating Bayesian learning and Kalman smoothing, the method simultaneously identifies the Koopman operator and estimates system states, effectively decoupling these two error sources. The method's efficiency and robustness are demonstrated through extensive experiments, showcasing its accuracy across varying noise levels.
Ultra-low-crosstalk Silicon Switches Driven Thermally and Electrically
Silicon photonic switches are widely considered as a cost-effective solution for addressing the ever-growing data traffic in datacenter networks, as they offer unique advantages such as low power consumption, low latency, small footprint and high bandwidth. Despite extensive research efforts, crosstalk in large-scale photonic circuits still poses a threat to the signal integrity. In this paper, we present two designs of silicon Mach-Zehnder Interferometer (MZI) switches achieving ultra-low-crosstalk, driven thermally and electrically. Each switch fabric is optimized at both the device and circuit level to suppress crosstalk and reduce system complexity. Notably, for the first time to the best of our knowledge, we harness the inherent self-heating effect in a carrier-injection-based MZI switch to create a pair of phase shifters that offer arbitrary phase differences. Such a pair of phase shifters induces matched insertion loss at each arm, thus minimizing crosstalk. Experimentally, an ultra-low crosstalk ratio below -40 dB is demonstrated for both thermo-optic (T-O) and electro-optic (E-O) switches. The T-O switch exhibits an on-chip loss of less than 5 dB with a switching time of 500 microseconds, whereas the E-O switch achieves an on-chip loss as low as 8.5 dB with a switching time of under 100 ns. In addition, data transmission of a 50 Gb/s on-off keying signal is demonstrated with high fidelity on the E-O switch, showing the great potential of the proposed switch designs.
comment: 12 pages, 5 figures
Optimized Excitation Signal Design Employing Receding Horizon Control
A novel excitation signal design strategy based on a receding horizon control inspired optimization is presented. The proposed method is shown to effectively generate space-filling designs within the input space of a nonlinear dynamic process, thereby enabling sophisticated acquisition of information in previously unexplored operational areas. Additionally, the strategy can intensify the exploitation of specific operational areas during information gathering, offering flexibility in meeting application-specific requirements.
comment: Will be published in 34th Workshop Computational Intelligence, Berlin (2024)
Optimized Excitation Signal Tailored to Pertinent Dynamic Process Characteristics
The effectiveness of data-driven techniques significantly relies on the input signal used to generate the training data. Nevertheless, there is a notable gap in research when it comes to designing excitation signals for identifying nonlinear dynamic systems, likely because of the challenges involved. Based on current knowledge, it is crucial for excitation signals to effectively capture the nonlinearity across the entire operational area and to gather insights into the area-specific dynamic process characteristics. The Incremental Dynamic Space-Filling Design (IDS-FID) strategy designs excitation signals to achieve a space-filling distribution across the input space of a nonlinear approximator used in external dynamics modeling, gathering information throughout its operational area. Simultaneously, the approach allows a heightened focus on either the system's steady-state or transient responses during information acquisition by altering the excitation signal's dynamics, facilitating targeted insights into dynamic process characteristics.
comment: Will be published in 4th MECC (2024)
Absolute centrality in a signed Friedkin-Johnsen based model: a graphical characterisation of influence
This paper studies the evolution of opinions governed by a Friedkin-Johnsen (FJ) based model in arbitrary network structures with signed interactions. The agents contributing to the opinion formation are characterised as being influential. Initially, the agents are classified as opinion leaders and followers based on network connectivity and the nature of interactions. However, the addition of stubbornness leads to interesting behaviours wherein a non-influential agent can now become influential and vice versa. Thereafter, a signal flow graph (SFG) based method is proposed to quantify the influence of an influential agent's opinions. Additionally, it helps illustrate the role played by network topology in shaping the final opinions of the agents. Based on this analysis, the absolute centrality measure is proposed to determine the overall influence of all the agents in the network. Unlike most of the existing measures, it is applicable to any network structure and considers the effect of stubbornness and antagonism. Examples are presented throughout the paper to illustrate and validate these results.
comment: 13 pages
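For readers unfamiliar with the Friedkin-Johnsen update, the following sketch iterates one common form of it on a small signed network. The convention used here (stubbornness `lam[i]` weighting the initial opinion, rows of `W` with absolute sum one, negative entries for antagonistic links) is an assumption and may differ from the paper's model; the three-agent example is invented for illustration.

```python
def fj_step(x, x0, W, lam):
    """One FJ update: x_i <- (1 - lam_i) * sum_j W[i][j] * x_j + lam_i * x0_i."""
    n = len(x)
    return [(1.0 - lam[i]) * sum(W[i][j] * x[j] for j in range(n)) + lam[i] * x0[i]
            for i in range(n)]


def fj_final_opinions(x0, W, lam, steps=500):
    """Iterate the FJ update from the initial opinions x0."""
    x = list(x0)
    for _ in range(steps):
        x = fj_step(x, x0, W, lam)
    return x


# Hypothetical signed network: negative weights model antagonistic interactions.
W = [[0.0, 0.5, -0.5],
     [0.6, 0.0, 0.4],
     [-0.3, 0.7, 0.0]]
x0 = [1.0, -1.0, 0.5]      # initial opinions
lam = [0.5, 0.2, 1.0]      # agent 3 is fully stubborn
x_final = fj_final_opinions(x0, W, lam)
```

In this example the fully stubborn agent never moves from its initial opinion, so its opinion feeds into the others' final values while theirs do not feed back into it.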
MERIT: Multimodal Wearable Vital Sign Waveform Monitoring
Cardiovascular disease (CVD) is the leading cause of death and premature mortality worldwide, with occupational environments significantly influencing CVD risk, underscoring the need for effective cardiac monitoring and early warning systems. Existing methods of monitoring vital signs require subjects to remain stationary, which is impractical for daily monitoring as individuals are often in motion. To address this limitation, we propose MERIT, a multimodality-based wearable system designed for precise ECG waveform monitoring without movement restrictions. Daily activities, involving frequent arm movements, can significantly affect sensor data and complicate the reconstruction of accurate ECG signals. To mitigate motion impact and enhance ECG signal reconstruction, we introduce a deep independent component analysis (Deep-ICA) module and a multimodal fusion module. We conducted experiments with 15 subjects. Our results, compared with commercial wearable devices and existing methods, demonstrate that MERIT accurately reconstructs ECG waveforms during various office activities, offering a reliable solution for fine-grained cardiac monitoring in dynamic environments.
comment: 9 pages, 10 figures
AARK: An Open Toolkit for Autonomous Racing Research
Autonomous racing demands safe control of vehicles at their physical limits for extended periods of time, providing insights into advanced vehicle safety systems which increasingly rely on intervention provided by vehicle autonomy. Participation in this field carries with it a high barrier to entry. Physical platforms and their associated sensor suites require large capital outlays before any demonstrable progress can be made. Simulators allow researchers to develop soft autonomous systems without purchasing a platform. However, currently available simulators lack visual and dynamic fidelity, can still be expensive to buy, lack customisation, and are difficult to use. AARK provides three packages, ACI, ACDG, and ACMPC. These packages enable research into autonomous control systems in the demanding environment of racing to bring more people into the field and improve reproducibility: ACI provides researchers with a computer vision-friendly interface to Assetto Corsa for convenient comparison and evaluation of autonomous control solutions; ACDG enables generation of depth, normal and semantic segmentation data for training computer vision models to use in perception systems; and ACMPC gives newcomers to the field a modular full-stack autonomous control solution, capable of controlling vehicles, to build from. AARK aims to unify and democratise research into a field critical to providing safer roads and trusted autonomous systems.
comment: 7 pages, 5 figures
A Digital Twin Framework for Physical-Virtual Integration in V2X-Enabled Connected Vehicle Corridors
Transportation Cyber-Physical Systems (T-CPS) are critical in improving traffic safety, reliability, and sustainability by integrating computing, communication, and control in transportation systems. The connected vehicle corridor is at the forefront of this transformation, where Cellular Vehicle-to-Everything (C-V2X) technology facilitates real-time data exchange between infrastructure, vehicles, and road users. However, challenges remain in processing and synchronizing the vast V2X data from vehicles and roadside units, particularly when ensuring scalability, data integrity, and operational resilience. This paper presents a digital twin framework for T-CPS, developed from a real-world connected vehicle corridor to address these challenges. By leveraging C-V2X technology and real-time data from infrastructure, vehicles, and road users, the digital twin accurately replicates vehicle behaviors, signal phases, and traffic patterns within the CARLA simulation environment. This framework demonstrates high fidelity between physical and digital systems and ensures robust synchronization of vehicle trajectories and signal phases through extensive experiments. Moreover, the digital twin's scalable and redundant architecture enhances data integrity, making it capable of supporting future large-scale C-V2X deployments. The digital twin is a vital tool in T-CPS, enabling real-time traffic monitoring, prediction, and optimization to enhance the reliability and safety of transportation systems.
Interleaved One-Shot SPS Performance under Smart DoS Attacks in C-V2X Networks
This paper evaluates the performance of the one-shot Semi-Persistent Scheduling (SPS) mechanism in Cellular Vehicle-to-Everything (C-V2X) networks under Denial-of-Service (DoS) smart attack scenarios. The study focuses on the impact of these attacks on key performance metrics, including Packet Delivery Ratio (PDR), Inter-Packet Gap (IPG), and Age of Information (AoI). Through extensive Monte Carlo simulations, we demonstrate that the one-shot mechanism significantly enhances network resilience by mitigating the adverse effects of smart DoS attacks. The findings reveal that while the one-shot mechanism improves the PDR and reduces the IPG and AoI tail values, its effectiveness diminishes slightly in high-density vehicular environments. Nevertheless, the one-shot mechanism proves to be a robust solution for maintaining the stability and reliability of C-V2X communications under adversarial conditions.
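The three metrics named above can be computed from a simple reception log. The sketch below uses generic definitions, assuming periodic transmissions and a fixed delivery delay; the paper's exact definitions and simulation setup may differ, and the function name `link_metrics` and the sample log are hypothetical.

```python
def link_metrics(received, period_ms, delay_ms=0):
    """Compute PDR, inter-packet gaps, and peak Age of Information.

    received[k] is True if the k-th periodic packet (generated at
    k * period_ms) was decoded; decoding happens delay_ms later.
    """
    rx_times = [k * period_ms + delay_ms for k, ok in enumerate(received) if ok]
    # Packet Delivery Ratio: decoded packets over transmitted packets
    pdr = sum(1 for ok in received if ok) / len(received)
    # Inter-Packet Gap: time between consecutive successful receptions
    ipg = [t1 - t0 for t0, t1 in zip(rx_times, rx_times[1:])]
    # Peak AoI: age just before each new reception, measured from the
    # generation time of the previously decoded packet
    peak_aoi = [gap + delay_ms for gap in ipg]
    return pdr, ipg, peak_aoi


# Hypothetical log: two consecutive losses, e.g. caused by a jammed resource.
pdr, ipg, peak_aoi = link_metrics([True, False, False, True, True],
                                  period_ms=100, delay_ms=10)
```

Consecutive losses barely move the PDR but stretch the tail of the IPG and AoI distributions, which is why tail values are reported alongside the delivery ratio.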
RRT-CBF Based Motion Planning
Control barrier functions (CBFs) have recently been widely explored to enforce safety-critical constraints on nonlinear systems. Many researchers have incorporated control barrier functions into path planning algorithms to find a safe path, but these methods involve high computational complexity or unidirectional randomness, which increases run-time. When safety constraints are satisfied, search efficiency and search space are sacrificed. This paper combines a novel motion planning approach based on the rapidly-exploring random trees (RRT) algorithm with model predictive control (MPC) to enforce the CBF with dynamically updated constraints. The resulting safety-critical trajectory keeps the robots from colliding with static and dynamic circular obstacles, as well as with other moving robots, while accounting for model uncertainty. In addition, this paper is the first to apply CBF-RRT to a robot arm model as a nonlinear system.
comment: 20 pages, 25 figures
Energetic Resilience of Linear Driftless Systems
When a malfunction causes a control system to lose authority over a subset of its actuators, achieving a task may require spending additional energy in order to compensate for the effect of uncontrolled inputs. To understand this increase in energy, we introduce energetic resilience metrics that quantify the maximal additional energy required to achieve finite-time regulation in linear driftless systems that lose authority over some of their actuators. Using a technical lemma based on the calculus of variations, we first derive optimal control signals and minimum energies to achieve this task in both the nominal and malfunctioning systems. We then obtain a bound on the worst-case energy used by the malfunctioning system, and its exact expression in the special case of loss of authority over one actuator. Further considering this special case, we derive bounds on additive and multiplicative metrics for energetic resilience. A simulation example on a model of an underwater robot demonstrates that these bounds are useful in quantifying the increased energy used by a system suffering a partial loss of control authority.
comment: 9 pages, 2 figures
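For the nominal (fault-free) case, the minimum-energy input has a well-known closed form, sketched below for a two-state driftless system x' = B u. It assumes energy is measured as the integral of ||u||^2 over [0, T] and that B B^T is invertible; the function name and the example matrices are hypothetical, and the malfunctioning case analysed in the paper is not reproduced here.

```python
def min_energy_regulation(B, x0, T):
    """Constant input steering x' = B u from x0 to the origin in time T.

    For a driftless system the minimum-energy input is constant:
        u* = -B^T (B B^T)^{-1} x0 / T,   energy = x0^T (B B^T)^{-1} x0 / T.
    B is a 2 x m nested list; x0 has two entries.
    """
    # Gram matrix G = B B^T (2 x 2)
    G = [[sum(bi * bj for bi, bj in zip(B[i], B[j])) for j in range(2)]
         for i in range(2)]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if abs(det) < 1e-12:
        raise ValueError("B B^T is singular: the system is not controllable")
    # y = G^{-1} x0 using the explicit 2 x 2 inverse
    y = [(G[1][1] * x0[0] - G[0][1] * x0[1]) / det,
         (-G[1][0] * x0[0] + G[0][0] * x0[1]) / det]
    m = len(B[0])
    u = [-(B[0][k] * y[0] + B[1][k] * y[1]) / T for k in range(m)]
    energy = (x0[0] * y[0] + x0[1] * y[1]) / T
    return u, energy


# Hypothetical example: three actuators acting on two states.
B = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
u_star, energy = min_energy_regulation(B, x0=[2.0, -1.0], T=2.0)
```

The energy scales as 1/T, so a longer horizon makes regulation cheaper. Losing an actuator changes the matrix B B^T and hence the energy, which is the kind of increase that resilience metrics are meant to quantify.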
Strategic information disclosure with communication constraints and private preferences
Social-media platforms are one of the most prevalent communication media today. In such systems, a large amount of content is generated and available to the platform. However, not all content can be transmitted to every possible user at all times. At the other end are the users, who have their own preferences about which content they enjoy, which are often unknown ex ante to the platform. We model the interaction between the platform and the users as a signaling game with asymmetric information, where each user optimizes its preference disclosure policy, and the platform optimizes its information disclosure policy. We provide structural as well as existence results for policies that constitute Bayesian Nash Equilibria, and necessary optimality conditions used to explicitly compute the optimal policies.
comment: Submitted to American Control Conference 2025
iWalker: Imperative Visual Planning for Walking Humanoid Robot
Humanoid robots, with the potential to perform a broad range of tasks in environments designed for humans, have been deemed crucial for the basis of general AI agents. When talking about planning and controlling, although traditional models and task-specific methods have been extensively studied over the past few decades, they are inadequate for achieving the flexibility and versatility needed for general autonomy. Learning approaches, especially reinforcement learning, are powerful and popular nowadays, but they are inherently "blind" during training, relying heavily on trials in simulation without proper guidance from physical principles or underlying dynamics. In response, we propose a novel end-to-end pipeline that seamlessly integrates perception, planning, and model-based control for humanoid robot walking. We refer to our method as iWalker, which is driven by imperative learning (IL), a self-supervising neuro-symbolic learning framework. This enables the robot to learn from arbitrary unlabeled data, significantly improving its adaptability and generalization capabilities. In experiments, iWalker demonstrates effectiveness in both simulated and real-world environments, representing a significant advancement toward versatile and autonomous humanoid robots.
Robust Multivariate Detection and Estimation with Fault Frequency Content Information
This paper studies the problem of fault detection and estimation (FDE) for linear time-invariant (LTI) systems with a particular focus on frequency content information of faults, possibly as multiple disjoint continuum ranges, and under both disturbances and stochastic noise. To ensure the worst-case fault sensitivity in the considered frequency ranges and mitigate the effects of disturbances and noise, an optimization framework incorporating a mixed $\mathcal{H}_-/\mathcal{H}_2$ performance index is developed to compute the optimal detection filter. Moreover, a thresholding rule is proposed to guarantee both the false alarm rate (FAR) and the fault detection rate (FDR). Next, shifting attention to fault estimation in specific frequency ranges, an exact reformulation of the optimal estimation filter design using the restricted $\mathcal{H}_\infty$ performance index is derived, which is inherently non-convex. However, focusing on finite frequency samples and fixed poles, a lower bound is established via a highly tractable quadratic programming (QP) problem. This lower bound together with an alternating optimization (AO) approach to the original estimation problem leads to a suboptimality gap for the overall estimation filter design. The effectiveness of the proposed approaches is validated through applications of a non-minimum phase hydraulic turbine system and a multi-area power system.
comment: 31 pages, 15 figures
Dissipativity-Based Distributed Droop-Free Controller and Communication Topology Co-Design for DC Microgrids
This paper presents a novel dissipativity-based distributed droop-free control approach for the voltage regulation problem in DC microgrids (MGs) comprised of an interconnected set of distributed generators (DGs), loads, and power lines. First, we describe the closed-loop DC MG as a networked system where the sets of DGs and lines (i.e., subsystems) are interconnected via a static interconnection matrix. This interconnection matrix demonstrates how the inputs and outputs of DGs and lines are connected with each other. Each DG has a local controller and a distributed global controller. To design the distributed global controllers, we use the dissipativity properties of the subsystems and formulate a linear matrix inequality (LMI) problem. To support the feasibility of this distributed global controller design, we identify a set of necessary local conditions, which we then enforce in a specifically developed LMI-based local controller design process. In contrast to existing DC MG control solutions that separate distributed controller and communication topology design problems, our approach proposes a unified framework for distributed controller and communication topology co-design. As the co-design process is LMI-based, it can be efficiently implemented and evaluated using existing software tools. The effectiveness of the proposed solution in terms of voltage regulation and current sharing is verified by simulating an islanded DC MG in a MATLAB/Simulink environment under different scenarios, such as load changes and topological constraint changes, and comparing its performance with the recent droop control approach.
Optimal Control on Positive Cones
An optimal control problem on finite-dimensional positive cones is stated. Under a critical assumption on the cone, the corresponding Bellman equation is satisfied by a linear function, which can be computed by convex optimization. A separate theorem relates the assumption on the cone to the existence of minimal elements in certain subsets of the dual cone. Three special cases are derived as examples. The first one, where the positive cone is the set of positive semi-definite matrices, reduces to standard linear quadratic control. The second one, where the positive cone is a polyhedron, reduces to a recent result on optimal control of positive systems. The third special case corresponds to linear quadratic control with additional structure, such as spatial invariance.
comment: 16 pages, to be published in the proceedings for the 2024 Conference on Decision and Control (CDC)
Modeling Fault Recovery and Transient Stability of Grid-Forming Converters Equipped With Current Reference Limitation
When grid-forming (GFM) inverter-based resources (IBRs) face severe grid disturbances (e.g., short-circuit faults), the current limitation mechanism may be triggered. Consequently, the GFM IBRs enter the current-saturation mode, inducing nonlinear dynamical behaviors and posing great challenges to the post-disturbance transient angle stability. This paper presents a systematic study to reveal the fault recovery behaviors of a GFM IBR and identify the risk of instability. A closed-form expression for the necessary condition that a GFM IBR returns from the current-saturation mode to the normal operation mode is presented. Based on these analyses, it is inferred that the angle of the magnitude-saturated current significantly affects the post-fault recovery and transient stability; with different angle selection, the system may follow multiple post-fault trajectories depending on those conditions: 1) Convergence to a normal stable equilibrium point (SEP), 2) convergence to a saturated stable equilibrium point (satSEP), or 3) divergence (instability). In this paper, the circumstances under which a GFM IBR cannot escape from the current-saturation mode are thoroughly investigated. The theoretical analyses are verified by dynamic simulations.
comment: 13 pages, 22 figures
Model Predictive Control for setpoint tracking
The main objective of tracking control is to steer the tracking error, that is the difference between the reference and the output, to zero while the plant's operation limits are satisfied. This requires that some assumptions on the evolution of the future values of the reference must be taken into account. Typically a simple evolution of the reference is considered, such as step, ramp, or parabolic reference signals. It is important to notice that the tracking problem considers possible variations in the reference to be tracked, such as steps or slope variations of the ramps. Then the tracking control problem is inherently uncertain, since the reference may differ from what is expected. If the value of the reference is changed, then there is no guarantee that the feasibility and stability properties of the resulting control law hold. This report presents the MPC for tracking (MPCT) approach, which ensures recursive feasibility and asymptotic stability of the setpoint when the value of the reference is changed.
Prediction-Free Coordinated Dispatch of Microgrid: A Data-Driven Online Optimization Approach
Traditional prediction-dependent dispatch methods can face challenges when renewable generation and price predictions are unreliable in microgrids. Instead, this paper proposes a novel prediction-free two-stage coordinated dispatch approach for microgrids. Empirical learning is conducted during the offline stage, where we calculate the offline optimal state of charge (SOC) sequences for generic energy storage under different historical scenarios. During the online stage, we synthesize a dynamically updated reference for SOC and a dynamic opportunity price (DOP) based on empirical learning and real-time observations. They provide a global vision for online operation and effectively address the myopic tendencies inherent to online decision-making. The real-time control action, generated by an online optimization algorithm, aims to minimize the operational costs while tracking the reference and considering the DOP. Additionally, we develop an adaptive virtual-queue-based online optimization algorithm based on the online convex optimization (OCO) framework. We provide theoretical proof that the proposed algorithm outperforms existing OCO algorithms and achieves a sublinear dynamic regret bound and a sublinear strict constraint violation bound. Simulation-based studies demonstrate that, compared with model predictive control-based methods, it reduces operational costs and voltage violation rate by 5% and 9%, respectively.
Quantifying the Safety of Trajectories using Peak-Minimizing Control
This work quantifies the safety of trajectories of a dynamical system by the perturbation intensity required to render a system unsafe (crash into the unsafe set). Computation of this measure of safety is posed as a peak-minimizing optimal control problem. Convergent lower bounds on the minimal peak value of controller effort are computed using polynomial optimization and the moment-Sum-of-Squares hierarchy. The crash-safety framework is extended towards data-driven safety analysis by measuring safety as the maximum amount of data corruption required to crash into the unsafe set.
comment: 19 pages, 9 figures, 3 tables
Incentive-Compatible Vertiport Reservation in Advanced Air Mobility: An Auction-Based Approach
Advanced air mobility (AAM) is expected to become a multibillion-dollar industry in the near future. Market-based mechanisms are touted to be an integral part of AAM operations, which comprise heterogeneous operators with private valuations. In this work, we study the problem of designing a mechanism to coordinate the movement of electric vertical take-off and landing (eVTOL) aircraft, operated by multiple operators each having heterogeneous valuations associated with their fleet, between vertiports, while enforcing the arrival, departure, and parking constraints at vertiports. In particular, we propose an incentive-compatible and individually rational vertiport reservation mechanism that maximizes a social welfare metric, which encapsulates the objective of maximizing the overall valuations of all operators while minimizing the congestion at vertiports. Additionally, we improve the computational tractability of designing the reservation mechanism by proposing a mixed binary linear programming approach that leverages the network flow structure.
comment: 23 pages, 2 figures, 2 tables
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions ICRA 2025
We address the challenge of safe control in decentralized multi-agent robotic settings, where agents use uncertain black-box models to predict other agents' trajectories. We use the recently proposed conformal decision theory to adapt the restrictiveness of control barrier functions-based safety constraints based on observed prediction errors. We use these constraints to synthesize controllers that balance between the objectives of safety and task accomplishment, despite the prediction errors. We provide an upper bound on the average over time of the value of a monotonic function of the difference between the safety constraint based on the predicted trajectories and the constraint based on the ground truth ones. We validate our theory through experimental results showing the performance of our controllers when navigating a robot in the multi-agent scenes in the Stanford Drone Dataset.
comment: 6 pages, 1 figure, submitted for ICRA 2025
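The idea of adapting a constraint's restrictiveness from observed prediction errors can be sketched with a simple online update. The code below is only in the spirit of conformal-style adaptation: the scalar `margin`, the binary loss, the sign convention, and the synthetic error sequence are all assumptions, and it does not reproduce the paper's control barrier function constraints or its bound.

```python
def adapt_margin(errors, epsilon=0.1, eta=0.05, margin0=0.0):
    """Online adaptation of a safety margin from observed prediction errors.

    At each step the loss is 1 if the observed error exceeded the current
    margin (a miss) and 0 otherwise. The margin grows after a miss and
    shrinks slowly otherwise, so the long-run miss rate tracks epsilon.
    """
    margin = margin0
    losses = []
    for err in errors:
        loss = 1.0 if err > margin else 0.0
        losses.append(loss)
        margin += eta * (loss - epsilon)
    return margin, losses


# Synthetic, deterministic prediction errors in [0, 1) for illustration.
errors = [((37 * k) % 100) / 100 for k in range(500)]
final_margin, losses = adapt_margin(errors)
average_loss = sum(losses) / len(losses)
```

Summing the update over time gives average loss = epsilon + (final margin - initial margin) / (eta * T), so as long as the margin stays bounded the average miss rate approaches the target `epsilon` regardless of how the errors are generated.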
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Learning from Demonstration (LfD) is a useful paradigm for training policies that solve tasks involving complex motions, such as those encountered in robotic manipulation. In practice, the successful application of LfD requires overcoming error accumulation during policy execution, i.e. the problem of drift due to errors compounding over time and the consequent out-of-distribution behaviours. Existing works seek to address this problem through scaling data collection, correcting policy errors with a human-in-the-loop, temporally ensembling policy predictions or through learning a dynamical system model with convergence guarantees. In this work, we propose and validate an alternative approach to overcoming this issue. Inspired by reservoir computing, we develop a recurrent neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties for modelling temporal dynamics. We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting Dataset. Through empirical experiments we demonstrate that incorporating our layer into existing neural network architectures addresses the issue of compounding errors in LfD. Furthermore, we perform a comparative evaluation against existing approaches including a temporal ensemble of policy predictions and an Echo State Network (ESN) implementation. We find that our approach yields greater policy precision and robustness on the handwriting task while also generalising to multiple dynamics regimes and maintaining competitive latency scores.
comment: 21 pages, 9 figures
Decentralized Optimization in Time-Varying Networks with Arbitrary Delays
We consider a decentralized optimization problem for networks affected by communication delays. Examples of such networks include collaborative machine learning, sensor networks, and multi-agent systems. To mimic communication delays, we add virtual non-computing nodes to the network, resulting in directed graphs. This motivates investigating decentralized optimization solutions on directed graphs. Existing solutions assume nodes know their out-degrees, resulting in limited applicability. To overcome this limitation, we introduce a novel gossip-based algorithm, called DT-GO, that does not need to know the out-degrees. The algorithm is applicable in general directed networks, for example networks with delays or limited acknowledgment capabilities. We derive convergence rates for both convex and non-convex objectives, showing that our algorithm achieves the same complexity order as centralized Stochastic Gradient Descent. In other words, the effects of the graph topology and delays are confined to higher-order terms. Additionally, we extend our analysis to accommodate time-varying network topologies. Numerical simulations are provided to support our theoretical findings.
comment: arXiv admin note: text overlap with arXiv:2401.11344
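The virtual-node construction mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' DT-GO algorithm; it only shows, under assumed conventions, how a link with an integer delay can be replaced by a chain of non-computing relay nodes. The function name `add_delay_nodes` and the `(src, dst, delay)` edge format are hypothetical.

```python
def add_delay_nodes(edges):
    """Expand delayed links into chains of virtual relay nodes.

    edges: list of (src, dst, delay) tuples, delay a non-negative integer
           (hypothetical input format, chosen for illustration).
    Returns a list of directed (u, v) edges of the augmented graph, in which
    a link with delay d becomes a path of d + 1 hops.
    """
    augmented = []
    for src, dst, delay in edges:
        prev = src
        for hop in range(delay):
            # A relay only forwards messages; it performs no computation.
            relay = ("relay", src, dst, hop)
            augmented.append((prev, relay))
            prev = relay
        augmented.append((prev, dst))
    return augmented


# Two nodes with an instantaneous link 0 -> 1 and a link 1 -> 0 delayed by 2 steps.
links = [(0, 1, 0), (1, 0, 2)]
graph = add_delay_nodes(links)
for edge in graph:
    print(edge)
```

Because the two directions of a link may carry different delays, the augmented graph is directed even when the original network is symmetric, which is why the abstract turns to methods for directed graphs.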
Systems and Control (EESS)
Sparse Actuation for LPV Systems with Full-State Feedback in $\mathcal{H}_2/\mathcal{H}_\infty$ Framework
This paper addresses the sparse actuation problem for nonlinear systems represented in the Linear Parameter-Varying (LPV) form. We propose a convex optimization framework that concurrently determines actuator magnitude limits and the state-feedback law that guarantees a user-specified closed-loop performance in the $\mathcal{H}_2/\mathcal{H}_\infty$ sense. We also demonstrate that sparse actuation is achieved when the actuator magnitude limits are minimized in the $l_1$ sense. This is the first paper that addresses this problem for LPV systems. The formulation is demonstrated in a vibration control problem for a flexible wing.
comment: Submitted to American Control Conference 2025
Generative AI Application for Building Industry
This paper investigates the transformative potential of generative AI technologies, particularly large language models (LLMs), within the building industry. By leveraging these advanced AI tools, the study explores their application across key areas such as energy code compliance, building design optimization, and workforce training. The research highlights how LLMs can automate labor-intensive processes, significantly improving efficiency, accuracy, and safety in building practices. The paper also addresses the challenges associated with interpreting complex visual and textual data in architectural plans and regulatory codes, proposing innovative solutions to enhance AI-driven compliance checking and design processes. Additionally, the study considers the broader implications of AI integration, including the development of AI-powered tools for comprehensive code compliance across various regulatory domains and the potential for AI to revolutionize workforce training through realistic simulations. This paper provides a comprehensive analysis of the current capabilities of generative AI in the building industry while outlining future directions for research and development, aiming to pave the way for smarter, more sustainable, and responsive construction practices.
comment: 28 pages, 11 figures, 4 tables
Development of a Platform to Enable Real Time, Non-disruptive Testing and Early Fault Detection of Critical High Voltage Transformers and Switchgears in High Speed-rail
Partial discharge (PD) incidents can occur in critical components of high-speed rail electric systems, such as transformers and switchgears, due to localized insulation defects that cannot withstand electric stress, leading to potential flashovers. These incidents can escalate over time, resulting in breakdowns, downtime, and safety risks. Fortunately, PD activities emit radio frequency (RF) signals, allowing for the development of a hardware platform for real-time, non-invasive PD detection and monitoring. The system uses an RF antenna and high-speed data acquisition to scan signals across a configurable frequency range (100MHz to 3GHz), utilizing intermediate frequency modulation and sliding frequency windows for detailed analysis. When signals exceed a threshold, the system records the events, capturing both raw signal data and spectrum snapshots. Real-time data is streamed to a cloud server, offering remote access through a dedicated smartphone application, enabling maintenance teams to monitor and respond promptly. Laboratory testing has confirmed the system's ability to accurately capture RF signals and provide real-time PD monitoring, enhancing the reliability and safety of high-speed rail infrastructure.
Uncertainty Modelling and Robust Observer Synthesis using the Koopman Operator
This paper proposes a robust nonlinear observer synthesis method for a population of systems modelled using the Koopman operator. The Koopman operator allows nonlinear systems to be rewritten as infinite-dimensional linear systems. A finite-dimensional approximation of the Koopman operator can be identified directly from data, yielding an approximately linear model of a nonlinear system. The proposed observer synthesis method is made possible by this linearity, which in turn allows uncertainty within a population of Koopman models to be quantified in the frequency domain. Using this uncertainty model, linear robust control techniques are used to synthesize robust nonlinear Koopman observers. A population of several dozen motor drives is used to experimentally demonstrate the proposed method. Manufacturing variation is characterized in the frequency domain, and a robust Koopman observer is synthesized using mixed $\mathcal{H}_2$-$\mathcal{H}_\infty$ optimal control.
comment: 16 pages, 15 figures
A Unified Approach for Optimal Cruise Airspeed with Variable Cost Index for Fuel-powered and All-electric Aircraft
This paper proposes for the first time a unified optimal approach to solve a direct operating cost (DOC) minimization problem where the cost index (CI) is time-varying. More specifically, the coefficient CI is modeled as a time-varying parameter commanded by Air Traffic Control (ATC). The proposed unified approach relies on the solution of an optimal control problem both for fuel-powered and all-electric aircraft. Furthermore, this paper demonstrates how a variable CI affects the solution of the optimization problem as it presents the equations that allow the computation of optimal constant cruise airspeed and flight time in response to step changes in the CI value. The proposed methodology is validated by a simulated flight scenario. In this scenario the inputs from the ATC are received during flight and the aircraft is required to adjust its optimal airspeed, flight time, and total energy consumption to comply with the operational restrictions imposed by the ATC. The optimal values of airspeed, flight time and energy consumption are computed for both a fuel-powered and an all-electric aircraft, thus enabling applications of the proposed approach to future air mobility all-electric vehicles.
comment: 9 pages, 9 figures
Safe Autonomy for Uncrewed Surface Vehicles Using Adaptive Control and Reachability Analysis
Marine robots must maintain precise control and ensure safety during tasks like ocean monitoring, even when encountering unpredictable disturbances that affect performance. Designing algorithms for uncrewed surface vehicles (USVs) requires accounting for these disturbances to control the vehicle and ensure it avoids obstacles. While adaptive control has addressed USV control challenges, real-world applications are limited, and certifying USV safety amidst unexpected disturbances remains difficult. To tackle control issues, we employ a model reference adaptive controller (MRAC) to stabilize the USV along a desired trajectory. For safety certification, we developed a reachability module with a moving horizon estimator (MHE) to estimate disturbances affecting the USV. This estimate is propagated through a forward reachable set calculation, predicting future states and enabling real-time safety certification. We tested our safe autonomy pipeline on a Clearpath Heron USV in the Charles River, near MIT. Our experiments demonstrated that the USV's MRAC controller and reachability module could adapt to disturbances like thruster failures and drag forces. The MRAC controller outperformed a PID baseline, showing a 45%-81% reduction in RMSE position error. Additionally, the reachability module provided real-time safety certification, ensuring the USV's safety. We further validated our pipeline's effectiveness in underway replenishment and canal scenarios, simulating relevant marine tasks.
comment: 35 pages, 23 figures, 6 tables
Learning Chaotic Dynamics with Embedded Dissipativity
Chaotic dynamics, commonly seen in weather systems and fluid turbulence, are characterized by their sensitivity to initial conditions, which makes accurate prediction challenging. Despite their sensitivity to initial perturbations, many chaotic systems exhibit dissipative behavior and ergodicity. Therefore, various approaches have recently been proposed to develop data-driven models that preserve invariant statistics over long horizons. Although these methods have shown empirical success in reducing instances of unbounded trajectory generation, many of the models are still prone to generating unbounded trajectories, leading to invalid statistics evaluation. In this paper, we propose a novel neural network architecture that simultaneously learns a dissipative dynamics emulator that is guaranteed to generate bounded trajectories and an energy-like function that governs the dissipative behavior. More specifically, by leveraging control-theoretic ideas, we derive algebraic conditions based on the learned energy-like function that ensure asymptotic convergence to an invariant level set. Using these algebraic conditions, our proposed model enforces dissipativity through a ReLU projection layer, which provides formal trajectory boundedness guarantees. Furthermore, the invariant level set provides an outer estimate for the strange attractor, which is known to be very difficult to characterize due to its complex geometry. We demonstrate the capability of our model in producing bounded long-horizon trajectory forecasts and characterizing the attractor for chaotic dynamical systems including Lorenz 96 and a truncated Kuramoto-Sivashinsky equation.
Outage-Constrained Sum Secrecy Rate Maximization for STAR-RIS with Energy-Harvesting Eavesdroppers
This article proposes a novel strategy for enhancing secure wireless communication through the use of a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) in a multiple-input single-output system. In the presence of energy-harvesting eavesdroppers, the study aims to maximize the secrecy rate while adhering to strict energy harvesting constraints. By dynamically manipulating the wireless environment with the STAR-RIS, the research examines the balance between harvested energy and secrecy rate under two key protocols: energy splitting and mode selection. The study addresses both imperfect and perfect channel state information (CSI) and formulates a complex non-convex optimization problem, which is solved using a penalty concave-convex procedure combined with an alternating optimization algorithm. The method optimizes beamforming and STAR-RIS transmission and reflection coefficients to achieve an optimal balance between secure communication and energy harvesting constraints. Numerical simulations show that the proposed approach is effective, even with imperfect CSI, and outperforms conventional RIS methods in terms of robust security and energy performance.
comment: 8 pages, 6 figures
Improved Sample Complexity of Imitation Learning for Barrier Model Predictive Control
Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller. However, constructing such smoothed expert controllers for arbitrary systems remains challenging, especially in the presence of input and state constraints. As our primary contribution, we show how such a smoothed expert can be designed for a general class of systems using a log-barrier-based relaxation of a standard Model Predictive Control (MPC) optimization problem. Improving upon our previous work, we show that barrier MPC achieves theoretically optimal error-to-smoothness tradeoff along some direction. At the core of this theoretical guarantee on smoothness is an improved lower bound we prove on the optimality gap of the analytic center associated with a convex Lipschitz function, which we believe could be of independent interest. We validate our theoretical findings via experiments, demonstrating the merits of our smoothing approach over randomized smoothing.
comment: 36 pages, 3 figures. This work extends our previous result in arXiv:2306.01914, which has been accepted for publication in CDC 2024. An earlier version of this manuscript was submitted as part of DP's Master's thesis
Fast and Reliable $N-k$ Contingency Screening with Input-Convex Neural Networks
Power system operators must ensure that dispatch decisions remain feasible in case of grid outages or contingencies to prevent cascading failures and ensure reliable operation. However, checking the feasibility of all $N - k$ contingencies -- every possible simultaneous failure of $k$ grid components -- is computationally intractable for even small $k$, requiring system operators to resort to heuristic screening methods. Because of the increase in uncertainty and changes in system behaviors, heuristic lists might not include all relevant contingencies, generating false negatives in which unsafe scenarios are misclassified as safe. In this work, we propose to use input-convex neural networks (ICNNs) for contingency screening. We show that ICNN reliability can be determined by solving a convex optimization problem, and by scaling model weights using this problem as a differentiable optimization layer during training, we can learn an ICNN classifier that is both data-driven and has provably guaranteed reliability. Namely, our method can ensure a zero false negative rate. We empirically validate this methodology in a case study on the IEEE 39-bus test network, observing that it yields substantial (10-20x) speedups while having excellent classification accuracy.
comment: 11 pages, 4 figures
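To make the intractability claim in the abstract above concrete, the number of ways exactly $k$ out of $N$ components can fail at once is the binomial coefficient $\binom{N}{k}$. The short sketch below tabulates it; the component count `n_components = 46` is a hypothetical value chosen for illustration and is not taken from the paper.

```python
from math import comb


def contingency_count(n_components, k):
    """Number of distinct simultaneous failures of exactly k of n components."""
    return comb(n_components, k)


# Hypothetical grid size, used only to show how fast the count grows with k.
n_components = 46
counts = {k: contingency_count(n_components, k) for k in range(1, 5)}
for k, count in counts.items():
    print(f"k={k}: {count} contingencies")
```

Each of these contingencies would need its own feasibility check in an exhaustive screen, which is the cost a learned screening classifier is meant to avoid.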
Koopman Spectral Analysis from Noisy Measurements based on Bayesian Learning and Kalman Smoothing
Koopman spectral analysis plays a crucial role in understanding and modeling nonlinear dynamical systems as it reveals key system behaviors and long-term dynamics. However, the presence of measurement noise poses a significant challenge to accurately extracting spectral properties. In this work, we propose a robust method for identifying the Koopman operator and extracting its spectral characteristics in noisy environments. To address the impact of noise, our approach tackles an identification problem that accounts for both systematic errors from finite-dimensional approximations and measurement noise in the data. By incorporating Bayesian learning and Kalman smoothing, the method simultaneously identifies the Koopman operator and estimates system states, effectively decoupling these two error sources. The method's efficiency and robustness are demonstrated through extensive experiments, showcasing its accuracy across varying noise levels.
Ultra-low-crosstalk Silicon Switches Driven Thermally and Electrically
Silicon photonic switches are widely considered a cost-effective solution for addressing the ever-growing data traffic in datacenter networks, as they offer unique advantages such as low power consumption, low latency, small footprint and high bandwidth. Despite extensive research efforts, crosstalk in large-scale photonic circuits still poses a threat to signal integrity. In this paper, we present two designs of silicon Mach-Zehnder Interferometer (MZI) switches achieving ultra-low crosstalk, driven thermally and electrically. Each switch fabric is optimized at both the device and circuit level to suppress crosstalk and reduce system complexity. Notably, for the first time to the best of our knowledge, we harness the inherent self-heating effect in a carrier-injection-based MZI switch to create a pair of phase shifters that offer arbitrary phase differences. Such a pair of phase shifters induces matched insertion loss at each arm, thus minimizing crosstalk. Experimentally, an ultra-low crosstalk ratio below -40 dB is demonstrated for both thermo-optic (T-O) and electro-optic (E-O) switches. The T-O switch exhibits an on-chip loss of less than 5 dB with a switching time of 500 microseconds, whereas the E-O switch achieves an on-chip loss as low as 8.5 dB with a switching time of under 100 ns. In addition, data transmission of a 50 Gb/s on-off keying signal is demonstrated with high fidelity on the E-O switch, showing the great potential of the proposed switch designs.
comment: 12 pages, 5 figures
Optimized Excitation Signal Design Employing Receding Horizon Control
A novel excitation signal design strategy based on a receding horizon control inspired optimization is presented. The proposed method is shown to effectively generate space-filling designs within the input space of a nonlinear dynamic process, thereby enabling sophisticated acquisition of information in previously unexplored operational areas. Additionally, the strategy can intensify the exploitation of specific operational areas during information gathering, offering flexibility in meeting application-specific requirements.
comment: Will be published in 34th Workshop Computational Intelligence, Berlin (2024)
Optimized Excitation Signal Tailored to Pertinent Dynamic Process Characteristics
The effectiveness of data-driven techniques significantly relies on the input signal used to generate the training data. Nevertheless, there is a notable gap in research when it comes to designing excitation signals for identifying nonlinear dynamic systems, likely because of the challenges involved. Based on current knowledge, it is crucial for excitation signals to effectively capture the nonlinearity across the entire operational area and to gather insights into the area-specific dynamic process characteristics. The Incremental Dynamic Space-Filling Design (IDS-FID) strategy designs excitation signals to achieve a space-filling distribution across the input space of a nonlinear approximator used in external dynamics modeling, gathering information throughout its operational area. Simultaneously, the approach allows a heightened focus on either the system's steady-state or transient responses during information acquisition by altering the excitation signal's dynamics, facilitating targeted insights into dynamic process characteristics.
comment: Will be published in 4th MECC (2024)
Absolute centrality in a signed Friedkin-Johnsen based model: a graphical characterisation of influence
This paper studies the evolution of opinions governed by a Friedkin-Johnsen (FJ) based model in arbitrary network structures with signed interactions. The agents contributing to the opinion formation are characterised as being influential. Initially, the agents are classified as opinion leaders and followers based on network connectivity and the nature of interactions. However, the addition of stubbornness leads to interesting behaviours wherein a non-influential agent can now become influential and vice versa. Thereafter, a signal flow graph (SFG) based method is proposed to quantify the influence of an influential agent's opinions. Additionally, it helps illustrate the role played by network topology in shaping the final opinions of the agents. Based on this analysis, the absolute centrality measure is proposed to determine the overall influence of all the agents in the network. Unlike most of the existing measures, it is applicable to any network structure and considers the effect of stubbornness and antagonism. Examples are presented throughout the paper to illustrate and validate these results.
comment: 13 pages
MERIT: Multimodal Wearable Vital Sign Waveform Monitoring
Cardiovascular disease (CVD) is the leading cause of death and premature mortality worldwide, with occupational environments significantly influencing CVD risk, underscoring the need for effective cardiac monitoring and early warning systems. Existing methods of monitoring vital signs require subjects to remain stationary, which is impractical for daily monitoring as individuals are often in motion. To address this limitation, we propose MERIT, a multimodality-based wearable system designed for precise ECG waveform monitoring without movement restrictions. Daily activities, involving frequent arm movements, can significantly affect sensor data and complicate the reconstruction of accurate ECG signals. To mitigate motion impact and enhance ECG signal reconstruction, we introduce a deep independent component analysis (Deep-ICA) module and a multimodal fusion module. We conducted experiments with 15 subjects. Our results, compared with commercial wearable devices and existing methods, demonstrate that MERIT accurately reconstructs ECG waveforms during various office activities, offering a reliable solution for fine-grained cardiac monitoring in dynamic environments.
comment: 9 pages, 10 figures
AARK: An Open Toolkit for Autonomous Racing Research
Autonomous racing demands safe control of vehicles at their physical limits for extended periods of time, providing insights into advanced vehicle safety systems which increasingly rely on intervention provided by vehicle autonomy. Participation in this field carries with it a high barrier to entry. Physical platforms and their associated sensor suites require large capital outlays before any demonstrable progress can be made. Simulators allow researchers to develop soft autonomous systems without purchasing a platform. However, currently available simulators lack visual and dynamic fidelity, can still be expensive to buy, lack customisation, and are difficult to use. AARK provides three packages: ACI, ACDG, and ACMPC. These packages enable research into autonomous control systems in the demanding environment of racing to bring more people into the field and improve reproducibility: ACI provides researchers with a computer vision-friendly interface to Assetto Corsa for convenient comparison and evaluation of autonomous control solutions; ACDG enables generation of depth, normal and semantic segmentation data for training computer vision models to use in perception systems; and ACMPC gives newcomers to the field a modular full-stack autonomous control solution, capable of controlling vehicles, to build from. AARK aims to unify and democratise research into a field critical to providing safer roads and trusted autonomous systems.
comment: 7 pages, 5 figures
A Digital Twin Framework for Physical-Virtual Integration in V2X-Enabled Connected Vehicle Corridors
Transportation Cyber-Physical Systems (T-CPS) are critical in improving traffic safety, reliability, and sustainability by integrating computing, communication, and control in transportation systems. The connected vehicle corridor is at the forefront of this transformation, where Cellular Vehicle-to-Everything (C-V2X) technology facilitates real-time data exchange between infrastructure, vehicles, and road users. However, challenges remain in processing and synchronizing the vast V2X data from vehicles and roadside units, particularly when ensuring scalability, data integrity, and operational resilience. This paper presents a digital twin framework for T-CPS, developed from a real-world connected vehicle corridor to address these challenges. By leveraging C-V2X technology and real-time data from infrastructure, vehicles, and road users, the digital twin accurately replicates vehicle behaviors, signal phases, and traffic patterns within the CARLA simulation environment. This framework demonstrates high fidelity between physical and digital systems and ensures robust synchronization of vehicle trajectories and signal phases through extensive experiments. Moreover, the digital twin's scalable and redundant architecture enhances data integrity, making it capable of supporting future large-scale C-V2X deployments. The digital twin is a vital tool in T-CPS, enabling real-time traffic monitoring, prediction, and optimization to enhance the reliability and safety of transportation systems.
Interleaved One-Shot SPS Performance under Smart DoS Attacks in C-V2X Networks
This paper evaluates the performance of the one-shot Semi-Persistent Scheduling (SPS) mechanism in Cellular Vehicle-to-Everything (C-V2X) networks under Denial-of-Service (DoS) smart attack scenarios. The study focuses on the impact of these attacks on key performance metrics, including Packet Delivery Ratio (PDR), Inter-Packet Gap (IPG), and Age of Information (AoI). Through extensive Monte Carlo simulations, we demonstrate that the one-shot mechanism significantly enhances network resilience by mitigating the adverse effects of smart DoS attacks. The findings reveal that while the one-shot mechanism improves the PDR and reduces the IPG and AoI tail values, its effectiveness diminishes slightly in high-density vehicular environments. Nevertheless, the one-shot mechanism proves to be a robust solution for maintaining the stability and reliability of C-V2X communications under adversarial conditions.
RRT-CBF Based Motion Planning
Control barrier functions (CBFs) have recently been widely explored for enforcing safety-critical constraints on nonlinear systems. Many researchers have incorporated control barrier functions into path planning algorithms to find a safe path, but these methods involve high computational complexity or unidirectional randomness, which increases run-time. Satisfying the safety constraints thus comes at the cost of search efficiency and search space. This paper combines a novel motion planning approach based on the rapidly exploring random trees (RRT) algorithm with model predictive control (MPC) to enforce the CBF with dynamically updated constraints. The resulting safety-critical trajectories keep robots from colliding with static and dynamic circular obstacles as well as with other moving robots, while accounting for model uncertainty in the process. In addition, this paper presents the first application of CBF-RRT to a robot arm model as a nonlinear system.
comment: 20 pages, 25 figures
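As background for the abstract above, a control barrier function constraint for a circular obstacle can be sketched in a few lines. This is not the paper's RRT-MPC planner. It assumes a single-integrator robot $\dot{x} = u$, the barrier $h(x) = \|x - c\|^2 - r^2$, and the standard condition $\nabla h(x) \cdot u + \alpha h(x) \ge 0$. With a single constraint, the minimally invasive correction of a nominal input has a closed form (projection onto a half-space). The function name `cbf_filter` and all parameter values are illustrative.

```python
def cbf_filter(x, u_nom, center, radius, alpha=1.0):
    """Minimally modify u_nom so that grad_h . u + alpha * h >= 0.

    Assumes single-integrator dynamics x_dot = u and a circular obstacle,
    with x strictly away from the obstacle center (so grad_h is nonzero).
    """
    d = [xi - ci for xi, ci in zip(x, center)]
    h = sum(di * di for di in d) - radius ** 2      # barrier value, >= 0 when safe
    grad_h = [2.0 * di for di in d]                 # gradient of h at x
    slack = sum(g * u for g, u in zip(grad_h, u_nom)) + alpha * h
    if slack >= 0.0:
        return list(u_nom)                          # nominal input already satisfies the condition
    norm_sq = sum(g * g for g in grad_h)
    # Project u_nom onto the boundary of the half-space grad_h . u + alpha * h >= 0.
    return [u - slack * g / norm_sq for u, g in zip(u_nom, grad_h)]


# Robot at (2, 0) heading straight at a unit-radius obstacle centered at the origin.
u_safe = cbf_filter(x=(2.0, 0.0), u_nom=(-5.0, 0.0), center=(0.0, 0.0), radius=1.0)
print(u_safe)
```

In a planner, a filter of this kind would be evaluated at every candidate state or MPC step; handling several obstacles at once generally requires solving a small quadratic program instead of using the closed form.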
Energetic Resilience of Linear Driftless Systems
When a malfunction causes a control system to lose authority over a subset of its actuators, achieving a task may require spending additional energy in order to compensate for the effect of uncontrolled inputs. To understand this increase in energy, we introduce energetic resilience metrics that quantify the maximal additional energy required to achieve finite-time regulation in linear driftless systems that lose authority over some of their actuators. Using a technical lemma based on the calculus of variations, we first derive optimal control signals and minimum energies to achieve this task in both the nominal and malfunctioning systems. We then obtain a bound on the worst-case energy used by the malfunctioning system, and its exact expression in the special case of loss of authority over one actuator. Further considering this special case, we derive bounds on additive and multiplicative metrics for energetic resilience. A simulation example on a model of an underwater robot demonstrates that these bounds are useful in quantifying the increased energy used by a system suffering a partial loss of control authority.
comment: 9 pages, 2 figures
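For orientation, the nominal minimum-energy problem described in the abstract above can be worked out in a simplified setting. The derivation below is a sketch under assumptions that are not taken from the paper: the driftless system is $\dot{x}(t) = B u(t)$ with $B$ of full row rank, the energy is $E = \int_0^T \|u(t)\|^2 \, dt$, and the task is to drive $x(0) = x_0$ to $x(T) = 0$.

```latex
% Reaching the origin at time T requires
%   B \int_0^T u(t)\,dt = -x_0 .
% For a fixed value of v = \int_0^T u(t)\,dt, Jensen's inequality gives
%   \int_0^T \|u(t)\|^2\,dt \ge \|v\|^2 / T,
% with equality for constant u. The minimum-norm v satisfying B v = -x_0 is
%   v = -B^\top (B B^\top)^{-1} x_0 .
\begin{align}
  u^*(t) &= -\frac{1}{T}\, B^\top \left(B B^\top\right)^{-1} x_0,
  \qquad t \in [0, T], \\
  E^* &= \int_0^T \|u^*(t)\|^2 \, dt
       = \frac{1}{T}\, x_0^\top \left(B B^\top\right)^{-1} x_0 .
\end{align}
```

Under these assumptions the minimum energy scales as $1/T$. Losing authority over some actuators removes columns from $B$, which can only increase $x_0^\top (B B^\top)^{-1} x_0$ when the remaining columns still span the state space.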
Strategic information disclosure with communication constraints and private preferences
Social-media platforms are one of the most prevalent communication media today. In such systems, a large amount of content is generated and available to the platform. However, not all content can be transmitted to every possible user at all times. At the other end are the users, who have their own preferences about which content they enjoy, which are often unknown ex ante to the platform. We model the interaction between the platform and the users as a signaling game with asymmetric information, where each user optimizes its preference disclosure policy, and the platform optimizes its information disclosure policy. We provide structural as well as existence results for policies that constitute Bayesian Nash Equilibria, and necessary optimality conditions used to explicitly compute the optimal policies.
comment: Submitted to American Control Conference 2025
iWalker: Imperative Visual Planning for Walking Humanoid Robot
Humanoid robots, with the potential to perform a broad range of tasks in environments designed for humans, have been deemed crucial as a basis for general AI agents. For planning and control, although traditional models and task-specific methods have been extensively studied over the past few decades, they are inadequate for achieving the flexibility and versatility needed for general autonomy. Learning approaches, especially reinforcement learning, are powerful and popular nowadays, but they are inherently "blind" during training, relying heavily on trials in simulation without proper guidance from physical principles or underlying dynamics. In response, we propose a novel end-to-end pipeline that seamlessly integrates perception, planning, and model-based control for humanoid robot walking. We refer to our method as iWalker, which is driven by imperative learning (IL), a self-supervising neuro-symbolic learning framework. This enables the robot to learn from arbitrary unlabeled data, significantly improving its adaptability and generalization capabilities. In experiments, iWalker demonstrates effectiveness in both simulated and real-world environments, representing a significant advancement toward versatile and autonomous humanoid robots.
Robust Multivariate Detection and Estimation with Fault Frequency Content Information
This paper studies the problem of fault detection and estimation (FDE) for linear time-invariant (LTI) systems with a particular focus on frequency content information of faults, possibly as multiple disjoint continuum ranges, and under both disturbances and stochastic noise. To ensure the worst-case fault sensitivity in the considered frequency ranges and mitigate the effects of disturbances and noise, an optimization framework incorporating a mixed $\mathcal{H}_-/\mathcal{H}_2$ performance index is developed to compute the optimal detection filter. Moreover, a thresholding rule is proposed to guarantee both the false alarm rate (FAR) and the fault detection rate (FDR). Next, shifting attention to fault estimation in specific frequency ranges, an exact reformulation of the optimal estimation filter design using the restricted $\mathcal{H}_\infty$ performance index is derived, which is inherently non-convex. However, focusing on finite frequency samples and fixed poles, a lower bound is established via a highly tractable quadratic programming (QP) problem. This lower bound together with an alternating optimization (AO) approach to the original estimation problem leads to a suboptimality gap for the overall estimation filter design. The effectiveness of the proposed approaches is validated through applications to a non-minimum phase hydraulic turbine system and a multi-area power system.
comment: 31 pages, 15 figures
Dissipativity-Based Distributed Droop-Free Controller and Communication Topology Co-Design for DC Microgrids
This paper presents a novel dissipativity-based distributed droop-free control approach for the voltage regulation problem in DC microgrids (MGs) comprised of an interconnected set of distributed generators (DGs), loads, and power lines. First, we describe the closed-loop DC MG as a networked system where the sets of DGs and lines (i.e., subsystems) are interconnected via a static interconnection matrix. This interconnection matrix demonstrates how the inputs and outputs of DGs and lines are connected with each other. Each DG has a local controller and a distributed global controller. To design the distributed global controllers, we use the dissipativity properties of the subsystems and formulate a linear matrix inequality (LMI) problem. To support the feasibility of this distributed global controller design, we identify a set of necessary local conditions, which we then enforce in a specifically developed LMI-based local controller design process. In contrast to existing DC MG control solutions that separate distributed controller and communication topology design problems, our approach proposes a unified framework for distributed controller and communication topology co-design. As the co-design process is LMI-based, it can be efficiently implemented and evaluated using existing software tools. The effectiveness of the proposed solution in terms of voltage regulation and current sharing is verified by simulating an islanded DC MG in a MATLAB/Simulink environment under different scenarios, such as load changes and topological constraint changes, and comparing its performance with the recent droop control approach.
Optimal Control on Positive Cones
An optimal control problem on finite-dimensional positive cones is stated. Under a critical assumption on the cone, the corresponding Bellman equation is satisfied by a linear function, which can be computed by convex optimization. A separate theorem relates the assumption on the cone to the existence of minimal elements in certain subsets of the dual cone. Three special cases are derived as examples. The first one, where the positive cone is the set of positive semi-definite matrices, reduces to standard linear quadratic control. The second one, where the positive cone is a polyhedron, reduces to a recent result on optimal control of positive systems. The third special case corresponds to linear quadratic control with additional structure, such as spatial invariance.
comment: 16 pages, to be published in the proceedings for the 2024 Conference on Decision and Control (CDC)
Modeling Fault Recovery and Transient Stability of Grid-Forming Converters Equipped With Current Reference Limitation
When grid-forming (GFM) inverter-based resources (IBRs) face severe grid disturbances (e.g., short-circuit faults), the current limitation mechanism may be triggered. Consequently, the GFM IBRs enter the current-saturation mode, inducing nonlinear dynamical behaviors and posing great challenges to the post-disturbance transient angle stability. This paper presents a systematic study to reveal the fault recovery behaviors of a GFM IBR and identify the risk of instability. A closed-form expression for the necessary condition that a GFM IBR returns from the current-saturation mode to the normal operation mode is presented. Based on these analyses, it is inferred that the angle of the magnitude-saturated current significantly affects the post-fault recovery and transient stability. Depending on the angle selected, the system may follow one of several post-fault trajectories: 1) convergence to a normal stable equilibrium point (SEP), 2) convergence to a saturated stable equilibrium point (satSEP), or 3) divergence (instability). In this paper, the circumstances under which a GFM IBR cannot escape from the current-saturation mode are thoroughly investigated. The theoretical analyses are verified by dynamic simulations.
comment: 13 pages, 22 figures
Model Predictive Control for setpoint tracking
The main objective of tracking control is to steer the tracking error, that is the difference between the reference and the output, to zero while the plant's operation limits are satisfied. This requires that some assumptions on the evolution of the future values of the reference must be taken into account. Typically a simple evolution of the reference is considered, such as step, ramp, or parabolic reference signals. It is important to notice that the tracking problem considers possible variations in the reference to be tracked, such as steps or slope variations of the ramps. Then the tracking control problem is inherently uncertain, since the reference may differ from what is expected. If the value of the reference is changed, then there is no guarantee that the feasibility and stability properties of the resulting control law hold. This report presents the MPC for tracking (MPCT) approach, which ensures recursive feasibility and asymptotic stability of the setpoint when the value of the reference is changed.
Prediction-Free Coordinated Dispatch of Microgrid: A Data-Driven Online Optimization Approach
Traditional prediction-dependent dispatch methods can face challenges when renewable generation and price predictions are unreliable in microgrids. Instead, this paper proposes a novel prediction-free two-stage coordinated dispatch approach for microgrids. Empirical learning is conducted during the offline stage, where we calculate the offline optimal state of charge (SOC) sequences for generic energy storage under different historical scenarios. During the online stage, we synthesize a dynamically updated reference for SOC and a dynamic opportunity price (DOP) based on empirical learning and real-time observations. They provide a global vision for online operation and effectively address the myopic tendencies inherent to online decision-making. The real-time control action, generated by the online optimization algorithm, aims to minimize the operational costs while tracking the reference and considering the DOP. Additionally, we develop an adaptive virtual-queue-based online optimization algorithm based on the online convex optimization (OCO) framework. We provide theoretical proof that the proposed algorithm outperforms the existing OCO algorithms and achieves a sublinear dynamic regret bound and a sublinear strict constraint violation bound. Simulation-based studies demonstrate that, compared with model predictive control-based methods, it reduces operational costs and voltage violation rate by 5% and 9%, respectively.
Quantifying the Safety of Trajectories using Peak-Minimizing Control
This work quantifies the safety of trajectories of a dynamical system by the perturbation intensity required to render a system unsafe (crash into the unsafe set). Computation of this measure of safety is posed as a peak-minimizing optimal control problem. Convergent lower bounds on the minimal peak value of controller effort are computed using polynomial optimization and the moment-Sum-of-Squares hierarchy. The crash-safety framework is extended towards data-driven safety analysis by measuring safety as the maximum amount of data corruption required to crash into the unsafe set.
comment: 19 pages, 9 figures, 3 tables
Incentive-Compatible Vertiport Reservation in Advanced Air Mobility: An Auction-Based Approach
Advanced air mobility (AAM) is expected to become a multibillion-dollar industry in the near future. Market-based mechanisms are touted to be an integral part of AAM operations, which comprise heterogeneous operators with private valuations. In this work, we study the problem of designing a mechanism to coordinate the movement of electric vertical take-off and landing (eVTOL) aircraft, operated by multiple operators each having heterogeneous valuations associated with their fleet, between vertiports, while enforcing the arrival, departure, and parking constraints at vertiports. Particularly, we propose an incentive-compatible and individually rational vertiport reservation mechanism that maximizes a social welfare metric, which encapsulates the objective of maximizing the overall valuations of all operators while minimizing the congestion at vertiports. Additionally, we improve the computational tractability of designing the reservation mechanism by proposing a mixed binary linear programming approach that leverages the network flow structure.
comment: 23 pages, 2 figures, 2 tables
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions ICRA 2025
We address the challenge of safe control in decentralized multi-agent robotic settings, where agents use uncertain black-box models to predict other agents' trajectories. We use the recently proposed conformal decision theory to adapt the restrictiveness of control barrier functions-based safety constraints based on observed prediction errors. We use these constraints to synthesize controllers that balance between the objectives of safety and task accomplishment, despite the prediction errors. We provide an upper bound on the average over time of the value of a monotonic function of the difference between the safety constraint based on the predicted trajectories and the constraint based on the ground truth ones. We validate our theory through experimental results showing the performance of our controllers when navigating a robot in the multi-agent scenes in the Stanford Drone Dataset.
comment: 6 pages, 1 figure, submitted for ICRA 2025
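The idea of adapting how restrictive a safety constraint is, based on observed prediction errors, can be illustrated with a very small online update. This is only a sketch under stated assumptions and not the paper's update rule: the function names, the indicator loss, and the parameters `tolerance`, `target`, and `step` are all hypothetical. A larger margin is taken to mean a more conservative constraint.

```python
def adapt_margin(margin, loss, target, step):
    """Conformal-style online update (illustrative assumption).

    The margin grows when the observed loss exceeds the target level
    and shrinks otherwise. It is clipped at zero so the constraint is
    never loosened beyond its nominal form.
    """
    return max(0.0, margin + step * (loss - target))


def run(errors, tolerance=0.5, target=0.1, step=0.2):
    """Replay a sequence of prediction errors and return the margin history."""
    margin = 0.0
    history = []
    for err in errors:
        # Hypothetical loss: 1 if the predictor missed by more than `tolerance`.
        loss = 1.0 if err > tolerance else 0.0
        margin = adapt_margin(margin, loss, target, step)
        history.append(margin)
    return history
```

In a controller, the returned margin would be added to the barrier-function constraint at each step, so repeated large prediction errors make the robot keep more distance and accurate predictions let it relax again.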
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Learning from Demonstration (LfD) is a useful paradigm for training policies that solve tasks involving complex motions, such as those encountered in robotic manipulation. In practice, the successful application of LfD requires overcoming error accumulation during policy execution, i.e. the problem of drift due to errors compounding over time and the consequent out-of-distribution behaviours. Existing works seek to address this problem through scaling data collection, correcting policy errors with a human-in-the-loop, temporally ensembling policy predictions or through learning a dynamical system model with convergence guarantees. In this work, we propose and validate an alternative approach to overcoming this issue. Inspired by reservoir computing, we develop a recurrent neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties for modelling temporal dynamics. We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting Dataset. Through empirical experiments we demonstrate that incorporating our layer into existing neural network architectures addresses the issue of compounding errors in LfD. Furthermore, we perform a comparative evaluation against existing approaches including a temporal ensemble of policy predictions and an Echo State Network (ESN) implementation. We find that our approach yields greater policy precision and robustness on the handwriting task while also generalising to multiple dynamics regimes and maintaining competitive latency scores.
comment: 21 pages, 9 figures
Decentralized Optimization in Time-Varying Networks with Arbitrary Delays
We consider a decentralized optimization problem for networks affected by communication delays. Examples of such networks include collaborative machine learning, sensor networks, and multi-agent systems. To mimic communication delays, we add virtual non-computing nodes to the network, resulting in directed graphs. This motivates investigating decentralized optimization solutions on directed graphs. Existing solutions assume nodes know their out-degrees, resulting in limited applicability. To overcome this limitation, we introduce a novel gossip-based algorithm, called DT-GO, that does not need to know the out-degrees. The algorithm is applicable in general directed networks, for example networks with delays or limited acknowledgment capabilities. We derive convergence rates for both convex and non-convex objectives, showing that our algorithm achieves the same complexity order as centralized Stochastic Gradient Descent. In other words, the effects of the graph topology and delays are confined to higher-order terms. Additionally, we extend our analysis to accommodate time-varying network topologies. Numerical simulations are provided to support our theoretical findings.
comment: arXiv admin note: text overlap with arXiv:2401.11344
Robotics
Continuously Improving Mobile Manipulation with Autonomous Real-World RL
We present a fully autonomous real-world RL framework for mobile manipulation that can learn policies without extensive instrumentation or human supervision. This is enabled by 1) task-relevant autonomy, which guides exploration towards object interactions and prevents stagnation near goal states, 2) efficient policy learning by leveraging basic task knowledge in behavior priors, and 3) formulating generic rewards that combine human-interpretable semantic information with low-level, fine-grained observations. We demonstrate that our approach allows Spot robots to continually improve their performance on a set of four challenging mobile manipulation tasks, obtaining an average success rate of 80% across tasks, a 3-4x improvement over existing approaches. Videos can be found at https://continual-mobile-manip.github.io/
comment: CoRL 2024. Website at https://continual-mobile-manip.github.io/
LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner
Language models (LMs) possess a strong capability to comprehend natural language, making them effective in translating human instructions into detailed plans for simple robot tasks. Nevertheless, it remains a significant challenge to handle long-horizon tasks, especially in subtask identification and allocation for cooperative heterogeneous robot teams. To address this issue, we propose a Language Model-Driven Multi-Agent PDDL Planner (LaMMA-P), a novel multi-agent task planning framework that achieves state-of-the-art performance on long-horizon tasks. LaMMA-P integrates the strengths of the LMs' reasoning capability and the traditional heuristic search planner to achieve a high success rate and efficiency while demonstrating strong generalization across tasks. Additionally, we create MAT-THOR, a comprehensive benchmark that features household tasks with two different levels of complexity based on the AI2-THOR environment. The experimental results demonstrate that LaMMA-P achieves a 105% higher success rate and 36% higher efficiency than existing LM-based multi-agent planners. The experimental videos, code, and datasets of this work as well as the detailed prompts used in each module are available at https://lamma-p.github.io.
comment: Project website: https://lamma-p.github.io/
Online identification of skidding modes with interactive multiple model estimation
Skid-steered wheel mobile robots (SSWMRs) operate in a variety of outdoor environments exhibiting motion behaviors dominated by the effects of complex wheel-ground interactions. Characterizing these interactions is crucial both from the immediate robot autonomy perspective (for motion prediction and control) as well as a long-term predictive maintenance and diagnostics perspective. An ideal solution entails capturing precise state measurements for decisions and controls, which is considerably difficult, especially in increasingly unstructured outdoor regimes of operations for these robots. In this milieu, a framework to identify pre-determined discrete modes of operation can considerably simplify the motion model identification process. To this end, we propose an interactive multiple model (IMM) based filtering framework to probabilistically identify predefined robot operation modes that could arise due to traversal in different terrains or loss of wheel traction.
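The probabilistic mode identification at the heart of an IMM filter can be sketched as a single mode-probability update. This is a minimal illustration, not the authors' implementation: the function name and arguments are hypothetical, and a full IMM filter would also mix and update the per-mode state estimates, which is omitted here.

```python
import numpy as np


def imm_mode_update(mu_prev, transition, likelihoods):
    """One IMM mode-probability step.

    mu_prev:     (M,) mode probabilities from the previous step
    transition:  (M, M) Markov matrix, transition[i, j] = P(mode j | mode i)
    likelihoods: (M,) measurement likelihood from each mode-matched filter
    """
    mu_prev = np.asarray(mu_prev, dtype=float)
    transition = np.asarray(transition, dtype=float)
    likelihoods = np.asarray(likelihoods, dtype=float)

    # Predicted mode probabilities after the Markov mode transition.
    mu_pred = transition.T @ mu_prev
    # Bayes update: weight each mode by how well its filter explains the data.
    unnormalized = likelihoods * mu_pred
    return unnormalized / unnormalized.sum()
```

With two hypothetical modes such as "nominal" and "slipping", the mode whose filter assigns higher likelihood to the latest measurement gains probability at each step.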
UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models
Previous studies on robotic manipulation are based on a limited understanding of the underlying 3D motion constraints and affordances. To address these challenges, we propose a comprehensive paradigm, termed UniAff, that integrates 3D object-centric manipulation and task understanding in a unified formulation. Specifically, we constructed a dataset labeled with manipulation-related key attributes, comprising 900 articulated objects from 19 categories and 600 tools from 12 categories. Furthermore, we leverage MLLMs to infer object-centric representations for manipulation tasks, including affordance recognition and reasoning about 3D motion constraints. Comprehensive experiments in both simulation and real-world settings indicate that UniAff significantly improves the generalization of robotic manipulation for tools and articulated objects. We hope that UniAff will serve as a general baseline for unified robotic manipulation tasks in the future. Images, videos, dataset, and code are published on the project website at: https://sites.google.com/view/uni-aff/home
Robi Butler: Remote Multimodal Interactions with Household Robot Assistant
In this paper, we introduce Robi Butler, a novel household robotic system that enables multimodal interactions with remote users. Building on the advanced communication interfaces, Robi Butler allows users to monitor the robot's status, send text or voice instructions, and select target objects by hand pointing. At the core of our system is a high-level behavior module, powered by Large Language Models (LLMs), that interprets multimodal instructions to generate action plans. These plans are composed of a set of open vocabulary primitives supported by Vision Language Models (VLMs) that handle both text and pointing queries. The integration of the above components allows Robi Butler to ground remote multimodal instructions in the real-world home environment in a zero-shot manner. We demonstrate the effectiveness and efficiency of this system using a variety of daily household tasks that involve remote users giving multimodal instructions. Additionally, we conducted a user study to analyze how multimodal interactions affect efficiency and user experience during remote human-robot interaction and discuss the potential improvements.
Visual collective behaviors on spherical robots
Implementations of collective motion traditionally disregard the limited sensing capabilities of an individual and instead assume an omniscient perception of the environment. This study implements a visual flocking model in a "robot-in-the-loop" approach to reproduce these behaviors with a flock composed of 10 independent spherical robots. The model achieves robotic collective motion using only the panoramic visual information of each robot, such as the retinal position, optical size, and optic flow of the neighboring robots. We introduce a virtual anchor to confine the collective robotic movements so as to avoid wall interactions. For the first time, a simple visual robot-in-the-loop approach succeeds in reproducing several collective motion phases, in particular swarming and milling. Another milestone achieved by this model is bridging the gap between simulation and physical experiments by demonstrating nearly identical behaviors in both environments with the same visual model. To conclude, we show that our minimal visual collective motion model is sufficient to recreate most collective behaviors on a robot-in-the-loop system that is scalable, behaves as numerical simulations predict, and is easily comparable to traditional models.
comment: 26 pages, 16 figures, journal: Bioinspiration & Biomimetics
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
One of the roadblocks for training generalist robotic models today is heterogeneity. Previous robot learning methods often collect data to train with one specific embodiment for one task, which is expensive and prone to overfitting. This work studies the problem of learning policy representations through heterogeneous pre-training on robot data across different embodiments and tasks at scale. We propose Heterogeneous Pre-trained Transformers (HPT), which pre-train a large, shareable trunk of a policy neural network to learn a task and embodiment agnostic shared representation. This general architecture aligns the specific proprioception and vision inputs from distinct embodiments to a short sequence of tokens and then processes such tokens to map to control robots for different tasks. Leveraging the recent large-scale multi-embodiment real-world robotic datasets as well as simulation, deployed robots, and human video datasets, we investigate pre-training policies across heterogeneity. We conduct experiments to investigate the scaling behaviors of training objectives, to the extent of 52 datasets. HPTs outperform several baselines and enhance the fine-tuned policy performance by over 20% on unseen tasks in multiple simulator benchmarks and real-world settings. See the project website (https://liruiw.github.io/hpt/) for code and videos.
comment: See the project website (https://liruiw.github.io/hpt/) for code and videos
Bi-directional Momentum-based Haptic Feedback and Control System for Dexterous Telemanipulation
Haptic feedback is essential for dexterous telemanipulation that enables operators to control robotic hands remotely with high skill and precision, mimicking a human hand's natural movement and sensation. However, current haptic methods for dexterous telemanipulation cannot support torque feedback, resulting in object rotation and rolling mismatches. The operator must make tedious adjustments in these tasks, leading to delays, reduced situational awareness, and suboptimal task performance. This work presents a Bi-directional Momentum-based Haptic Feedback and Control (Bi-Hap) system for real-time dexterous telemanipulation. Bi-Hap integrates multi-modal sensors to extract human interactive information with the object and share it with the robot's learning-based controller. A Field-Oriented Control (FOC) algorithm is developed to enable the integrated brushless active momentum wheel to generate precise torque and vibrative feedback, bridging the gap between human intent and robotic actions. Different feedback strategies are designed for varying error states to align with the operator's intuition. Extensive experiments with human subjects using a virtual Shadow Dexterous Hand demonstrate the effectiveness of Bi-Hap in enhancing task performance and user confidence. Bi-Hap achieved real-time feedback capability with low command following latency (delay < 0.025 s) and highly accurate torque feedback (RMSE < 0.010 Nm).
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Opt2Skill: Imitating Dynamically-feasible Whole-Body Trajectories for Versatile Humanoid Loco-Manipulation
Humanoid robots are designed to perform diverse loco-manipulation tasks. However, they face challenges due to their high-dimensional and unstable dynamics, as well as the complex contact-rich nature of the tasks. Model-based optimal control methods offer precise and systematic control but are limited by high computational complexity and the need for accurate contact sensing. On the other hand, reinforcement learning (RL) provides robustness and handles high-dimensional spaces but suffers from inefficient learning, unnatural motion, and sim-to-real gaps. To address these challenges, we introduce Opt2Skill, an end-to-end pipeline that combines model-based trajectory optimization with RL to achieve robust whole-body loco-manipulation. We generate reference motions for the Digit humanoid robot using differential dynamic programming (DDP) and train RL policies to track these trajectories. Our results demonstrate that Opt2Skill outperforms pure RL methods in both training efficiency and task performance, with optimal trajectories that account for torque limits enhancing trajectory tracking. We successfully transfer our approach to real-world applications.
Evaluating the Impact of Convolutional Neural Network Layer Depth on the Enhancement of Inertial Navigation System Solutions
Secure navigation is pivotal for several applications including autonomous vehicles, robotics, and aviation. The inertial navigation system (INS) estimates position, velocity, and attitude through dead reckoning, especially when external references like GPS are unavailable. However, the three accelerometers and three gyroscopes that compose the system are exposed to various types of errors, including bias errors, scale factor errors, and noise, which can significantly degrade navigation accuracy and constitute a key vulnerability of the system. This work aims to adopt a supervised convolutional neural network (ConvNet) to address this vulnerability. In addition, this paper evaluates the impact of the ConvNet's layer depth on the accuracy of these corrections. This evaluation aims to determine the optimal layer configuration that maximizes the effectiveness of error correction in the INS, leading to precise navigation solutions.
Impact of Tactile Sensor Quantities and Placements on Learning-based Dexterous Manipulation
Tactile information effectively enables faster training and better task performance for learning-based in-hand manipulation. Existing approaches are validated in simulated environments with a large number of tactile sensors. However, attaching such sensors to a real robot hand is impractical due to high cost and physical limitations. To enable real-world adoption of tactile sensors, this study investigates the impact of tactile sensors, including their varying quantities and placements on robot hands, on dexterous manipulation task performance and analyzes the importance of each. By empirically decreasing the number of sensors, we find an optimized configuration of 21 tactile sensors, which retains over 93% of task performance with only 20% of the sensors in the original set (92 sensors) for the block manipulation task, leading to a potential reduction of over 80% in sensor manufacturing and design costs. To transform the empirical results into a generalizable understanding, we build a task performance prediction model with a weighted linear regression algorithm and use it to forecast the task performance with different sensor configurations. To show its generalizability, we verified this model in egg and pen manipulation tasks and achieved an average prediction error of 3.12%.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
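A weighted linear regression of the kind mentioned in the abstract above can be sketched with ordinary NumPy least squares. This is a generic illustration under assumptions, not the paper's model: the function names are hypothetical, and the choice of features (for example, indicators of which sensor groups are present) and of sample weights is left to the reader.

```python
import numpy as np


def fit_weighted_linear(X, y, w):
    """Fit y ~ b0 + X @ b by weighted least squares.

    X: (n, d) feature matrix, y: (n,) targets, w: (n,) positive sample weights.
    Returns the coefficient vector [b0, b1, ..., bd].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))
    # Prepend an intercept column, then scale rows by sqrt(weight) so that
    # ordinary least squares minimizes the weighted squared error.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef


def predict(coef, X):
    """Predict targets for new feature rows using fitted coefficients."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(len(X)), X]) @ coef
```

Once fitted on measured configurations, `predict` can be called on unseen sensor configurations to forecast task performance before running new training experiments.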
Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments
We present a novel autonomous robot navigation algorithm for outdoor environments that is capable of handling diverse terrain traversability conditions. Our approach, VLM-GroNav, uses vision-language models (VLMs) and integrates them with physical grounding that is used to assess intrinsic terrain properties such as deformability and slipperiness. We use proprioceptive-based sensing, which provides direct measurements of these physical properties and enhances the overall semantic understanding of the terrains. Our formulation uses in-context learning to ground the VLM's semantic understanding with proprioceptive data to allow dynamic updates of traversability estimates based on the robot's real-time physical interactions with the environment. We use the updated traversability estimations to inform both the local and global planners for real-time trajectory replanning. We validate our method on a legged robot (Ghost Vision 60) and a wheeled robot (Clearpath Husky) in diverse real-world outdoor environments with different deformable and slippery terrains. In practice, we observe significant improvements over state-of-the-art methods, with up to a 50% increase in navigation success rate.
ALLO: A Photorealistic Dataset and Data Generation Pipeline for Anomaly Detection During Robotic Proximity Operations in Lunar Orbit ICRA'25
NASA's forthcoming Lunar Gateway space station, which will be uncrewed most of the time, will need to operate with an unprecedented level of autonomy. Enhancing autonomy on the Gateway presents several unique challenges, one of which is to equip the Canadarm3, the Gateway's external robotic system, with the capability to perform worksite monitoring. Monitoring will involve using the arm's inspection cameras to detect any anomalies within the operating environment, a task complicated by the widely-varying lighting conditions in space. In this paper, we introduce the visual anomaly detection and localization task for space applications and establish a benchmark with our novel synthetic dataset called ALLO (for Anomaly Localization in Lunar Orbit). We develop a complete data generation pipeline to create ALLO, which we use to evaluate the performance of state-of-the-art visual anomaly detection algorithms. Given the low tolerance for risk during space operations and the lack of relevant data, we emphasize the need for novel, robust, and accurate anomaly detection methods to handle the challenging visual conditions found in lunar orbit and beyond.
comment: Submitted to International Conference on Robotics and Automation (ICRA'25), Atlanta, USA, May 19-23, 2025
Multi-Robot Target Monitoring and Encirclement via Triggered Distributed Feedback Optimization
We design a distributed feedback optimization strategy, embedded into a modular ROS 2 control architecture, which allows a team of heterogeneous robots to cooperatively monitor and encircle a target while patrolling points of interest. Relying on the aggregative feedback optimization framework, we handle multi-robot dynamics while minimizing a global performance index depending on both microscopic (e.g., the location of single robots) and macroscopic variables (e.g., the spatial distribution of the team). The proposed distributed policy allows the robots to cooperatively address the global problem by employing only local measurements and neighboring data exchanges. These exchanges are performed through an asynchronous communication protocol ruled by locally-verifiable triggering conditions. We formally prove that our strategy steers the robots to a set of configurations representing stationary points of the considered optimization problem. The effectiveness and scalability of the overall strategy are tested via Monte Carlo campaigns of realistic Webots ROS 2 virtual experiments. Finally, the applicability of our solution is shown with real experiments on ground and aerial robots.
Automation from the Worker's Perspective
Common narratives about automation often pit new technologies against workers. The introduction of advanced machine tools, industrial robots, and AI have all been met with concern that technological progress will mean fewer jobs. However, workers themselves offer a more optimistic, nuanced perspective. Drawing on a far-reaching 2024 survey of more than 9,000 workers across nine countries, this paper finds that more workers report potential benefits from new technologies like robots and AI for their safety and comfort at work, their pay, and their autonomy on the job than report potential costs. Workers with jobs that ask them to solve complex problems, workers who feel valued by their employers, and workers who are motivated to move up in their careers are all more likely to see new technologies as beneficial. In contrast to assumptions in previous research, more formal education is in some cases associated with more negative attitudes toward automation and its impact on work. In an experimental setting, the prospect of financial incentives for workers improves their perceptions of automation technologies, whereas the prospect of increased input about how new technologies are used does not have a significant effect on workers' attitudes toward automation.
Efficient Driving Behavior Narration and Reasoning on Edge Device Using Large Language Models
Deep learning architectures with powerful reasoning capabilities have driven significant advancements in autonomous driving technology. Large language models (LLMs) applied in this field can describe driving scenes and behaviors with a level of accuracy similar to human perception, particularly in visual tasks. Meanwhile, the rapid development of edge computing, with its advantage of proximity to data sources, has made edge devices increasingly important in autonomous driving. Edge devices process data locally, reducing transmission delays and bandwidth usage, and achieving faster response times. In this work, we propose a driving behavior narration and reasoning framework that applies LLMs to edge devices. The framework consists of multiple roadside units, with LLMs deployed on each unit. These roadside units collect road data and communicate via 5G NSR/NR networks. Our experiments show that LLMs deployed on edge devices can achieve satisfactory response speeds. Additionally, we propose a prompt strategy to enhance the narration and reasoning performance of the system. This strategy integrates multi-modal information, including environmental, agent, and motion data. Experiments conducted on the OpenDV-Youtube dataset demonstrate that our approach significantly improves performance across both tasks.
comment: Submitted for possible journal publication
Design, manufacturing, and inverse dynamic modeling of soft parallel robots actuated by dielectric elastomer actuators
Soft parallel robots, with their manipulation safety and low commercial cost, show a promising future for delicate operations and safe human-robot interactions. However, promoting the use of electroactive polymers (EAPs) remains challenging due to the still-limited product quality and the difficulty of dynamically modelling the collaboration between multiple actuators. This article presents the design, fabrication, modelling and control of a parallel kinematics Delta robot actuated by dielectric elastomer actuators (DEAs). The trade-off between the actuation force and stroke is addressed by an angular stroke amplification mechanism, and the weight of the robot frame is reduced by utilizing 3D puzzling strip structures. A generic way of constructing a high-stability conductive paint on a silicon-based film has been achieved by laser scanning the DE-film and then sandwiching a conductive particle-based electrode with a paint mixed from the particles and photosensitive resin. Compared to the widely used carbon grease, the fabricated electrode shows a higher consistency in its dynamic behaviour before and after the on-stand test. Finally, to predict the output force and inverse motion of the robot end effector, we constructed the inverse dynamic model by introducing an expanded Bergstrom-Boyce model to the constitutive behavior of the dielectric film. The experimental results show a prediction of robot output force with an RMSE of 12.4% when the end effector remains stationary, and a well-followed trajectory with an RMSE of less than 2.5%.
comment: 17 pages, 12 figures
RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning ICRA2025
Sim-to-Real refers to the process of transferring policies learned in simulation to the real world, which is crucial for achieving practical robotics applications. However, recent Sim2Real methods either rely on a large amount of augmented data or large learning models, which is inefficient for specific tasks. In recent years, radiance field-based reconstruction methods, especially 3D Gaussian Splatting, have made it possible to reproduce realistic real-world scenarios. To this end, we propose a novel real-to-sim-to-real reinforcement learning framework, RL-GSBridge, which introduces a mesh-based 3D Gaussian Splatting method to realize zero-shot sim-to-real transfer for vision-based deep reinforcement learning. We improve the mesh-based 3D GS modeling method by using soft binding constraints, enhancing the rendering quality of mesh models. We then employ a GS editing approach to synchronize rendering with the physics simulator, reflecting the interactions of the physical robot more accurately. Through a series of sim-to-real robotic arm experiments, including grasping and pick-and-place tasks, we demonstrate that RL-GSBridge maintains a satisfactory success rate in real-world task completion during sim-to-real transfer. Furthermore, a series of rendering metrics and visualization results indicate that our proposed mesh-based 3D Gaussian reduces artifacts in unstructured objects, demonstrating more realistic rendering performance.
comment: 7 pages, 5 figures, 4 tables, under review by ICRA2025
Distributed NeRF Learning for Collaborative Multi-Robot Perception
Effective environment perception is crucial for enabling downstream robotic applications. Individual robotic agents often face occlusion and limited visibility issues, whereas multi-agent systems can offer a more comprehensive mapping of the environment, quicker coverage, and increased fault tolerance. In this paper, we propose a collaborative multi-agent perception system where agents collectively learn a neural radiance field (NeRF) from posed RGB images to represent a scene. Each agent processes its local sensory data and shares only its learned NeRF model with other agents, reducing communication overhead. Given NeRF's low memory footprint, this approach is well-suited for robotic systems with limited bandwidth, where transmitting all raw data is impractical. Our distributed learning framework ensures consistency across agents' local NeRF models, enabling convergence to a unified scene representation. We show the effectiveness of our method through an extensive set of experiments on datasets containing challenging real-world scenes, achieving performance comparable to centralized mapping of the environment where data is sent to a central server for processing. Additionally, we find that multi-agent learning provides regularization benefits, improving geometric consistency in scenarios with sparse input views. We show that in such scenarios, multi-agent mapping can even outperform centralized training.
Self-Assessment of Evidential Grid Map Fusion for Robust Motion Planning
Conflicting sensor measurements pose a huge problem for the environment representation of an autonomous robot. Therefore, in this paper, we address the self-assessment of an evidential grid map in which data from conflicting LiDAR sensor measurements are fused, followed by methods for robust motion planning under these circumstances. First, conflicting measurements aggregated in Subjective-Logic-based evidential grid maps are classified. Then, a self-assessment framework evaluates these conflicts and estimates their severity for the overall system by calculating a degradation score. This enables the detection of calibration errors and insufficient sensor setups. In contrast to other motion planning approaches, the information gained from the evidential grid maps is further used inside our proposed path-planning algorithm. Here, the impact of conflicting measurements on the current motion plan is evaluated, and a robust and curious path-planning strategy is derived to plan paths under the influence of conflicting data. This ensures that the system integrity is maintained in severely degraded environment representations which can prevent the unnecessary abortion of planning tasks.
comment: Oliver Schumann, Thomas Wodtko, Michael Buchholz, Klaus Dietmayer
Active Neural Mapping at Scale
We introduce a NeRF-based active mapping system that enables efficient and robust exploration of large-scale indoor environments. The key to our approach is the extraction of a generalized Voronoi graph (GVG) from the continually updated neural map, leading to the synergistic integration of scene geometry, appearance, topology, and uncertainty. Anchoring uncertain areas induced by the neural map to the vertices of GVG allows the exploration to undergo adaptive granularity along a safe path that traverses unknown areas efficiently. Harnessing a modern hybrid NeRF representation, the proposed system achieves competitive results in terms of reconstruction accuracy, coverage completeness, and exploration efficiency even when scaling up to large indoor environments. Extensive results at different scales validate the efficacy of the proposed system.
Self-Assessment and Correction of Sensor Synchronization
We propose an approach to assess the synchronization of rigidly mounted sensors based on their rotational motion. Using function similarity measures combined with a sliding window approach, our approach is capable of estimating time-varying time offsets. Further, the estimated offset allows the correction of erroneously assigned time stamps on measurements. This mitigates the effect of synchronization issues on subsequent modules in autonomous software stacks, such as tracking systems that heavily rely on accurate measurement time stamps. Additionally, a self-assessment based on an uncertainty measure is derived, and correction strategies are described. Our approach is evaluated with Monte Carlo experiments containing different error patterns. The results show that our approach accurately estimates time offsets and, thus, is able to detect and assess synchronization issues. To further embrace the importance of our approach for autonomous systems, we investigate the effect of synchronization inconsistencies in tracking systems in more detail and demonstrate the beneficial effect of our proposed offset correction.
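The sliding-window offset estimation described in the abstract above can be illustrated with a minimal sketch. It assumes two sensors report angular-rate magnitudes sampled on a common grid and uses normalised cross-correlation as the similarity measure; the abstract does not say which similarity measures the authors use, and the names `estimate_offset` and `sliding_offsets` are hypothetical.

```python
import numpy as np

def estimate_offset(ref, other, dt, max_lag):
    """Delay of `other` relative to `ref` in seconds.

    Tries every integer lag in [-max_lag, max_lag] and keeps the one with
    the highest normalised cross-correlation (an assumed similarity measure).
    A positive result means `other` lags behind `ref`.
    """
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = ref[:len(ref) - lag], other[lag:]
        else:
            a, b = ref[-lag:], other[:len(other) + lag]
        a = a - a.mean()
        b = b - b.mean()
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        if denom == 0.0:
            continue  # constant segment carries no timing information
        score = float(a @ b) / denom
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag * dt

def sliding_offsets(ref, other, dt, window, step, max_lag):
    """Estimate one offset per window so that a time-varying offset shows up."""
    offsets = []
    for start in range(0, len(ref) - window + 1, step):
        sl = slice(start, start + window)
        offsets.append(estimate_offset(ref[sl], other[sl], dt, max_lag))
    return np.array(offsets)
```

Each returned entry is the offset estimate for one window; subtracting it from the second sensor's time stamps would be one simple correction, under the assumptions stated above.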
Bi-stable thin soft robot for in-plane locomotion in narrow space
Dielectric elastomer actuators (DEAs), also recognized as artificial muscles, have been widely developed for soft locomotion robots. With their compliant skeletons and miniaturized dimensions, they are well suited for narrow-space inspection. In this work, we propose a novel low-profile (1.1mm) and lightweight (1.8g) bi-stable in-plane DEA (Bi-DEA) constructed by supporting a dielectric elastomer onto a flat bi-stable mechanism. It has an amplified displacement and output force compared with the in-plane DEA (I-DEA) without the bi-stable mechanism. Then, the Bi-DEA is applied to a thin soft robot, using three electrostatic adhesive pads (EA-Pads) as anchoring elements. This robot is capable of crawling and climbing to access millimetre-scale narrow gaps. Theoretical models of the bi-stable mechanism and the DEA are presented. The enhanced performance of the Bi-DEA induced by the mechanism is experimentally validated. The EA-Pad provides the adhesion between the actuator and the locomotion substrate, allowing crawling and climbing on various surfaces, i.e., paper and acrylic. The thin soft robot has been demonstrated to be capable of crawling through a 4mm narrow gap with a speed up to 3.3mm/s (0.07 body length per second and 2.78 body thickness per second).
comment: 8 pages, 12 figures
Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies
An end-to-end (E2E) visuomotor policy is typically treated as a unified whole, but recent approaches using out-of-domain (OOD) data to pretrain the visual encoder have cleanly separated the visual encoder from the network, with the remainder referred to as the policy. We propose Visual Alignment Testing, an experimental framework designed to evaluate the validity of this functional separation. Our results indicate that in E2E-trained models, visual encoders actively contribute to decision-making resulting from motor data supervision, contradicting the assumed functional separation. In contrast, OOD-pretrained models, where encoders lack this capability, experience an average performance drop of 42% in our benchmark results, compared to the state-of-the-art performance achieved by E2E policies. We believe this initial exploration of visual encoders' role can provide a first step towards guiding future pretraining methods to address their decision-making ability, such as developing task-conditioned or context-aware encoders.
Co-Movement and Trust Development in Human-Robot Teams
For humans and robots to form an effective human-robot team (HRT) there must be sufficient trust between team members throughout a mission. We analyze data from an HRT experiment focused on trust dynamics in teams of one human and two robots, where trust was manipulated by robots becoming temporarily unresponsive. Whole-body movement tracking was achieved using ultrasound beacons, alongside communications and performance logs from a human-robot interface. We find evidence that synchronization between time series of human-robot movement, within a certain spatial proximity, is correlated with changes in self-reported trust. This suggests that the interplay of proxemics and kinesics, i.e. moving together through space, where implicit communication via coordination can occur, could play a role in building and maintaining trust in human-robot teams. Thus, quantitative indicators of coordination dynamics between team members could be used to predict trust over time and also provide early warning signals of the need for timely trust repair if trust is damaged. Hence, we aim to develop the metrology of trust in mobile human-robot teams.
Active Listener: Continuous Generation of Listener's Head Motion Response in Dyadic Interactions
A key component of dyadic spoken interactions is the contextually relevant non-verbal gestures, such as head movements that reflect a listener's response to the interlocutor's speech. Although significant progress has been made in the context of generating co-speech gestures, generating a listener's response has remained a challenge. We introduce the task of generating continuous head motion response of a listener in response to the speaker's speech in real time. To this end, we propose a graph-based end-to-end crossmodal model that takes the interlocutor's speech audio as input and directly generates head pose angles (roll, pitch, yaw) of the listener in real time. Different from previous work, our approach is completely data-driven, and it neither requires manual annotations nor oversimplifies head motion to merely nods and shakes. Extensive evaluation on the dyadic interaction sessions on the IEMOCAP dataset shows that our model produces a low overall error (4.5 degrees) and a high frame rate, thereby indicating its deployability in real-world human-robot interaction systems. Our code is available at https://github.com/bigzen/Active-Listener
comment: 4+1 pages, 3 figures, 2 tables
Boosting Safe Human-Robot Collaboration Through Adaptive Collision Sensitivity ICRA 2025
What is considered safe for a robot operator during physical human-robot collaboration (HRC) is specified in corresponding HRC standards (e.g., the European ISO/TS 15066). The regime that allows collisions between the moving robot and the operator, called Power and Force Limiting (PFL), restricts the permissible contact forces. Using the same fixed contact thresholds on the entire robot surface results in significant and unnecessary productivity losses, as the robot needs to stop even when impact forces are within limits. Here we present a framework for setting the protective skin thresholds individually for different parts of the robot body and dynamically on the fly, based on the effective mass of each robot link and the link velocity. We perform experiments on a 6-axis collaborative robot arm (UR10e) completely covered with a sensitive skin (AIRSKIN) consisting of eleven individual pads. On a mock pick-and-place scenario with both transient and quasi-static collisions, we demonstrate how skin sensitivity influences the task performance and exerted force. We show an increase in productivity of almost 50% from the most conservative setting of collision thresholds to the most adaptive setting, while ensuring safety for human operators. The method is applicable to any robot for which the effective mass can be calculated.
comment: Submitted to ICRA 2025
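To make the link between effective mass, link velocity, and contact force in the abstract above concrete, here is a minimal sketch of the simplified two-body transient contact model commonly associated with ISO/TS 15066, in which the kinetic energy of the reduced mass is absorbed by a linear spring: F = v * sqrt(mu * k), with mu = (1/m_robot + 1/m_human)^-1. The abstract does not give the paper's actual threshold rule, so this is an assumption; the function names are hypothetical, and all numeric values in the usage note are placeholders that are neither taken from the paper nor suitable for safety decisions.

```python
import math

def reduced_mass(m_robot, m_human):
    """Reduced mass (kg) of the two-body contact model."""
    return 1.0 / (1.0 / m_robot + 1.0 / m_human)

def impact_force(m_robot, m_human, k, v):
    """Peak transient contact force (N) for relative speed v (m/s).

    Derived from 0.5 * mu * v**2 = F**2 / (2 * k), where k (N/m) is the
    assumed spring constant of the contacted body region.
    """
    return v * math.sqrt(reduced_mass(m_robot, m_human) * k)

def max_velocity(m_robot, m_human, k, f_max):
    """Largest relative speed (m/s) that keeps the peak force at or below f_max (N)."""
    return f_max / math.sqrt(reduced_mass(m_robot, m_human) * k)
```

For instance, calling `max_velocity(10.0, 40.0, 25000.0, 280.0)` and `max_velocity(2.0, 40.0, 25000.0, 280.0)` with these placeholder values shows that, in this model, a link with a smaller effective mass may move faster for the same force limit, which is the intuition behind setting thresholds per link.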
ILeSiA: Interactive Learning of Situational Awareness from Camera Input
Learning from demonstration is a promising way of teaching robots new skills. However, a central problem when executing acquired skills is to recognize risks and failures. This is essential since the demonstrations usually cover only a few mostly successful cases. Inevitable errors during execution require specific reactions that were not apparent in the demonstrations. In this paper, we focus on teaching the robot situational awareness from an initial skill demonstration via kinesthetic teaching and sparse labeling of autonomous skill executions as safe or risky. At runtime, our system, called ILeSiA, detects risks based on the perceived camera images by encoding the images into a low-dimensional latent space representation and training a classifier based on the encoding and the provided labels. In this way, ILeSiA boosts the confidence and safety with which robotic skills can be executed. Our experiments demonstrate that classifiers, trained with only a small amount of user-provided data, can successfully detect numerous risks. The system is flexible because the risk cases are defined by labeling data. This also means that labels can be added as soon as risks are identified by a human supervisor. We provide all code and data required to reproduce our experiments at imitrob.ciirc.cvut.cz/publications/ilesia.
comment: 7 pages, 8 figures
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
Robots' ability to follow language instructions and execute diverse 3D tasks is vital in robot learning. Traditional imitation learning-based methods perform well on seen tasks but struggle with novel, unseen ones due to variability. Recent approaches leverage large foundation models to assist in understanding novel tasks, thereby mitigating this issue. However, these methods lack a task-specific learning process, which is essential for an accurate understanding of 3D environments, often leading to execution failures. In this paper, we introduce GravMAD, a sub-goal-driven, language-conditioned action diffusion framework that combines the strengths of imitation learning and foundation models. Our approach breaks tasks into sub-goals based on language instructions, allowing auxiliary guidance during both training and inference. During training, we introduce Sub-goal Keypose Discovery to identify key sub-goals from demonstrations. Inference differs from training, as there are no demonstrations available, so we use pre-trained foundation models to bridge the gap and identify sub-goals for the current task. In both phases, GravMaps are generated from sub-goals, providing flexible 3D spatial guidance compared to fixed 3D positions. Empirical evaluations on RLBench show that GravMAD significantly outperforms state-of-the-art methods, with a 28.63% improvement on novel tasks and a 13.36% gain on tasks encountered during training. These results demonstrate GravMAD's strong multi-task learning and generalization in 3D manipulation. Video demonstrations are available at: https://gravmad.github.io.
comment: Under review
Robust Gaussian Splatting SLAM by Leveraging Loop Closure
3D Gaussian Splatting algorithms excel in novel view rendering applications and have been adapted to extend the capabilities of traditional SLAM systems. However, current Gaussian Splatting SLAM methods, designed mainly for hand-held RGB or RGB-D sensors, struggle with tracking drifts when used with rotating RGB-D camera setups. In this paper, we propose a robust Gaussian Splatting SLAM architecture that utilizes inputs from rotating multiple RGB-D cameras to achieve accurate localization and photorealistic rendering performance. The carefully designed Gaussian Splatting Loop Closure module effectively addresses the issue of accumulated tracking and mapping errors found in conventional Gaussian Splatting SLAM systems. First, each Gaussian is associated with an anchor frame and categorized as historical or novel based on its timestamp. By rendering different types of Gaussians at the same viewpoint, the proposed loop detection strategy considers both co-visibility relationships and distinct rendering outcomes. Furthermore, a loop closure optimization approach is proposed to remove camera pose drift and maintain the high quality of 3D Gaussian models. The approach uses a lightweight pose graph optimization algorithm to correct pose drift and updates Gaussians based on the optimized poses. Additionally, a bundle adjustment scheme further refines camera poses using photometric and geometric constraints, ultimately enhancing the global consistency of scenarios. Quantitative and qualitative evaluations on both synthetic and real-world datasets demonstrate that our method outperforms state-of-the-art methods in camera pose estimation and novel view rendering tasks. The code will be open-sourced for the community.
Robot Design Optimization with Rotational and Prismatic Joints using Black-Box Multi-Objective Optimization IROS2024
Robots generally have a structure that combines rotational joints and links in a serial fashion. On the other hand, various joint mechanisms are being utilized in practice, such as prismatic joints, closed links, and wire-driven systems. Previous research has focused on individual mechanisms, proposing methods to design robots capable of achieving given tasks by optimizing the length of links and the arrangement of the joints. In this study, we propose a method for the design optimization of robots that combine different types of joints, specifically rotational and prismatic joints. The objective is to automatically generate a robot that minimizes the number of joints and link lengths while accomplishing a desired task, by utilizing a black-box multi-objective optimization approach. This enables the simultaneous observation of a diverse range of body designs through the obtained Pareto solutions. Our findings confirm the emergence of practical and known combinations of rotational and prismatic joints, as well as the discovery of novel joint combinations.
comment: Accepted at IROS2024, website - https://haraduka.github.io/prismatic-joint-opt/
A Parallel-in-Time Newton's Method for Nonlinear Model Predictive Control
Model predictive control (MPC) is a powerful framework for optimal control of dynamical systems. However, MPC solvers suffer from a high computational burden that restricts their application to systems with low sampling frequency. This issue is further amplified in nonlinear and constrained systems that require nesting MPC solvers within iterative procedures. In this paper, we address these issues by developing parallel-in-time algorithms for constrained nonlinear optimization problems that take advantage of massively parallel hardware to achieve logarithmic computational time scaling over the planning horizon. We develop time-parallel second-order solvers based on interior point methods and the alternating direction method of multipliers, leveraging fast convergence and lower computational cost per iteration. The parallelization is based on a reformulation of the subproblems in terms of associative operations that can be parallelized using the associative scan algorithm. We validate our approach on numerical examples of nonlinear and constrained dynamical systems.
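The associative-scan idea mentioned in the abstract above can be illustrated on the simplest case, an affine recurrence x_{t+1} = A_t x_t + b_t. Each step is an affine map, composing affine maps is associative, and a prefix scan over the maps needs only about log2(T) rounds, each of which is parallel across time. This is a sketch of the scan mechanism only, not the paper's interior-point or ADMM solvers; `combine` and `affine_scan` are hypothetical names, and NumPy batching stands in for parallel hardware.

```python
import numpy as np

def combine(first, second):
    """Compose batched affine maps: apply `first`, then `second`.

    (A2, b2) after (A1, b1) is x -> A2 (A1 x + b1) + b2.
    """
    A1, b1 = first
    A2, b2 = second
    return A2 @ A1, np.einsum("tij,tj->ti", A2, b1) + b2

def affine_scan(A, b):
    """Inclusive prefix scan (Hillis-Steele style) over T affine maps.

    A has shape (T, n, n) and b has shape (T, n). Entry t of the result
    maps x_0 directly to x_{t+1}. The loop runs ceil(log2(T)) rounds.
    """
    A, b = A.copy(), b.copy()
    T = A.shape[0]
    d = 1
    while d < T:
        A_new, b_new = combine((A[:-d], b[:-d]), (A[d:], b[d:]))
        A[d:], b[d:] = A_new, b_new
        d *= 2
    return A, b

def rollout_from_scan(A, b, x0):
    """States x_1..x_T obtained from the scanned maps, shape (T, n)."""
    A_pref, b_pref = affine_scan(A, b)
    return A_pref @ x0 + b_pref
```

A sequential rollout visits the T steps one after another, whereas every round of the scan touches all time indices at once, which is what allows the logarithmic scaling over the horizon on massively parallel hardware.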
OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity
3D semantic occupancy prediction networks have demonstrated remarkable capabilities in reconstructing the geometric and semantic structure of 3D scenes, providing crucial information for robot navigation and autonomous driving systems. However, due to their large overhead from dense network structure designs, existing networks face challenges balancing accuracy and latency. In this paper, we introduce OccRWKV, an efficient semantic occupancy network inspired by Receptance Weighted Key Value (RWKV). OccRWKV separates semantics, occupancy prediction, and feature fusion into distinct branches, each incorporating Sem-RWKV and Geo-RWKV blocks. These blocks are designed to capture long-range dependencies, enabling the network to learn domain-specific representation (i.e., semantics and geometry), which enhances prediction accuracy. Leveraging the sparse nature of real-world 3D occupancy, we reduce computational overhead by projecting features into the bird's-eye view (BEV) space and propose a BEV-RWKV block for efficient feature enhancement and fusion. This enables real-time inference at 22.2 FPS without compromising performance. Experiments demonstrate that OccRWKV outperforms the state-of-the-art methods on the SemanticKITTI dataset, achieving a mIoU of 25.1 while being 20 times faster than the best baseline, Co-Occ, making it suitable for real-time deployment on robots to enhance autonomous navigation efficiency. Code and video are available on our project page: https://jmwang0117.github.io/OccRWKV/.
A Hybrid Model and Learning-Based Force Estimation Framework for Surgical Robots IROS 2024
Haptic feedback to the surgeon during robotic surgery would enable safer and more immersive surgeries but estimating tissue interaction forces at the tips of robotically controlled surgical instruments has proven challenging. Few existing surgical robots can measure interaction forces directly and the additional sensor may limit the life of instruments. We present a hybrid model and learning-based framework for force estimation for the Patient Side Manipulators (PSM) of a da Vinci Research Kit (dVRK). The model-based component identifies the dynamic parameters of the robot and estimates free-space joint torque, while the learning-based component compensates for environmental factors, such as the additional torque caused by trocar interaction between the PSM instrument and the patient's body wall. We evaluate our method in an abdominal phantom and achieve an error in force estimation of under 10% normalized root-mean-squared error. We show that by using a model-based method to perform dynamics identification, we reduce reliance on the training data covering the entire workspace. Although originally developed for the dVRK, the proposed method is a generalizable framework for other compliant surgical robots. The code is available at https://github.com/vu-maple-lab/dvrk_force_estimation.
comment: Accepted by IROS 2024
DynORecon: Dynamic Object Reconstruction for Navigation ICRA 2025
This paper presents DynORecon, a Dynamic Object Reconstruction system that leverages the information provided by Dynamic SLAM to simultaneously generate a volumetric map of observed moving entities while estimating free space to support navigation. By capitalising on the motion estimations provided by Dynamic SLAM, DynORecon continuously refines the representation of dynamic objects to eliminate residual artefacts from past observations and incrementally reconstructs each object, seamlessly integrating new observations to capture previously unseen structures. Our system is highly efficient (~20 FPS) and produces accurate (~10 cm) reconstructions of dynamic objects using simulated and real-world outdoor datasets.
comment: 7 pages, 6 figures, submitted to ICRA 2025
Playful DoggyBot: Learning Agile and Precise Quadrupedal Locomotion
Quadrupedal animals can perform tasks that are both agile and accurate: a trained dog can chase and catch a flying frisbee before it touches the ground; a cat alone at home can jump and grab the door handle accurately. However, agility and precision are usually a trade-off in robotics problems. Recent works in quadruped robots either focus on agile but not-so-accurate tasks, such as locomotion in challenging terrain, or accurate but not-so-fast tasks, such as using an additional manipulator to interact with objects. In this work, we aim at an accurate and agile task, catching a small object hanging above the robot. We mount a passive gripper in front of the robot chassis, so that the robot has to jump and catch the object with extreme precision. Our experiment shows that our system is able to jump and successfully catch the ball at 1.05m high in simulation and 0.8m high in the real world, while the robot is 0.3m high when standing.
A Robotic System for Precision Pollination in Apples: Design, Development and Field Evaluation
Global food production depends upon successful pollination, a process that relies on natural and managed pollinators. However, natural pollinators are declining due to different factors, including climate change, habitat loss, and pesticide use. Thus, developing alternative pollination methods is essential for sustainable crop production. This paper introduces a robotic system for precision pollination in apples, which are not self-pollinating and require precise delivery of pollen to the stigmatic surfaces of the flowers. The proposed robotic system consists of a machine vision system to identify target flowers and a mechatronic system with a 6-DOF UR5e robotic manipulator and an electrostatic sprayer. Field trials of this system in 'Honeycrisp' and 'Fuji' apple orchards have shown promising results, with the ability to pollinate flower clusters at an average spray cycle time of 6.5 seconds. The robotic pollination system has achieved encouraging fruit set and quality, comparable to naturally pollinated fruits in terms of color, weight, diameter, firmness, soluble solids, and starch content. However, the results for fruit set and quality varied between different apple cultivars and pollen concentrations. This study demonstrates the potential for a robotic artificial pollination system to be an efficient and sustainable method for commercial apple production. Further research is needed to refine the system and assess its suitability across diverse orchard environments and apple cultivars.
Towards Effective Utilization of Mixed-Quality Demonstrations in Robotic Manipulation via Segment-Level Selection and Optimization
Data is crucial for robotic manipulation, as it underpins the development of robotic systems for complex tasks. While high-quality, diverse datasets enhance the performance and adaptability of robotic manipulation policies, collecting extensive expert-level data is resource-intensive. Consequently, many current datasets suffer from quality inconsistencies due to operator variability, highlighting the need for methods to utilize mixed-quality data effectively. To mitigate these issues, we propose "Select Segments to Imitate" (S2I), a framework that selects and optimizes mixed-quality demonstration data at the segment level, while ensuring plug-and-play compatibility with existing robotic manipulation policies. The framework has three components: demonstration segmentation, which divides the original data into meaningful segments; segment selection, which uses contrastive learning to find high-quality segments; and trajectory optimization, which refines suboptimal segments for better policy learning. We evaluate S2I through comprehensive experiments in simulation and real-world environments across six tasks, demonstrating that with only 3 expert demonstrations for reference, S2I can improve the performance of various downstream policies when trained with mixed-quality demonstrations. Project website: https://tonyfang.net/s2i/.
comment: Project website: https://tonyfang.net/s2i/
WildFusion: Multimodal Implicit 3D Reconstructions in the Wild
We propose WildFusion, a novel approach for 3D scene reconstruction in unstructured, in-the-wild environments using multimodal implicit neural representations. WildFusion integrates signals from LiDAR, RGB camera, contact microphones, tactile sensors, and IMU. This multimodal fusion generates comprehensive, continuous environmental representations, including pixel-level geometry, color, semantics, and traversability. Through real-world experiments on legged robot navigation in challenging forest environments, WildFusion demonstrates improved route selection by accurately predicting traversability. Our results highlight its potential to advance robotic navigation and 3D mapping in complex outdoor terrains.
comment: Our project website is at: http://generalroboticslab.com/WildFusion
VAP: The Vulnerability-Adaptive Protection Paradigm Toward Reliable Autonomous Machines
The next ubiquitous computing platform, following personal computers and smartphones, is poised to be inherently autonomous, encompassing technologies like drones, robots, and self-driving cars. Ensuring reliability for these autonomous machines is critical. However, current resiliency solutions make fundamental trade-offs between reliability and cost, resulting in significant overhead in performance, energy consumption, and chip area. This is due to the "one-size-fits-all" approach commonly used, where the same protection scheme is applied throughout the entire software computing stack. This paper presents the key insight that to achieve high protection coverage with minimal cost, we must leverage the inherent variations in robustness across different layers of the autonomous machine software stack. Specifically, we demonstrate that various nodes in this complex stack exhibit different levels of robustness against hardware faults. Our findings reveal that the front-end of an autonomous machine's software stack tends to be more robust, whereas the back-end is generally more vulnerable. Building on these inherent robustness differences, we propose a Vulnerability-Adaptive Protection (VAP) design paradigm. In this paradigm, the allocation of protection resources - whether spatially (e.g., through modular redundancy) or temporally (e.g., via re-execution) - is made inversely proportional to the inherent robustness of tasks or algorithms within the autonomous machine system. Experimental results show that VAP provides high protection coverage while maintaining low overhead in both autonomous vehicle and drone systems.
comment: Communications of the ACM (CACM), Research and Advances, Vol 67, No.9, September 2024. ACM Link: https://dl.acm.org/doi/pdf/10.1145/3647638
Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems ICRA
This paper presents opt-in camera, a concept of privacy-preserving camera systems capable of recording only specific individuals in a crowd who explicitly consent to be recorded. Our system utilizes a mobile wireless communication tag attached to personal belongings as proof of opt-in and as a means of localizing tag carriers in video footage. Specifically, the on-ground positions of the wireless tag are first tracked over time using the unscented Kalman filter (UKF). The tag trajectory is then matched against visual tracking results for pedestrians found in videos to identify the tag carrier. Technically, we devise a dedicated trajectory matching technique based on constrained linear optimization, as well as a novel calibration technique that handles wireless tag-camera calibration and hyperparameter tuning for the UKF, which mitigates the non-line-of-sight (NLoS) issue in wireless localization. We realize the proposed opt-in camera system using ultra-wideband (UWB) devices and an off-the-shelf webcam installed in the environment. Experimental results demonstrate that our system can perform opt-in recording of individuals in near real-time at 10 fps, with reliable identification accuracy for a crowd of 8-23 people in a confined space.
comment: 7 pages, 6 figures, submitted to International Conference on Robotics and Automation (ICRA) 2025
Benchmarking Adaptive Intelligence and Computer Vision on Human-Robot Collaboration ICRA 2025
Human-Robot Collaboration (HRC) is vital in Industry 4.0, using sensors, digital twins, collaborative robots (cobots), and intention-recognition models to make manufacturing processes efficient. However, concept drift, where robots struggle to adapt to new environments, is a significant challenge. We address concept drift by integrating Adaptive Intelligence and self-labeling (SLB) to improve the resilience of intention recognition in an HRC system. Our methodology begins with data collection using cameras and weight sensors, followed by annotation of intentions and state changes. We then train various deep learning models with different preprocessing techniques to recognize and predict the intentions. Additionally, we developed a custom state detection algorithm to enhance the accuracy of SLB, offering precise state-change definitions and timestamps to label intentions. Our results show that the MViT2 model with skeletal posture preprocessing achieves an accuracy of 83% on our data environment, compared to the 79% accuracy of MViT2 without skeleton posture extraction. Additionally, our SLB mechanism achieves a labeling accuracy of 91%, saving a significant amount of time that would otherwise have been spent on manual annotation. Lastly, we observe swift scaling of model performance that combats concept drift when fine-tuning on different increments of self-labeled data in a shifted domain that has key differences from the original training environment. This study demonstrates the potential for rapid deployment of intelligent cobots in manufacturing through the steps shown in our methodology, paving the way for more adaptive and efficient HRC systems.
comment: 7 Pages, 9 Figures. 14 References. Submitted to IEEE RA-L Journal and ICRA 2025 Conference. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Enabling Multi-Robot Collaboration from Single-Human Guidance
Learning collaborative behaviors is essential for multi-agent systems. Traditionally, multi-agent reinforcement learning solves this implicitly through a joint reward and centralized observations, assuming collaborative behavior will emerge. Other studies propose to learn from demonstrations of a group of collaborative experts. Instead, we propose an efficient and explicit way of learning collaborative behaviors in multi-agent systems by leveraging expertise from only a single human. Our insight is that humans can naturally take on various roles in a team. We show that agents can effectively learn to collaborate by allowing a human operator to dynamically switch between controlling agents for a short period and incorporating a human-like theory-of-mind model of teammates. Our experiments showed that our method improves the success rate of a challenging collaborative hide-and-seek task by up to 58% with only 40 minutes of human guidance. We further demonstrate our findings transfer to the real world by conducting multi-robot experiments.
Embodied Visuomotor Representation
Suppose you are at your desk looking at some objects on it. You don't know the precise distance from your eye to any particular object in meters. However, you can immediately reach out and touch any of them. Instead of the meter, your knowledge of distance is encoded in unknown but embodied units of action. In contrast, standard approaches in robotics assume calibration to the meter, so that separated vision and control processes can be interfaced. Consequently, robots are precisely manufactured and calibrated, resulting in expensive systems available in only a few configurations. In response, we propose Embodied Visuomotor Representation, a framework that allows distance to be measured by a robot's own actions and thus minimizes dependence on calibrated 3D sensors and physical models. Using it, we demonstrate that a robot without knowledge of its size, environmental scale, or its own strength can become capable of touching and clearing obstacles after several seconds of operation. Similarly, we demonstrate in simulation that an agent, without knowledge of its mass or strength, can jump a gap of unknown size after performing a few test oscillations. These experiments parallel bee and gerbil behavior, respectively.
comment: 47 pages, 10 figures, 1 table, under review
Decentralized Input and State Estimation for Multi-agent System with Dynamic Topology and Heterogeneous Sensor Network
A crucial challenge in decentralized systems is state estimation in the presence of unknown inputs, particularly within heterogeneous sensor networks with dynamic topologies. While numerous consensus algorithms have been introduced, they often require extensive information exchange or multiple communication iterations to ensure estimation accuracy. This paper proposes an efficient algorithm that achieves an unbiased and optimal solution comparable to filters with full information about other agents. This is accomplished through the use of information filter decomposition and the fusion of inputs via covariance intersection. Our method requires only a single communication iteration for exchanging individual estimates between agents, instead of multiple rounds of information exchange, thus preserving agents' privacy by avoiding the sharing of explicit observations and system equations. Furthermore, to address the challenges posed by dynamic communication topologies, we propose two practical strategies to handle issues arising from intermittent observations and incomplete state estimation, thereby enhancing the robustness and accuracy of the estimation process. Experiments and ablation studies conducted in both stationary and dynamic environments demonstrate the superiority of our algorithm over other baselines. Notably, it performs as well as, or even better than, algorithms that have a global view of all neighbors.
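The abstract above names covariance intersection as its fusion step. As a rough illustration only, the following is a minimal textbook sketch of covariance intersection for two estimates with diagonal covariances, not the paper's algorithm. The function name, the diagonal-covariance restriction, the grid search over the weight, and the example numbers are all assumptions made for this sketch.

```python
def covariance_intersection(xa, pa, xb, pb, steps=100):
    """Fuse two estimates whose covariances are diagonal (lists of variances).

    Hypothetical helper for illustration: the weight w in [0, 1] is chosen by
    a simple grid search that minimizes the trace of the fused covariance.
    """
    best = None
    for s in range(steps + 1):
        w = s / steps
        # Information-form combination, applied independently per axis.
        p = [1.0 / (w / va + (1.0 - w) / vb) for va, vb in zip(pa, pb)]
        trace = sum(p)
        if best is None or trace < best[0]:
            x = [
                pi * (w * a / va + (1.0 - w) * b / vb)
                for pi, a, va, b, vb in zip(p, xa, pa, xb, pb)
            ]
            best = (trace, w, x, p)
    _, w, x, p = best
    return x, p, w


# Two agents, each confident about a different axis (made-up numbers).
x, p, w = covariance_intersection([1.0, 0.0], [1.0, 4.0], [2.0, 1.0], [4.0, 1.0])
print("fused mean:", x, "fused variances:", p, "weight:", w)
```

The appeal of this rule is that the fused covariance stays consistent even when the correlation between the two estimates is unknown, which is why it suits agents that exchange only their individual estimates.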
Object-Centric Kinodynamic Planning for Nonprehensile Robot Rearrangement Manipulation
Nonprehensile actions such as pushing are crucial for addressing multi-object rearrangement problems. To date, existing nonprehensile solutions are all robot-centric, i.e., the manipulation actions are generated with robot-relevant intent and their outcomes are passively evaluated afterwards. Such pipelines are very different from human strategies and are typically inefficient. To this end, this work proposes a novel object-centric planning paradigm and develops the first object-centric planner for general nonprehensile rearrangement problems. By assuming that each object can actively move without being driven by robot interactions, the object-centric planner focuses on planning desired object motions, which are realized via robot actions generated online via a closed-loop pushing strategy. Through extensive experiments and in comparison with state-of-the-art baselines in both simulation and on a physical robot, we show that our object-centric paradigm can generate more intuitive and task-effective robot actions with significantly improved efficiency. In addition, we propose a benchmarking protocol to standardize and facilitate future research in nonprehensile rearrangement.
Helpful DoggyBot: Open-World Object Fetching using Legged Robots and Vision-Language Models
Learning-based methods have achieved strong performance for quadrupedal locomotion. However, several challenges prevent quadrupeds from learning helpful indoor skills that require interaction with environments and humans: lack of end-effectors for manipulation, limited semantic understanding using only simulation data, and low traversability and reachability in indoor environments. We present a system for quadrupedal mobile manipulation in indoor environments. It uses a front-mounted gripper for object manipulation, a low-level controller trained in simulation using egocentric depth for agile skills like climbing and whole-body tilting, and pre-trained vision-language models (VLMs) with a third-person fisheye and an egocentric RGB camera for semantic understanding and command generation. We evaluate our system in two unseen environments without any real-world data collection or training. Our system can zero-shot generalize to these environments and complete tasks, such as following a user's command to fetch a randomly placed stuffed toy after climbing over a queen-sized bed, with a 60% success rate. Project website: https://helpful-doggybot.github.io/
comment: Project website: https://helpful-doggybot.github.io/
Micromanipulation System for Microscale Magnetic Component Alignment and Assembly
This paper presents a contact-based micromanipulation system for the alignment and installment of microscale magnets into micro robots and devices. Affixing tweezers to a three degree of freedom micromanipulator allows for precise movement of objects. The use of non-magnetic tweezers permits the assembly of magnetized robots, and a magnetic rotating stage allows multiple magnets to be installed into one device in different orientations. By re-orienting the tweezers on the micromanipulator at defined ninety-degree angles, it is possible to assemble a device with magnets oriented in any direction on XY, XZ, and YZ planes. This system is highly precise and flexible, and can be implemented with minimal custom-made parts, making it ideal for development of new magnetic technologies at the microscale.
comment: Included as a short paper in 2024 International Conference on Manipulation, Automation and Robotics at Small Scales
Constraining Gaussian Process Implicit Surfaces for Robot Manipulation via Dataset Refinement
Model-based control faces fundamental challenges in partially-observable environments due to unmodeled obstacles. We propose an online learning and optimization method to identify and avoid unobserved obstacles online. Our method, Constraint Obeying Gaussian Implicit Surfaces (COGIS), infers contact data using a combination of visual input and state tracking, informed by predictions from a nominal dynamics model. We then fit a Gaussian process implicit surface (GPIS) to these data and refine the dataset through a novel method of enforcing constraints on the estimated surface. This allows us to design a Model Predictive Control (MPC) method that leverages the obstacle estimate to complete multiple manipulation tasks. By modeling the environment instead of attempting to directly adapt the dynamics, our method succeeds at both low-dimensional peg-in-hole tasks and high-dimensional deformable object manipulation tasks. Our method succeeds in 10/10 trials vs 1/10 for a baseline on a real-world cable manipulation task under partial observability of the environment.
comment: Accepted to IEEE RA-L
Constraint-Aware Refinement for Safety Verification of Neural Feedback Loops
Neural networks (NNs) are becoming increasingly popular in the design of control pipelines for autonomous systems. However, since the performance of NNs can degrade in the presence of out-of-distribution data or adversarial attacks, systems that have NNs in their control pipelines, i.e., neural feedback loops (NFLs), need safety assurances before they can be applied in safety-critical situations. Reachability analysis offers a solution to this problem by calculating reachable sets that bound the possible future states of an NFL and can be checked against dangerous regions of the state space to verify that the system does not violate safety constraints. Since exact reachable sets are generally intractable to calculate, reachable set over approximations (RSOAs) are typically used. The problem with RSOAs is that they can be overly conservative, making it difficult to verify the satisfaction of safety constraints, especially over long time horizons or for highly nonlinear NN control policies. Refinement strategies such as partitioning or symbolic propagation are typically used to limit the conservativeness of RSOAs, but these approaches come with a high computational cost and often can only be used to verify safety for simple reachability problems. This paper presents Constraint-Aware Refinement for Verification (CARV): an efficient refinement strategy that reduces the conservativeness of RSOAs by explicitly using the safety constraints on the NFL to refine RSOAs only where necessary. We demonstrate that CARV can verify the safety of an NFL where other approaches either fail or take up to 60x longer and 40x the memory.
comment: 6 pages, 10 figures, submitted to L-CSS/ACC
Additively Manufactured Open-Source Quadruped Robots for Multi-Robot SLAM Applications
This work presents the design and development of the quadruped robot Squeaky to be used as a research and learning platform for single and multi-SLAM robotics, computer vision, and reinforcement learning. Affordable robots are becoming necessary when expanding from single to multi-robot applications, as the cost can increase exponentially as fleet size increases. SLAM is essential for a robot to perceive and localize within its environment to perform applications such as cave exploration, disaster assistance, and remote inspection. For improved efficiency, a fleet of robots can be employed to combine maps for multi-robot SLAM. Squeaky is an affordable quadrupedal robot, designed to have easily adaptable hardware and software, capable of creating a merged map under a shared network from multiple robots, and available open-source for the benefit of the research community.
Learning to Swim: Reinforcement Learning for 6-DOF Control of Thruster-driven Autonomous Underwater Vehicles
Controlling AUVs can be challenging because of the effect of complex non-linear hydrodynamic forces acting on the robot, which, unlike ground robots, are significant in water and cannot be ignored. The problem is especially challenging for small AUVs for which the dynamics can change significantly with payload changes and deployments under different water conditions. The common approach to AUV control is a combination of passive stabilization with added buoyancy on top and weights on the bottom, and a PID controller tuned for simple and smooth motion primitives. However, the approach comes at the cost of sluggish controls and often the need to re-tune controllers with configuration changes. We propose a fast (trainable in minutes), reinforcement learning based approach for full 6 degree of freedom (DOF) control of an AUV, enabled by a new, highly parallelized simulator for underwater vehicle dynamics. We demonstrate that the proposed simulator models approximate hydrodynamic forces with enough accuracy that a zero-shot transfer of the learned policy to a real robot produces performance comparable to a hand-tuned PID controller. Furthermore, we show that domain randomization on the simulator produces policies that are robust to small variations in vehicle's physical parameters.
An Overview of the Burer-Monteiro Method for Certifiable Robot Perception
This paper presents an overview of the Burer-Monteiro method (BM), a technique that has been applied to solve robot perception problems to certifiable optimality in real-time. BM is often used to solve semidefinite programming relaxations, which can be used to perform global optimization for non-convex perception problems. Specifically, BM leverages the low-rank structure of typical semidefinite programs to dramatically reduce the computational cost of performing optimization. This paper discusses BM in certifiable perception, with three main objectives: (i) to consolidate information from the literature into a unified presentation, (ii) to elucidate the role of the linear independence constraint qualification (LICQ), a concept not yet well-covered in certifiable perception literature, and (iii) to share practical considerations that are discussed among practitioners but not thoroughly covered in the literature. Our general aim is to offer a practical primer for applying BM towards certifiable perception.
comment: Accepted to 2024 Robotics: Science and Systems (RSS) Safe Autonomy Workshop
A study on the effects of mixed explicit and implicit communications in human-virtual-agent interactions
Communication between humans and robots (or virtual agents) is essential for interaction and often inspired by human communication, which uses gestures, facial expressions, gaze direction, and other explicit and implicit means. This work presents an interaction experiment where humans and virtual agents interact through explicit (gestures, manual entries using mouse and keyboard, voice, sound, and information on screen) and implicit (gaze direction, location, facial expressions, and raise of eyebrows) communication to evaluate the effect of mixed explicit-implicit communication against purely explicit communication. Results obtained using Bayesian parameter estimation show that the number of errors and task execution time did not significantly change when mixed explicit and implicit communications were used, nor did the perceived efficiency of the interaction. In contrast, acceptance, sociability, and transparency of the virtual agent increased when using mixed communication modalities (88.3%, 92%, and 92.9% of the effect size posterior distribution of each variable, respectively, were above the upper limit of the region of practical equivalence). This suggests that task-related measures, such as time, number of errors, and perceived efficiency of the interaction, were not influenced by the communication type in our particular experiment. However, the improvement of subjective measures related to the virtual agent, such as acceptance, sociability, and transparency, suggests that humans are more receptive to mixed explicit and implicit communications.
comment: Main paper with 22 pages, 12 figures, 4 tables. Added supplementary material with 17 pages, 16 figures. Submitted to International Journal of Social Robotics
LLM-Craft: Robotic Crafting of Elasto-Plastic Objects with Large Language Models
When humans create sculptures, we are able to reason about how geometrically we need to alter the clay state to reach our target goal. We are not computing point-wise similarity metrics, or reasoning about low-level positioning of our tools, but instead determining the higher-level changes that need to be made. In this work, we propose LLM-Craft, a novel pipeline that leverages large language models (LLMs) to iteratively reason about and generate deformation-based crafting action sequences. We simplify and couple the state and action representations to further encourage shape-based reasoning. To the best of our knowledge, LLM-Craft is the first system successfully leveraging LLMs for complex deformable object interactions. Through our experiments, we demonstrate that with the LLM-Craft framework, LLMs are able to successfully reason about the deformation behavior of elasto-plastic objects. Furthermore, we find that LLM-Craft is able to successfully create a set of simple letter shapes. Finally, we explore extending the framework to reaching more ambiguous semantic goals, such as "thinner" or "bumpy". For videos please see our website: https://sites.google.com/andrew.cmu.edu/llmcraft.
Di-NeRF: Distributed NeRF for Collaborative Learning with Relative Pose Refinement
Collaborative mapping of unknown environments can be faster and more robust than mapping with a single robot. However, a collaborative approach requires a distributed paradigm to be scalable and deal with communication issues. This work presents a fully distributed algorithm enabling a group of robots to collectively optimize the parameters of a Neural Radiance Field (NeRF). The algorithm involves the communication of each robot's trained NeRF parameters over a mesh network, where each robot trains its NeRF and has access to its own visual data only. Additionally, the relative poses of all robots are jointly optimized alongside the model parameters, enabling mapping with less accurate relative camera poses. We show that multi-robot systems can benefit from differentiable and robust 3D reconstruction optimized from multiple NeRFs. Experiments on real-world and synthetic data demonstrate the efficiency of the proposed algorithm. See the website of the project for videos of the experiments and supplementary material (https://sites.google.com/view/di-nerf/home).
comment: 9 pages, 11 figures, Accepted in IEEE-RA-L
HortiBot: An Adaptive Multi-Arm System for Robotic Horticulture of Sweet Peppers IROS
Horticultural tasks such as pruning and selective harvesting are labor intensive and horticultural staff are hard to find. Automating these tasks is challenging due to the semi-structured greenhouse workspaces, changing environmental conditions such as lighting, dense plant growth with many occlusions, and the need for gentle manipulation of non-rigid plant organs. In this work, we present the three-armed system HortiBot, with two arms for manipulation and a third arm as an articulated head for active perception using stereo cameras. Its perception system detects not only peppers, but also peduncles and stems in real time, and performs online data association to build a world model of pepper plants. Collision-aware online trajectory generation allows all three arms to safely track their respective targets for observation, grasping, and cutting. We integrated perception and manipulation to perform selective harvesting of peppers and evaluated the system in lab experiments. Using active perception coupled with end-effector force torque sensing for compliant manipulation, HortiBot achieves high success rates in our indoor pepper plant mock-up.
comment: Accepted for International Conference on Intelligent Robots and Systems (IROS) 2024. C. Lenz and R. Menon contributed equally
VLM-Auto: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes
Recent research on Large Language Models for autonomous driving shows promise in planning and control. However, high computational demands and hallucinations still challenge accurate trajectory prediction and control signal generation. Deterministic algorithms offer reliability but lack adaptability to complex driving scenarios and struggle with context and uncertainty. To address this problem, we propose VLM-Auto, a novel autonomous driving assistant system to empower the autonomous vehicles with adjustable driving behaviors based on the understanding of road scenes. A pipeline involving the CARLA simulator and Robot Operating System 2 (ROS2) verifying the effectiveness of our system is presented, utilizing a single Nvidia 4090 24G GPU while exploiting the capacity of textual output of the Visual Language Model (VLM). Besides, we also contribute a dataset containing an image set and a corresponding prompt set for fine-tuning the VLM module of our system. In CARLA experiments, our system achieved $97.82\%$ average precision on 5 types of labels in our dataset. In the real-world driving dataset, our system achieved $96.97\%$ prediction accuracy in night scenes and gloomy scenes. Our VLM-Auto dataset will be released at https://github.com/ZionGo6/Co-driver.
comment: The paper is accepted by the IEEE conference
Understanding cyclists' perception of driverless vehicles through eye-tracking and interviews
As automated vehicles (AVs) become increasingly popular, the question arises as to how cyclists will interact with such vehicles. This study investigated (1) whether cyclists spontaneously notice if a vehicle is driverless, (2) how well they perform a driver-detection task when explicitly instructed, and (3) how they carry out these tasks. Using a Wizard-of-Oz method, 37 participants cycled a designated route and encountered an AV multiple times in two experimental sessions. In Session 1, participants cycled the route uninstructed, while in Session 2, they were instructed to verbally report whether they detected the presence or absence of a driver. Additionally, we recorded participants' gaze behaviour with eye-tracking and their responses in post-session interviews. The interviews revealed that 30% of the cyclists spontaneously mentioned the absence of a driver (Session 1), and when instructed (Session 2), they detected the absence and presence of the driver with 93% accuracy. The eye-tracking data showed that cyclists looked more frequently and for longer at the vehicle in Session 2 compared to Session 1. Additionally, participants exhibited intermittent sampling of the vehicle, and they looked at the area in front of the vehicle when it was far away and towards the windshield region when it was closer. The post-session interviews also indicated that participants were curious, but felt safe, and reported a need to receive information about the AV's driving state. In conclusion, cyclists can detect the absence of a driver in the AV, and this detection may influence their perception of safety. Further research is needed to explore these findings in real-world traffic conditions.
Performance assessment of ADAS in a representative subset of critical traffic situations
As a variety of automated collision prevention systems gain presence within personal vehicles, rating and differentiating the automated safety performance of car models has become increasingly important for consumers, manufacturers, and insurers. In 2023, Swiss Re and partners initiated an eight-month long vehicle testing campaign conducted on a recognized UNECE type approval authority and Euro NCAP accredited proving ground in Germany. The campaign exposed twelve mass-produced vehicle models and one prototype vehicle fitted with collision prevention systems to a selection of safety-critical traffic scenarios representative of United States and European Union accident landscape. In this paper, we compare and evaluate the relative safety performance of these thirteen collision prevention systems (hardware and software stack) as demonstrated by this testing campaign. We first introduce a new scoring system which represents a test system's predicted impact on overall real-world collision frequency and reduction of collision impact energy, weighted based on the real-world relevance of the test scenario. Next, we introduce a novel metric that quantifies the realism of the protocol and confirm that our test protocol is a plausible representation of real-world driving. Finally, we find that the prototype system in its pre-release state outperforms the mass-produced (post-consumer-release) vehicles in the majority of the tested scenarios on the test track.
Efficient Path Planning in Large Unknown Environments with Switchable System Models for Automated Vehicles
Large environments are challenging for path planning algorithms as the size of the configuration space increases. Furthermore, if the environment is mainly unexplored, large parts of the path are planned through unknown areas. Hence, a complete replanning of the entire path occurs whenever the path collides with newly discovered obstacles. We propose a novel method that stops the path planning algorithm after a certain distance. It is used to navigate the algorithm in large environments and is not prone to problems of existing navigation approaches. Furthermore, we developed a method to detect significant environment changes to allow a more efficient replanning. Finally, we extend the path planner to be used in the U-Shift concept vehicle. It can switch to another system model and rotate around the center of its rear axis. The results show that the proposed methods generate nearly identical paths compared to the standard Hybrid A* while drastically reducing the execution time. Furthermore, we show that the extended path planning algorithm enables the efficient use of the maneuvering capabilities of the concept vehicle to plan concise paths in narrow environments.
Globally Optimal GNSS Multi-Antenna Lever Arm Calibration
Sensor calibration is crucial for autonomous driving, providing the basis for accurate localization and consistent data fusion. Enabling the use of high-accuracy GNSS sensors, this work focuses on the antenna lever arm calibration. We propose a globally optimal multi-antenna lever arm calibration approach based on motion measurements. For this, we derive an optimization method that further allows the integration of a-priori knowledge. Globally optimal solutions are obtained by leveraging the Lagrangian dual problem and a primal recovery strategy. Generally, motion-based calibration for autonomous vehicles is known to be difficult due to cars' predominantly planar motion. Therefore, we first describe the motion requirements for a unique solution and then propose a planar motion extension to overcome this issue and enable a calibration based on the restricted motion of autonomous vehicles. Finally, we present and discuss the results of our thorough evaluation. Using simulated and augmented real-world data, we achieve accurate calibration results and fast run times that allow online deployment.
Reflectivity Is All You Need!: Advancing LiDAR Semantic Segmentation
LiDAR semantic segmentation frameworks predominantly use geometry-based features to differentiate objects within a scan. Although these methods excel in scenarios with clear boundaries and distinct shapes, their performance declines in environments where boundaries are indistinct, particularly in off-road contexts. To address this issue, recent advances in 3D segmentation algorithms have aimed to leverage raw LiDAR intensity readings to improve prediction precision. However, despite these advances, existing learning-based models face challenges in linking the complex interactions between raw intensity and variables such as distance, incidence angle, material reflectivity, and atmospheric conditions. Building upon our previous work, this paper explores the advantages of employing calibrated intensity (also referred to as reflectivity) within learning-based LiDAR semantic segmentation frameworks. We start by demonstrating that adding reflectivity as input enhances the LiDAR semantic segmentation model by providing a better data representation. Extensive experimentation with the Rellis-3d off-road dataset shows that replacing intensity with reflectivity results in a 4% improvement in mean Intersection over Union (mIoU) for off-road scenarios. We demonstrate the potential benefits of using calibrated intensity for semantic segmentation in urban environments (SemanticKITTI) and for cross-sensor domain adaptation. Additionally, we tested the Segment Anything Model (SAM) using reflectivity as input, resulting in improved segmentation masks for LiDAR images.
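For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how it is commonly computed from per-point labels. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration; the paper's exact evaluation protocol is not stated in the abstract.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union over classes present in pred or target.

    pred, target: equal-length sequences of integer class labels.
    Classes that appear in neither sequence are skipped (an assumption).
    """
    ious = []
    for c in range(num_classes):
        # Points where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Points where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In this toy call, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.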
LTLDoG: Satisfying Temporally-Extended Symbolic Constraints for Safe Diffusion-based Planning
Operating effectively in complex environments while complying with specified constraints is crucial for the safe and successful deployment of robots that interact with and operate around people. In this work, we focus on generating long-horizon trajectories that adhere to novel static and temporally-extended constraints/instructions at test time. We propose a data-driven diffusion-based framework, LTLDoG, that modifies the inference steps of the reverse process given an instruction specified using finite linear temporal logic ($\text{LTL}_f$). LTLDoG leverages a satisfaction value function on $\text{LTL}_f$ and guides the sampling steps using its gradient field. This value function can also be trained to generalize to new instructions not observed during training, enabling flexible test-time adaptability. Experiments in robot navigation and manipulation illustrate that the method is able to generate trajectories that satisfy formulae that specify obstacle avoidance and visitation sequences. Code and supplementary material are available online at https://github.com/clear-nus/ltldog.
SAM: Semi-Active Mechanism for Extensible Continuum Manipulator and Real-time Hysteresis Compensation Control Algorithm
Cable-Driven Continuum Manipulators (CDCMs) enable scar-free procedures but face limitations in workspace and control accuracy due to hysteresis. We introduce an extensible CDCM with a Semi-active Mechanism (SAM) and develop a real-time hysteresis compensation control algorithm using a Temporal Convolutional Network (TCN) based on data collected from fiducial markers and RGBD sensing. Performance validation shows the proposed controller significantly reduces hysteresis by up to 69.5% in a random trajectory tracking test and approximately 26% in the box pointing task. The SAM enables access to various lesions without damaging surrounding tissues. The proposed controller with TCN-based compensation effectively predicts hysteresis behavior and minimizes position and joint angle errors in real time, which has the potential to enhance surgical task performance.
comment: 22 pages, 19 figures, 9 tables
ContactHandover: Contact-Guided Robot-to-Human Object Handover IROS 2024
Robot-to-human object handover is an important step in many human-robot collaboration tasks. A successful handover requires the robot to maintain a stable grasp on the object while making sure the human receives the object in a natural and easy-to-use manner. We propose ContactHandover, a robot-to-human handover system that consists of two phases: a contact-guided grasping phase and an object delivery phase. During the grasping phase, ContactHandover predicts both 6-DoF robot grasp poses and a 3D affordance map of human contact points on the object. The robot grasp poses are re-ranked by penalizing those that block human contact points, and the robot executes the highest-ranking grasp. During the delivery phase, the robot end effector pose is computed by maximizing human contact points close to the human while minimizing the human arm joint torques and displacements. We evaluate our system on 27 diverse household objects and show that our system achieves better visibility and reachability of human contacts to the receiver compared to several baselines. More results can be found at https://clairezixiwang.github.io/ContactHandover.github.io
comment: Accepted to IROS 2024. Project website: https://clairezixiwang.github.io/ContactHandover.github.io/
An Effectiveness Study Across Baseline and Neural Network-based Force Estimation Methods on the da Vinci Research Kit Si System
In this study, we further investigate the robustness and generalization ability of a neural network (NN) based force estimation method, using the da Vinci Research Kit Si (dVRK-Si). To evaluate our method's performance, we compare the force estimation accuracy with several baseline methods. We conduct comparative studies between the dVRK Classic and dVRK-Si systems to benchmark the effectiveness of these approaches. We conclude that the NN-based method provides comparable force estimation accuracy across the two systems, as the average root mean square error (RMSE) over the average range of force ratio is approximately 3.07% for the dVRK Classic, and 5.27% for the dVRK-Si. On the dVRK-Si, the force estimation RMSEs for all the baseline methods are 2 to 4 times larger than those of the NN-based method in all directions. One possible reason is that the baseline methods assume that static forces remain the same or that the dynamics are time-invariant. These assumptions may hold for the dVRK Classic, as it has pre-loaded weight and maintains horizontal self-balance. Since the dVRK-Si configuration does not have this property, the assumptions no longer hold, and the NN-based method therefore significantly outperforms the baselines.
comment: Accepted by the Hamlyn Symposium on Medical Robotics 2024
The Importance of Coordinate Frames in Dynamic SLAM ICRA 2024
Most simultaneous localisation and mapping (SLAM) systems have traditionally assumed a static world, which does not align with real-world scenarios. To enable robots to safely navigate and plan in dynamic environments, it is essential to employ representations capable of handling moving objects. Dynamic SLAM is an emerging field in SLAM research as it improves the overall system accuracy while providing additional estimation of object motions. The state-of-the-art literature offers two main formulations for Dynamic SLAM, representing dynamic object points in either the world or object coordinate frame. While expressing object points in a local reference frame may seem intuitive, it may not necessarily lead to the most accurate and robust solutions. This paper conducts and presents a thorough analysis of various Dynamic SLAM formulations, identifying the best approach to address the problem. To this end, we introduce a front-end agnostic framework using GTSAM that can be used to evaluate various Dynamic SLAM formulations.
comment: 7 pages, 4 figures, accepted by ICRA 2024
Safety-Critical Planning and Control for Dynamic Obstacle Avoidance Using Control Barrier Functions
Dynamic obstacle avoidance is a challenging topic for optimal control and optimization-based trajectory planning problems. Many existing works use Control Barrier Functions (CBFs) to enforce safety constraints for control systems. CBFs are typically formulated based on the distance to obstacles, or integrated with path planning algorithms as a safety enhancement tool. However, these approaches usually require knowledge of the obstacle boundary equations or have low computational efficiency. In this paper, we propose a framework based on model predictive control (MPC) with discrete-time high-order CBFs (DHOCBFs) to generate a collision-free trajectory. The DHOCBFs are first obtained from convex polytopes generated through grid mapping, without the need to know the boundary equations of obstacles. Additionally, a path planning algorithm is incorporated into this framework to ensure the global optimality of the generated trajectory. We demonstrate through numerical examples that our framework allows a unicycle robot to safely and efficiently navigate tight, dynamically changing environments with both convex and nonconvex obstacles. By comparing our method to established CBF-based benchmarks, we demonstrate superior computing efficiency, length optimality, and feasibility in trajectory generation and obstacle avoidance.
comment: 8 pages, 6 figures. arXiv admin note: text overlap with arXiv:2210.04361
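As background for the abstract above, a commonly used first-order discrete-time CBF condition can be written as follows. This is a sketch of the standard textbook form, not necessarily the exact high-order constraint used in the paper; the symbols $h$ (a function whose non-negative values define the safe set), $x_k$ (the state at step $k$), and $\gamma$ (a design parameter) are assumptions introduced here for illustration.

```latex
% h(x) >= 0 defines the safe set; 0 < \gamma \le 1 is a design parameter.
h(x_{k+1}) - h(x_k) \ge -\gamma\, h(x_k)
\quad\Longrightarrow\quad
h(x_{k+1}) \ge (1-\gamma)\, h(x_k) \ge 0
\quad \text{whenever } h(x_k) \ge 0 .
```

Imposing the left-hand inequality as a constraint at every step of the MPC horizon keeps a state that starts in the safe set inside it, since $1-\gamma \ge 0$.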
RL + Model-based Control: Using On-demand Optimal Control to Learn Versatile Legged Locomotion
This paper presents a control framework that combines model-based optimal control and reinforcement learning (RL) to achieve versatile and robust legged locomotion. Our approach enhances the RL training process by incorporating on-demand reference motions generated through finite-horizon optimal control, covering a broad range of velocities and gaits. These reference motions serve as targets for the RL policy to imitate, leading to the development of robust control policies that can be learned with reliability. Furthermore, by utilizing realistic simulation data that captures whole-body dynamics, RL effectively overcomes the inherent limitations in reference motions imposed by modeling simplifications. We validate the robustness and controllability of the RL training process within our framework through a series of experiments. In these experiments, our method showcases its capability to generalize reference motions and effectively handle more complex locomotion tasks that may pose challenges for the simplified model, thanks to RL's flexibility. Additionally, our framework effortlessly supports the training of control policies for robots with diverse dimensions, eliminating the necessity for robot-specific adjustments in the reward function and hyperparameters.
comment: The paper has been accepted for publication in IEEE Robotics and Automation Letters (RA-L). You can find the copyright information on the front page of the paper. The supplementary video is available at https://www.youtube.com/watch?v=qPttVfzGS84
Roadmaps with Gaps over Controllers: Achieving Efficiency in Planning under Dynamics IROS
This paper aims to improve the computational efficiency of motion planning for mobile robots with non-trivial dynamics through the use of learned controllers. Offline, a system-specific controller is first trained in an empty environment. Then, for the target environment, the approach constructs a data structure, a "Roadmap with Gaps," to approximately learn how to solve planning queries using the learned controller. The roadmap nodes correspond to local regions. Edges correspond to applications of the learned controller that approximately connect these regions. Gaps arise as the controller does not perfectly connect pairs of individual states along edges. Online, given a query, a tree sampling-based motion planner uses the roadmap so that the tree's expansion is informed towards the goal region. The tree expansion selects local subgoals given a wavefront on the roadmap that guides towards the goal. When the controller cannot reach a subgoal region, the planner resorts to random exploration to maintain probabilistic completeness and asymptotic optimality. The accompanying experimental evaluation shows that the approach significantly improves the computational efficiency of motion planning on various benchmarks, including physics-based vehicular models on uneven and varying friction terrains as well as a quadrotor under air pressure effects.
comment: To be presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024. Website: https://prx-kinodynamic.github.io/projects/rogue
Hybrid Video Anomaly Detection for Anomalous Scenarios in Autonomous Driving BMVC 2024
In autonomous driving, the most challenging scenarios can only be detected within their temporal context. Most video anomaly detection approaches focus either on surveillance or traffic accidents, which are only a subfield of autonomous driving. We present HF$^2$-VAD$_{AD}$, a variation of the HF$^2$-VAD surveillance video anomaly detection method for autonomous driving. We learn a representation of normality from a vehicle's ego perspective and evaluate pixel-wise anomaly detections in rare and critical scenarios.
comment: Daniel Bogdoll and Jan Imhof contributed equally. Accepted for publication at BMVC 2024 RROW workshop
UMAD: Unsupervised Mask-Level Anomaly Detection for Autonomous Driving BMVC 2024
Dealing with atypical traffic scenarios remains a challenging task in autonomous driving. However, most anomaly detection approaches cannot be trained on raw sensor data but require exposure to outlier data and powerful semantic segmentation models trained in a supervised fashion. This limits the representation of normality to labeled data, which does not scale well. In this work, we revisit unsupervised anomaly detection and present UMAD, leveraging generative world models and unsupervised image segmentation. Our method outperforms state-of-the-art unsupervised anomaly detection.
comment: Daniel Bogdoll and Noël Ollick contributed equally. Accepted for publication at BMVC 2024 RROW workshop
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
In recent years, the Robotics field has initiated several efforts toward building generalist robot policies through large-scale multi-task Behavior Cloning. However, direct deployments of these policies have led to unsatisfactory performance, where the policy struggles with unseen states and tasks. How can we break through the performance plateau of these models and elevate their capabilities to new heights? In this paper, we propose FLaRe, a large-scale Reinforcement Learning fine-tuning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques. Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance both on previously demonstrated and on entirely novel tasks and embodiments. Specifically, on a set of long-horizon mobile manipulation tasks, FLaRe achieves an average success rate of 79.5% in unseen environments, with absolute improvements of +23.6% in simulation and +30.7% on real robots over prior SoTA methods. By utilizing only sparse rewards, our approach can enable generalizing to new capabilities beyond the pretraining data with minimal human effort. Moreover, we demonstrate rapid adaptation to new embodiments and behaviors with less than a day of fine-tuning. Videos can be found on the project website at https://robot-flare.github.io/
Two Results on LPT: A Near-Linear Time Algorithm and Parcel Delivery using Drones
The focus of this paper is to increase our understanding of the Longest Processing Time First (LPT) heuristic. LPT is a classical heuristic for the fundamental problem of uniform machine scheduling. For different machine speeds, LPT was first considered by Gonzalez et al (SIAM J. Computing, 1977). Since then, extensive work has been done to improve the approximation factor of the LPT heuristic. However, all known implementations of the LPT heuristic take $O(mn)$ time, where $m$ is the number of machines and $n$ is the number of jobs. In this work, we come up with the first near-linear time implementation for LPT. Specifically, the running time is $O((n+m)(\log^2{m}+\log{n}))$. Somewhat surprisingly, the result is obtained by mapping the problem to dynamic maintenance of lower envelope of lines, which has been well studied in the computational geometry community. Our second contribution is to analyze the performance of LPT for the Drones Warehouse Problem (DWP), which is a natural generalization of the uniform machine scheduling problem motivated by drone-based parcel delivery from a warehouse. In this problem, a warehouse has multiple drones and wants to deliver parcels to several customers. Each drone picks a parcel from the warehouse, delivers it, and returns to the warehouse (where it can also get charged). The speeds and battery lives of the drones could be different, and due to the limited battery life, each drone has a bounded range in which it can deliver parcels. The goal is to assign parcels to the drones so that the time taken to deliver all the parcels is minimized. We prove that the natural approach of solving this problem via the LPT heuristic has an approximation factor of $\phi$, where $\phi \approx 1.62$ is the golden ratio.
comment: To appear in FSTTCS 2024
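To make the LPT heuristic discussed above concrete, here is a minimal sketch of the straightforward $O(mn)$ version for uniform machines: jobs are taken in decreasing order of size and each is placed on the machine where it would finish earliest. This is the naive baseline, not the near-linear implementation the paper contributes; the function name `lpt_uniform` and the tie-breaking by lowest machine index are assumptions for illustration.

```python
def lpt_uniform(jobs, speeds):
    """Naive LPT on uniform machines (O(m*n) after sorting).

    jobs:   list of job sizes (processing requirements).
    speeds: list of machine speeds; a job of size p takes p / speed.
    Returns (assignment, makespan), where assignment[i] lists the job
    sizes placed on machine i.
    """
    loads = [0.0] * len(speeds)           # total job size per machine
    assignment = [[] for _ in speeds]
    for p in sorted(jobs, reverse=True):  # longest processing time first
        # Pick the machine on which this job would complete earliest;
        # ties go to the lowest index (an arbitrary choice).
        i = min(range(len(speeds)), key=lambda k: (loads[k] + p) / speeds[k])
        loads[i] += p
        assignment[i].append(p)
    makespan = max(load / s for load, s in zip(loads, speeds))
    return assignment, makespan


if __name__ == "__main__":
    print(lpt_uniform([5, 4, 3, 2, 1], [2, 1]))
```

The inner `min` scans all $m$ machines for each of the $n$ jobs, which is where the $O(mn)$ cost mentioned in the abstract comes from.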
CROSS-GAiT: Cross-Attention-Based Multimodal Representation Fusion for Parametric Gait Adaptation in Complex Terrains
We present CROSS-GAiT, a novel algorithm for quadruped robots that uses Cross Attention to fuse terrain representations derived from visual and time-series inputs, including linear accelerations, angular velocities, and joint efforts. These fused representations are used to adjust the robot's step height and hip splay, enabling adaptive gaits that respond dynamically to varying terrain conditions. We generate these terrain representations by processing visual inputs through a masked Vision Transformer (ViT) encoder and time-series data through a dilated causal convolutional encoder. The cross-attention mechanism then selects and integrates the most relevant features from each modality, combining terrain characteristics with robot dynamics for better-informed gait adjustments. CROSS-GAiT uses the combined representation to dynamically adjust gait parameters in response to varying and unpredictable terrains. We train CROSS-GAiT on data from diverse terrains, including asphalt, concrete, brick pavements, grass, dense vegetation, pebbles, gravel, and sand. Our algorithm generalizes well and adapts to unseen environmental conditions, enhancing real-time navigation performance. CROSS-GAiT was implemented on a Ghost Robotics Vision 60 robot and extensively tested in complex terrains with high vegetation density, uneven/unstable surfaces, sand banks, deformable substrates, etc. We observe at least a 7.04% reduction in IMU energy density and a 27.3% reduction in total joint effort, which directly correlates with increased stability and reduced energy usage when compared to state-of-the-art methods. Furthermore, CROSS-GAiT demonstrates at least a 64.5% increase in success rate and a 4.91% reduction in time to reach the goal in four complex scenarios. Additionally, the learned representations perform 4.48% better than the state-of-the-art on a terrain classification task.
DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models
Perception systems play a crucial role in autonomous driving, incorporating multiple sensors and corresponding computer vision algorithms. 3D LiDAR sensors are widely used to capture sparse point clouds of the vehicle's surroundings. However, such systems struggle to perceive occluded areas and gaps in the scene due to the sparsity of these point clouds and their lack of semantics. To address these challenges, Semantic Scene Completion (SSC) jointly predicts unobserved geometry and semantics in the scene given raw LiDAR measurements, aiming for a more complete scene representation. Building on promising results of diffusion models in image generation and super-resolution tasks, we propose their extension to SSC by implementing the noising and denoising diffusion processes in the point and semantic spaces individually. To control the generation, we employ semantic LiDAR point clouds as conditional input and design local and global regularization losses to stabilize the denoising process. We evaluate our approach on autonomous driving datasets, where it outperforms the state of the art for SSC.
comment: Under review
Multiagent Systems
LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner
Language models (LMs) possess a strong capability to comprehend natural language, making them effective in translating human instructions into detailed plans for simple robot tasks. Nevertheless, it remains a significant challenge to handle long-horizon tasks, especially in subtask identification and allocation for cooperative heterogeneous robot teams. To address this issue, we propose a Language Model-Driven Multi-Agent PDDL Planner (LaMMA-P), a novel multi-agent task planning framework that achieves state-of-the-art performance on long-horizon tasks. LaMMA-P integrates the strengths of the LMs' reasoning capability and the traditional heuristic search planner to achieve a high success rate and efficiency while demonstrating strong generalization across tasks. Additionally, we create MAT-THOR, a comprehensive benchmark that features household tasks with two different levels of complexity based on the AI2-THOR environment. The experimental results demonstrate that LaMMA-P achieves a 105% higher success rate and 36% higher efficiency than existing LM-based multi-agent planners. The experimental videos, code, and datasets of this work as well as the detailed prompts used in each module are available at https://lamma-p.github.io.
comment: Project website: https://lamma-p.github.io/
MARLadona -- Towards Cooperative Team Play Using Multi-Agent Reinforcement Learning
Robot soccer, in its full complexity, poses an unsolved research challenge. Current solutions heavily rely on engineered heuristic strategies, which lack robustness and adaptability. Deep reinforcement learning has gained significant traction in various complex robotics tasks such as locomotion, manipulation, and competitive games (e.g., AlphaZero, OpenAI Five), making it a promising solution to the robot soccer problem. This paper introduces MARLadona, a decentralized multi-agent reinforcement learning (MARL) training pipeline capable of producing agents with sophisticated team play behavior, bridging the shortcomings of heuristic methods. Further, we created an open-source multi-agent soccer environment based on Isaac Gym. Utilizing our MARL framework and a modified global entity encoder as our core architecture, our approach achieves a 66.8% win rate against the HELIOS agent, which employs a state-of-the-art heuristic strategy. Furthermore, we provide an in-depth analysis of the policy behavior and interpret the agent's intention using the critic network.
Can We Break the Curse of Multiagency in Robust Multi-Agent Reinforcement Learning?
Standard multi-agent reinforcement learning (MARL) algorithms are vulnerable to sim-to-real gaps. To address this, distributionally robust Markov games (RMGs) have been proposed to enhance robustness in MARL by optimizing the worst-case performance when game dynamics shift within a prescribed uncertainty set. Solving RMGs remains under-explored, from problem formulation to the development of sample-efficient algorithms. A notorious yet open challenge is whether RMGs can escape the curse of multiagency, where the sample complexity scales exponentially with the number of agents. In this work, we propose a natural class of RMGs where the uncertainty set of each agent is shaped by both the environment and other agents' strategies in a best-response manner. We first establish the well-posedness of these RMGs by proving the existence of game-theoretic solutions such as robust Nash equilibria and coarse correlated equilibria (CCE). Assuming access to a generative model, we then introduce a sample-efficient algorithm for learning the CCE whose sample complexity scales polynomially with all relevant parameters. To the best of our knowledge, this is the first algorithm to break the curse of multiagency for RMGs.
Fuel tax loss in a world of electric mobility: A window of opportunity for congestion pricing
The continued transition towards electric mobility will decrease energy tax revenues worldwide, which has substantial implications for government funds. At the same time, demand for transportation is ever increasing, which in turn increases congestion problems. Combining both challenges, this paper assesses the effectiveness of congestion pricing as a sustainable revenue stream to offset fuel tax loss in 2030 while simultaneously enhancing efficiency in the transport sector. A congestion-based toll that is road-and-time-variant is simulated for the greater Berlin area in Germany using the multi-agent transport simulation (MATSim) software. Through the simulation results, this paper quantifies the impacts of the toll on governmental revenue, traffic management, the environment, social welfare, and the distribution effects. We find that the revenue from congestion tolls in a metropolitan area can compensate for the reduction in passenger car fuel tax. Furthermore, a remarkable welfare surplus is observed. The toll also successfully incentivises transport users to adjust their travel behaviour, which reduces traffic delay time by 28%. CO2 emissions, a key metric for decarbonisation of the transport sector, decrease by more than 5%. The analysis of the distribution effects suggests that a redistribution plan with a focus on the middle-low-income residents and the outer boroughs could help the policy gain more public acceptance.
comment: A part of this work has been presented in the International Conference on Operations Research OR2024
Variational Auto-encoder Based Solutions to Interactive Dynamic Influence Diagrams
Addressing multiagent decision problems in AI, especially those involving collaborative or competitive agents acting concurrently in a partially observable and stochastic environment, remains a formidable challenge. While Interactive Dynamic Influence Diagrams (I-DIDs) have offered a promising decision framework for such problems, they encounter limitations when the subject agent encounters unknown behaviors exhibited by other agents that are not explicitly modeled within the I-DID. This can lead to sub-optimal responses from the subject agent. In this paper, we propose a novel data-driven approach that utilizes an encoder-decoder architecture, particularly a variational autoencoder, to enhance I-DID solutions. By integrating a perplexity-based tree loss function into the optimization algorithm of the variational autoencoder, coupled with the advantages of Zig-Zag One-Hot encoding and decoding, we generate potential behaviors of other agents within the I-DID that are more likely to contain their true behaviors, even from limited interactions. This new approach enables the subject agent to respond more appropriately to unknown behaviors, thus improving its decision quality. We empirically demonstrate the effectiveness of the proposed approach in two well-established problem domains, highlighting its potential for handling multi-agent decision problems with unknown behaviors. This work is the first to use neural-network-based approaches to deal with the I-DID challenge in agent planning and learning problems.
Classification with a Network of Partially Informative Agents: Enabling Wise Crowds from Individually Myopic Classifiers
We consider the problem of classification with a (peer-to-peer) network of heterogeneous and partially informative agents, each receiving local data generated by an underlying true class, and equipped with a classifier that can only distinguish between a subset of the entire set of classes. We propose an iterative algorithm that uses the posterior probabilities of the local classifier and recursively updates each agent's local belief on all the possible classes, based on its local signals and belief information from its neighbors. We then adopt a novel distributed min-rule to update each agent's global belief and enable learning of the true class for all agents. We show that under certain assumptions, the beliefs on the true class converge to one asymptotically almost surely. We provide the asymptotic convergence rate, and demonstrate the performance of our algorithm through simulations with image data and experiments with random forest classifiers and MobileNet.
comment: 12 pages, 15 figures, 60th Annual Allerton Conference on Communication, Control, and Computing
Enabling Multi-Robot Collaboration from Single-Human Guidance
Learning collaborative behaviors is essential for multi-agent systems. Traditionally, multi-agent reinforcement learning solves this implicitly through a joint reward and centralized observations, assuming collaborative behavior will emerge. Other studies propose to learn from demonstrations of a group of collaborative experts. Instead, we propose an efficient and explicit way of learning collaborative behaviors in multi-agent systems by leveraging expertise from only a single human. Our insight is that humans can naturally take on various roles in a team. We show that agents can effectively learn to collaborate by allowing a human operator to dynamically switch between controlling agents for a short period and incorporating a human-like theory-of-mind model of teammates. Our experiments showed that our method improves the success rate of a challenging collaborative hide-and-seek task by up to 58% with only 40 minutes of human guidance. We further demonstrate our findings transfer to the real world by conducting multi-robot experiments.
Decentralized Input and State Estimation for Multi-agent System with Dynamic Topology and Heterogeneous Sensor Network
A crucial challenge in decentralized systems is state estimation in the presence of unknown inputs, particularly within heterogeneous sensor networks with dynamic topologies. While numerous consensus algorithms have been introduced, they often require extensive information exchange or multiple communication iterations to ensure estimation accuracy. This paper proposes an efficient algorithm that achieves an unbiased and optimal solution comparable to filters with full information about other agents. This is accomplished through the use of information filter decomposition and the fusion of inputs via covariance intersection. Our method requires only a single communication iteration for exchanging individual estimates between agents, instead of multiple rounds of information exchange, thus preserving agents' privacy by avoiding the sharing of explicit observations and system equations. Furthermore, to address the challenges posed by dynamic communication topologies, we propose two practical strategies to handle issues arising from intermittent observations and incomplete state estimation, thereby enhancing the robustness and accuracy of the estimation process. Experiments and ablation studies conducted in both stationary and dynamic environments demonstrate the superiority of our algorithm over other baselines. Notably, it performs as well as, or even better than, algorithms that have a global view of all neighbors.
The Patterns of Life Human Mobility Simulation SP
We demonstrate the Patterns of Life Simulation to create realistic simulations of human mobility in a city. This simulation has recently been used to generate massive amounts of trajectory and check-in data. Our demonstration focuses on using the simulation in two ways: (1) using the graphical user interface (GUI), and (2) running the simulation headless by disabling the GUI for faster data generation. We further demonstrate how the Patterns of Life simulation can be used to simulate any region on Earth by using publicly available data from OpenStreetMap. Finally, we also demonstrate recent improvements to the scalability of the simulation that allow simulating up to 100,000 individual agents for years of simulation time. During our demonstration, as well as offline using our guides on GitHub, participants will learn: (1) the theories of human behavior driving the Patterns of Life simulation, (2) how to run the simulation to generate massive amounts of synthetic yet realistic trajectory data, (3) how to run the simulation for a region of interest chosen by participants using OpenStreetMap (OSM) data, (4) the scalability of the simulation and the properties of the generated data, and (5) how to manage thousands of parallel simulation instances running concurrently.
comment: Accepted paper to SIGSPATIAL 2024 main conference
From homeostasis to resource sharing: Biologically and economically compatible multi-objective multi-agent AI safety benchmarks
Developing safe agentic AI systems benefits from automated empirical testing that conforms with human values, a subfield that is largely underdeveloped at the moment. To contribute towards this topic, the present work focuses on introducing biologically and economically motivated themes that have been neglected in the safety aspects of modern reinforcement learning literature, namely homeostasis, balancing multiple objectives, bounded objectives, diminishing returns, sustainability, and multi-agent resource sharing. We implemented eight main benchmark environments on the above themes to illustrate the potential shortcomings of current mainstream discussions on AI safety.
comment: 18 pages, 14 figures, 1 table
Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
Agents, as user-centric tools, are increasingly deployed for human task delegation, assisting with a broad spectrum of requests by generating thoughts, engaging with user proxies, and producing action plans. However, agents based on large language models (LLMs) often face substantial planning latency due to two primary factors: the efficiency limitations of the underlying LLMs due to their large size and high demand, and the structural complexity of the agents due to the extensive generation of intermediate thoughts to produce the final output. Given that inefficiency in service provision can undermine the value of automation for users, this paper presents a human-centered efficient agent planning method -- Interactive Speculative Planning -- aiming at enhancing the efficiency of agent planning through both system design and human-AI interaction. Our approach advocates for the co-design of the agent system and user interface, underscoring the importance of an agent system that can fluidly manage user interactions and interruptions. By integrating human interruptions as a fundamental component of the system, we not only make it more user-centric but also expedite the entire process by leveraging human-in-the-loop interactions to provide accurate intermediate steps. Code and data will be released.
comment: 27 pages, 22 figures
A Hypergraph Approach to Distributed Broadcast
This paper explores the distributed broadcast problem within the context of network communications, a critical challenge in decentralized information dissemination. We put forth a novel hypergraph-based approach to address this issue, focusing on minimizing the number of broadcasts to ensure comprehensive data sharing among all network users. The key contributions of this work include the establishment of a general lower bound for the problem using the min-cut capacity of hypergraphs, and a distributed broadcast for quasi-trees (DBQT) algorithm tailored for the unique structure of quasi-trees, which is proven to be optimal. This paper advances both network communication strategies and hypergraph theory, with implications for a wide range of real-world applications, from vehicular and sensor networks to distributed storage systems.
Enhancing Automotive User Experience with Dynamic Service Orchestration for Software Defined Vehicles
With the increasing demand for dynamic behaviors in automotive use cases, Software Defined Vehicles (SDVs) have emerged as a promising solution by bringing dynamic onboard service management capabilities. While users may request a wide range of services during vehicle operation, background tasks such as cooperative Vehicle-to-Everything (V2X) services can activate on-the-fly in response to real-time road conditions. In this dynamic environment, the efficient allocation of onboard resources becomes a complex challenge, in order to meet mixed-criticality onboard Quality-of-Service (QoS) network requirements while ensuring an optimal user experience. Additionally, the ever-evolving real-time network connectivity and computational availability conditions further complicate the process. In this context, we present a dynamic resource-based onboard service orchestration algorithm that considers real-time in-vehicle and V2X network health, along with onboard resource constraints, to select degraded modes for onboard applications and maximize user experience. To enable dynamic orchestration, we introduce the concept of Automotive eXperience Integrity Level (AXIL) which expresses a runtime priority for non-safety-critical applications. This algorithm produces near-optimal solutions while significantly reducing execution time compared to straightforward methods as demonstrated by simulation results. With this approach, we aim to enable efficient onboard execution for a user experience-focused service orchestration.
comment: Preprint for submission at IEEE Transactions on Intelligent Transportation Systems
Systems and Control (CS)
Continuously Improving Mobile Manipulation with Autonomous Real-World RL
We present a fully autonomous real-world RL framework for mobile manipulation that can learn policies without extensive instrumentation or human supervision. This is enabled by 1) task-relevant autonomy, which guides exploration towards object interactions and prevents stagnation near goal states, 2) efficient policy learning by leveraging basic task knowledge in behavior priors, and 3) formulating generic rewards that combine human-interpretable semantic information with low-level, fine-grained observations. We demonstrate that our approach allows Spot robots to continually improve their performance on a set of four challenging mobile manipulation tasks, obtaining an average success rate of 80% across tasks, a 3-4× improvement over existing approaches. Videos can be found at https://continual-mobile-manip.github.io/
comment: CoRL 2024. Website at https://continual-mobile-manip.github.io/
Visual collective behaviors on spherical robots
Implementations of collective motion have traditionally disregarded the limited sensing capabilities of an individual, instead assuming an omniscient perception of the environment. This study implements a visual flocking model in a "robot-in-the-loop" approach to reproduce these behaviors with a flock composed of 10 independent spherical robots. The model achieves robotic collective motion by only using panoramic visual information of each robot, such as retinal position, optical size and optic flow of the neighboring robots. We introduce a virtual anchor to confine the collective robotic movements so as to avoid wall interactions. For the first time, a simple visual robot-in-the-loop approach succeeds in reproducing several collective motion phases, in particular, swarming and milling. Another milestone achieved by this model is bridging the gap between simulation and physical experiments by demonstrating nearly identical behaviors in both environments with the same visual model. To conclude, we show that our minimal visual collective motion model is sufficient to recreate most collective behaviors on a robot-in-the-loop system that is scalable, behaves as numerical simulations predict and is easily comparable to traditional models.
comment: 26 pages, 16 figures, journal bioinspired and biomimetics
Formally Verified Physics-Informed Neural Control Lyapunov Functions
Control Lyapunov functions are a central tool in the design and analysis of stabilizing controllers for nonlinear systems. Constructing such functions, however, remains a significant challenge. In this paper, we investigate physics-informed learning and formal verification of neural network control Lyapunov functions. These neural networks solve a transformed Hamilton-Jacobi-Bellman equation, augmented by data generated using Pontryagin's maximum principle. Similar to how Zubov's equation characterizes the domain of attraction for autonomous systems, this equation characterizes the null-controllability set of a controlled system. This principled learning of neural network control Lyapunov functions outperforms alternative approaches, such as sum-of-squares and rational control Lyapunov functions, as demonstrated by numerical examples. As an intermediate step, we also present results on the formal verification of quadratic control Lyapunov functions, which, aided by satisfiability modulo theories solvers, can perform surprisingly well compared to more sophisticated approaches and efficiently produce global certificates of null-controllability.
Quantifying Metrics for Wildfire Ignition Risk from Geographic Data in Power Shutoff Decision-Making
Faults on power lines and other electric equipment are known to cause wildfire ignitions. To mitigate the threat of wildfire ignitions from electric power infrastructure, many utilities preemptively de-energize power lines, which may result in power shutoffs. Data regarding wildfire ignition risks are key inputs for effective planning of power line de-energizations. However, there are multiple ways to formulate risk metrics that spatially aggregate wildfire risk map data, and there are different ways of leveraging this data to make decisions. The key contribution of this paper is to define and compare the results of employing six metrics for quantifying the wildfire ignition risks of power lines from risk maps, considering both threshold- and optimization-based methods for planning power line de-energizations. The numeric results use the California Test System (CATS), a large-scale synthetic grid model with power line corridors accurately representing California infrastructure, in combination with real Wildland Fire Potential Index data for a full year. This is the first application of optimal power shutoff planning on such a large and realistic test case. Our results show that the choice of risk metric significantly impacts the lines that are de-energized and the resulting load shed. We find that the optimization-based method results in significantly less load shed than the threshold-based method while achieving the same risk reduction.
A simple controller design to achieve iso-damping robustness: Non-iterative data-driven approach based on fractional-order reference model
This study proposes a simple controller design approach to achieve a class of robustness, the so-called iso-damping property. The proposed approach can be executed using only one-shot input/output data. An accurate mathematical model of a controlled plant is not required. The model-reference control problem is defined to achieve the desired closed-loop specifications, including the iso-damping, and the reference model is designed on the basis of fractional-order calculus. The optimization problem for the model-reference control is formulated using the one-shot input/output data while considering the bounded-input bounded-output (BIBO) stability from a bounded reference input to a bounded output. The iso-damping robust controller is obtained by solving the optimization problem. The representative advantages of the proposed approach over the conventional methods are the simplicity, practicality, and reliability from the viewpoint of the unnecessity of the plant model and explicit consideration of the BIBO stability from a bounded reference input to a bounded output. Numerical examples demonstrate the validity of the proposed approach.
Design, manufacturing, and inverse dynamic modeling of soft parallel robots actuated by dielectric elastomer actuators
Soft parallel robots with their manipulation safety and low commercial cost show a promising future for delicate operations and safe human-robot interactions. However, promoting the use of electroactive polymers (EAPs) is still challenging due to the under-improving quality of the product and the dynamic modelling of the collaborations between multiple actuators. This article presents the design, fabrication, modelling and control of a parallel kinematics Delta robot actuated by dielectric elastomer actuators (DEAs). The trade-off between the actuation force and stroke is retaken by an angular stroke amplification mechanism, and the weight of the robot frame is reduced by utilizing 3D puzzling strip structures. A generic way of constructing a high-stability conductive paint on a silicon-based film has been achieved by laser scanning the DE-film and then sandwiching a conductive particle-based electrode with a paint which is mixed by the particles and photosensitive resin. Compared to the widely used carbon grease, the fabricated electrode shows a higher consistency in its dynamic behaviour before and after the on-stand test. Finally, to predict the output force and inverse motion of the robot end effector, we constructed the inverse dynamic model by introducing an expanded Bergstrom-Boyce model to the constitutive behavior of the dielectric film. The experimental results show a prediction of robot output force with an RMSE of 12.4% when the end effector remains stationary, and a well-followed trajectory with an RMSE of less than 2.5%.
comment: 17 pages, 12 figures
Controlling sharpness, SNR and SAR for 3D FSE at 7T by end-to-end learning
Purpose: To non-heuristically identify dedicated variable flip angle (VFA) schemes optimized for the point-spread function (PSF) and signal-to-noise ratio (SNR) of multiple tissues in 3D FSE sequences with very long echo trains at 7T. Methods: The proposed optimization considers predefined SAR constraints and target contrast using an end-to-end learning framework. The cost function integrates components for contrast fidelity (SNR) and a penalty term to minimize image blurring (PSF) for multiple tissues. By adjusting the weights of PSF/SNR cost-function components, PSF- and SNR-optimized VFAs were derived and tested in vivo using both the open-source Pulseq standard on two volunteers as well as vendor protocols on a 7T MRI system with parallel transmit extension on three volunteers. Results: PSF-optimized VFAs resulted in significantly reduced image blurring compared to standard VFAs for T2w while maintaining contrast fidelity. Small white and gray matter structures, as well as blood vessels, are more visible with PSF-optimized VFAs. Quantitative analysis shows that the optimized VFA yields 50% less deviation from a sinc-like reference PSF than the standard VFA. The SNR-optimized VFAs yielded images with significantly improved SNR in a white and gray matter region relative to standard ($81.2 \pm 18.4$ vs. $41.2 \pm 11.5$, respectively) as trade-off for elevated image blurring. Conclusion: This study demonstrates the potential of end-to-end learning frameworks to optimize VFA schemes in very long echo trains for 3D FSE acquisition at 7T in terms of PSF and SNR. It paves the way for fast and flexible adjustment of the trade-off between PSF and SNR for 3D FSE.
comment: Submitted to Magnetic Resonance in Medicine for peer-review
Resource Allocation for Stable LLM Training in Mobile Edge Computing
As mobile devices increasingly become focal points for advanced applications, edge computing presents a viable solution to their inherent computational limitations, particularly in deploying large language models (LLMs). However, despite the advancements in edge computing, significant challenges remain in efficiently training and deploying LLMs due to the computational demands and data privacy concerns associated with these models. This paper explores a collaborative training framework that integrates mobile users with edge servers to optimize resource allocation, thereby enhancing both performance and efficiency. Our approach leverages parameter-efficient fine-tuning (PEFT) methods, allowing mobile users to adjust the initial layers of the LLM while edge servers handle the more demanding latter layers. Specifically, we formulate a multi-objective optimization problem to minimize the total energy consumption and delay during training. We also address the common issue of instability in model performance by incorporating stability enhancements into our objective function. Through a novel fractional programming technique, we achieve a stationary point for the formulated problem. Simulations demonstrate that our method reduces the energy consumption as well as the latency, and increases the reliability of LLMs across various mobile settings.
comment: This paper appears in the 2024 International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing (MobiHoc)
Design and validation of a fuzzy logic controller for multi-section continuum robots
The rise of multi-section continuum robots (CRs) has captivated researchers and practitioners across diverse industries and medical fields. Accurate modeling of these dexterous manipulators continues to be a significant challenge. This complexity stems primarily from many nonlinearities that plague their behavior, including hysteresis and cable elongation. Researchers have devised a spectrum of model-based and learning-based strategies to navigate this intricate landscape, aiming to conquer the modeling problem and elevate control performance. Despite the advancements in these approaches, they encounter challenges stemming from their complex design and intricate learning processes, impairing versatility and hindering robust closed-loop control. This paper introduces a simple-structured, model-less fuzzy logic controller for the closed-loop control of continuum robots. Unlike traditional methods relying on complex models and numerous sensors, this controller boasts a built-in shape reconstruction algorithm. This algorithm allows it to achieve robust control using only the feedback of end position and orientation, significantly reducing sensor dependence. It efficiently adapts to various nonlinearities like hysteresis, cable elongation, and unexpected external disturbances. The experimental results conclusively demonstrate the accuracy and robustness of the proposed fuzzy controller. On a three-section, six-degree-of-freedom continuum robot, it achieved a minuscule trajectory tracking Root Mean Square Error (RMSE) from 0.28 to 0.54 mm, representing just 0.17 to 0.32% of the robot's length. Additionally, the controller demonstrates robustness by successfully handling an unexpected external disturbance of 100 g during the trajectory tracking.
Advanced Resilience Planning for Distribution Systems
Climate change has led to an increase in the frequency and severity of extreme weather events, posing significant challenges for power distribution systems. In response, this work presents a planning approach in order to enhance the resilience of distribution systems against climatic hazards. The framework systematically addresses uncertainties during extreme events, including weather variability and line damage. Key strategies include line hardening, backup diesel generators, and sectionalizers to strengthen resilience. We model spatio-temporal dynamics and costs through a hybrid model integrating stochastic processes with deterministic elements. A two-stage stochastic mixed-integer linear approach is developed to optimize resilience investments against load loss, generator operations, and repairs. Case studies on the IEEE 15-bus benchmark system and a realistic distribution grid model in Riyadh, Saudi Arabia demonstrate enhanced system robustness as well as cost efficiency of 10% and 15%, respectively.
comment: CIRED Chicago Workshop 2024: Resilience of Electric Distribution Systems
A Parallel-in-Time Newton's Method for Nonlinear Model Predictive Control
Model predictive control (MPC) is a powerful framework for optimal control of dynamical systems. However, MPC solvers suffer from a high computational burden that restricts their application to systems with low sampling frequency. This issue is further amplified in nonlinear and constrained systems that require nesting MPC solvers within iterative procedures. In this paper, we address these issues by developing parallel-in-time algorithms for constrained nonlinear optimization problems that take advantage of massively parallel hardware to achieve logarithmic computational time scaling over the planning horizon. We develop time-parallel second-order solvers based on interior point methods and the alternating direction method of multipliers, leveraging fast convergence and lower computational cost per iteration. The parallelization is based on a reformulation of the subproblems in terms of associative operations that can be parallelized using the associative scan algorithm. We validate our approach on numerical examples of nonlinear and constrained dynamical systems.
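To make the associative-scan idea concrete, here is a minimal sketch on a toy problem rather than the paper's solver: a scalar linear recurrence $x_{t+1} = a_t x_t + b_t$, where each step is an affine map and composing affine maps is associative. The names `combine`, `associative_scan`, `rollout_scan` and `rollout_sequential` are hypothetical, and the scan is run sequentially in pure Python; on parallel hardware each pass of the loop could update all positions at once, giving a logarithmic number of passes in the horizon length.

```python
from typing import List, Tuple

Affine = Tuple[float, float]  # (a, b) represents the map x -> a * x + b


def combine(first: Affine, second: Affine) -> Affine:
    """Compose two affine maps: apply `first`, then `second` (associative)."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def associative_scan(elems: List[Affine]) -> List[Affine]:
    """Inclusive prefix scan in the Hillis-Steele pattern.

    Each pass only reads the result of the previous pass, so every position
    could be updated in parallel; there are ceil(log2(n)) passes in total.
    """
    out = list(elems)
    step = 1
    while step < len(out):
        prev = out
        out = [
            combine(prev[i - step], prev[i]) if i >= step else prev[i]
            for i in range(len(prev))
        ]
        step *= 2
    return out


def rollout_sequential(a, b, x0):
    """Reference: iterate x_{t+1} = a_t * x_t + b_t one step at a time."""
    xs, x = [], x0
    for at, bt in zip(a, b):
        x = at * x + bt
        xs.append(x)
    return xs


def rollout_scan(a, b, x0):
    """Same trajectory, obtained from prefix compositions of the step maps."""
    prefixes = associative_scan(list(zip(a, b)))
    return [pa * x0 + pb for pa, pb in prefixes]


a = [2, 3, 1, -1, 2]
b = [1, 0, 4, 2, -3]
print(rollout_scan(a, b, 1) == rollout_sequential(a, b, 1))
```

The elements scanned in an actual MPC solver would be richer objects than scalar affine maps; the sketch only illustrates why associativity is the property that enables time-parallelism.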
Optimal Infinite-Horizon Mixed $\mathit{H}_2/\mathit{H}_\infty$ Control
We study the problem of mixed $\mathit{H}_2/\mathit{H}_\infty$ control in the infinite-horizon setting. We identify the optimal causal controller that minimizes the $\mathit{H}_2$ cost of the closed-loop system subject to an $\mathit{H}_\infty$ constraint. Megretski proved that the optimal mixed $\mathit{H}_2/\mathit{H}_\infty$ controller is non-rational whenever the constraint is active without giving an explicit construction of the controller. In this work, we provide the first exact closed-form solution to the infinite-horizon mixed $\mathit{H}_2/\mathit{H}_\infty$ control in the frequency domain. While the optimal controller is non-rational, our formulation provides a finite-dimensional parameterization of the optimal controller. Leveraging this fact, we introduce an efficient iterative algorithm that finds the optimal causal controller in the frequency domain. We show that this algorithm is convergent when the system is scalar and present numerical evidence for exponential convergence of the proposed algorithm. Finally, we show how to find the best (in $\mathit{H}_\infty$ norm) fixed-order rational approximations of the optimal mixed $\mathit{H}_2/\mathit{H}_\infty$ controller and study its performance.
comment: Accepted for presentation at the 60th Annual Allerton Conference on Communication, Control, and Computing (Allerton) 2024
Numerically Robust Fixed-Point Smoothing Without State Augmentation
Practical implementations of Gaussian smoothing algorithms have received a great deal of attention in the last 60 years. However, almost all work focuses on estimating complete time series ("fixed-interval smoothing", $\mathcal{O}(K)$ memory) through variations of the Rauch--Tung--Striebel smoother, rarely on estimating the initial states ("fixed-point smoothing", $\mathcal{O}(1)$ memory). Since fixed-point smoothing is a crucial component of algorithms for dynamical systems with unknown initial conditions, we close this gap by introducing a new formulation of a Gaussian fixed-point smoother. In contrast to prior approaches, our perspective admits a numerically robust Cholesky-based form (without downdates) and avoids state augmentation, which would needlessly inflate the state-space model and reduce the numerical practicality of any fixed-point smoother code. The experiments demonstrate how a JAX implementation of our algorithm matches the runtime of the fastest methods and the robustness of the most robust techniques while existing implementations must always sacrifice one for the other.
Analysis and Modeling of the Hybrid Vessel's Electrical Power System
With the maritime industry poised on the cusp of a hybrid revolution, the design and analysis of advanced vessel systems have become paramount for engineers. This paper presents AC and DC electrical hybrid power system models in ETAP, the simulation software that can be adapted to engineer future hybrid vessels. These models are also a step towards a digital twin model that can help in troubleshooting and preventing issues, reducing risk and engineering time. The testing of the models is focused on time domain analysis, short-circuit currents, and protection & coordination. The models are based on actual vessels and manufacturer parameters are used where available.
A Screening Method for Power System Inertia Zones Identification
The heterogeneous distribution of frequency support from dispersed renewable generation sources results in varying inertia within the system. The effects of disturbances exhibit non-uniform variations contingent upon the disturbance's location and the affected region's topology and inertia. A screening method for inertia-zone identification is proposed considering the combination of network structure and generator inertia distribution that will aid in comprehending the response of nodes to disturbances. The nodes' dynamic nodal weight (DNW) is defined using maximal entropy random walk that defines each node's spreading power dynamics. Further, a modified weighted k-means++ clustering technique is proposed using DNW to obtain the equivalent spatial points of each zone and the system to parameterize the inertia status of each zone. The impact of the proposed scheme is justified by simulating a modified IEEE 39 bus system with doubly-fed induction generator (DFIG) integration in the real-time digital simulator.
Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges
The advancement of Large Language Models (LLMs) has significantly impacted various domains, including Web search, healthcare, and software development. However, as these models scale, they become more vulnerable to cybersecurity risks, particularly backdoor attacks. By exploiting the potent memorization capacity of LLMs, adversaries can easily inject backdoors into LLMs by manipulating a small portion of training data, leading to malicious behaviors in downstream applications whenever the hidden backdoor is activated by the pre-defined triggers. Moreover, emerging learning paradigms like instruction tuning and reinforcement learning from human feedback (RLHF) exacerbate these risks as they rely heavily on crowdsourced data and human feedback, which are not fully controlled. In this paper, we present a comprehensive survey of emerging backdoor threats to LLMs that appear during LLM development or inference, and cover recent advancement in both defense and detection strategies for mitigating backdoor threats to LLMs. We also outline key challenges in addressing these threats, highlighting areas for future research.
comment: The 60th Annual Allerton Conference (Invited Paper). The arXiv version is a pre-IEEE Press publication version
Spacecraft Attitude Control Under Reaction Wheel Constraints Using Control Lyapunov and Control Barrier Functions
This paper introduces a novel control strategy for agile spacecraft attitude control, addressing reaction wheel-related input and state constraints. An optimal-decay control Lyapunov function quadratic program stabilizes the system and mitigates chattering at low sampling frequencies, while control barrier functions enforce hard state constraints. Numerical simulations validate the method's practicality and efficiency for real-time agile spacecraft attitude control.
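As a rough illustration of what a control-Lyapunov-function quadratic program does, the sketch below uses a toy scalar system, not the spacecraft model, and omits both the optimal-decay relaxation and the control barrier functions mentioned in the abstract. It assumes dynamics $\dot{x} = x + u$ with $V(x) = x^2/2$, and solves $\min u^2$ subject to $L_f V + L_g V\,u \le -\lambda V$, which has a closed form for a single input. The function names `clf_min_norm` and `simulate` and all parameter values are hypothetical.

```python
def clf_min_norm(LfV: float, LgV: float, V: float, rate: float) -> float:
    """Closed-form solution of  min u^2  s.t.  LfV + LgV * u <= -rate * V."""
    slack = LfV + rate * V
    if slack <= 0.0:
        return 0.0  # decay condition already holds with zero input
    if LgV == 0.0:
        return 0.0  # QP is infeasible here; a valid CLF rules this out away from 0
    return -slack / LgV  # smallest-magnitude input on the constraint boundary


def simulate(x0: float, rate: float = 2.0, dt: float = 0.01, steps: int = 1000):
    """Forward-Euler rollout of xdot = x + u under the min-norm CLF controller."""
    x, history = x0, [x0]
    for _ in range(steps):
        V = 0.5 * x * x
        u = clf_min_norm(LfV=x * x, LgV=x, V=V, rate=rate)
        x += dt * (x + u)
        history.append(x)
    return history


trajectory = simulate(x0=2.0)
print(abs(trajectory[-1]) < 1e-3)
```

With these choices the controller reduces to $u = -(1 + \lambda/2)\,x$, so the open-loop-unstable system decays at rate $\lambda/2$. A multi-input system with input and state constraints would need a numerical QP solver instead of the closed form.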
Tannenbaum's gain-margin optimization meets Polyak's heavy-ball algorithm
The paper highlights a relatively unknown link between algorithm design in optimization and control synthesis in robust control. Specifically, quadratic optimization can be recast as a regulation problem within the framework of $\mathcal{H}_\infty$ control. From this vantage point, the optimality of Polyak's fastest heavy-ball algorithm can be ascertained as a solution to a gain margin optimization problem. The approach is independent of Polyak's original and brilliant argument, yet simpler, and relies on the foundational work by Tannenbaum that introduced and solved the gain margin optimization via Nevanlinna--Pick interpolation theory. The link between first-order optimization methods and robust control theory sheds new light on the limits of algorithmic performance for such methods, and suggests a new framework where similar computational problems can be systematically studied and algorithms optimized. In particular, it raises the question as to whether periodically scheduled algorithms can achieve faster rates for quadratic optimization, in a manner analogous to periodic control that extends gain margin beyond that of time-invariant control. This turns out not to be the case, due to the analytic obstruction of a transmission zero that is inherent in causal optimization algorithms. Interestingly, this obstruction can be removed with implicit algorithms, cast in a similar manner as feedback regulation problems with causal, but not strictly causal dynamics, thereby devoid of the transmission zero at infinity and able to achieve superior convergence rates. The confluence of the fields of optimization algorithms and control provides a frame to tackle questions pertaining to speed, accuracy, distributed computation, and so forth, and to delineate respective limits to performance and tradeoffs in a systematic manner, utilizing the formalism of robust control.
comment: 25 pages, 8 figures
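For readers unfamiliar with the heavy-ball method discussed in this abstract, the following minimal sketch runs it on a diagonal quadratic $f(x) = \tfrac{1}{2}\sum_i \lambda_i x_i^2$ with eigenvalues in $[\mu, L]$, using the classical tuning $\alpha = 4/(\sqrt{L}+\sqrt{\mu})^2$ and $\beta = \big((\sqrt{L}-\sqrt{\mu})/(\sqrt{L}+\sqrt{\mu})\big)^2$. The test problem, the eigenvalues and the function names are assumptions chosen for illustration; nothing here reproduces the paper's control-theoretic derivation.

```python
import math


def heavy_ball(eigs, x0, alpha, beta, iters):
    """Minimize f(x) = 0.5 * sum(eigs[i] * x[i]**2) with Polyak momentum.

    Update: x_{k+1} = x_k - alpha * grad f(x_k) + beta * (x_k - x_{k-1}).
    """
    x_prev = list(x0)
    x = list(x0)  # x_{-1} = x_0, i.e. zero initial momentum
    for _ in range(iters):
        x_next = [
            xi - alpha * lam * xi + beta * (xi - xpi)
            for lam, xi, xpi in zip(eigs, x, x_prev)
        ]
        x_prev, x = x, x_next
    return x


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


mu, L = 1.0, 100.0
eigs = [mu, 10.0, L]
x0 = [1.0, 1.0, 1.0]

# Classical heavy-ball tuning for eigenvalues in [mu, L].
alpha = 4.0 / (math.sqrt(L) + math.sqrt(mu)) ** 2
beta = ((math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))) ** 2

hb_err = norm(heavy_ball(eigs, x0, alpha, beta, iters=200))
# Gradient descent is the beta = 0 case; 2 / (L + mu) is its best fixed step.
gd_err = norm(heavy_ball(eigs, x0, 2.0 / (L + mu), 0.0, iters=200))
print(hb_err < gd_err)
```

With condition number $\kappa = L/\mu = 100$, the heavy-ball contraction factor is $(\sqrt{\kappa}-1)/(\sqrt{\kappa}+1) = 9/11$ per step, against $(\kappa-1)/(\kappa+1) = 99/101$ for gradient descent, which is why the momentum iterate ends up many orders of magnitude closer to the minimizer after the same number of steps.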
Estimation of Constraint Admissible Invariant Set with Neural Lyapunov Function
Constraint admissible positively invariant (CAPI) sets play a pivotal role in ensuring safety in control and planning applications, such as the recursive feasibility guarantee of explicit reference governor and model predictive control. However, existing methods for finding CAPI sets for nonlinear systems are often limited to single equilibria or specific system dynamics. This limitation underscores the necessity for a method to construct a CAPI set for general reference tracking control and a broader range of systems. In this work, we leverage recent advancements in learning-based methods to derive Lyapunov functions, particularly focusing on those with piecewise-affine activation functions. Previous attempts to find an invariant set with the piecewise-affine neural Lyapunov function have focused on the estimation of the region of attraction with mixed integer programs. We propose a methodology to determine the maximal CAPI set for any reference with the neural Lyapunov function by transforming the problem into multiple linear programs. Additionally, to enhance applicability in real-time control scenarios, we introduce a learning-based approach to train the estimator, which infers the CAPI set from a given reference. The proposed approach is validated with multiple simulations to show that it can generate a valid CAPI set with the given neural Lyapunov functions for any reference. We also employ the proposed CAPI set estimation method in the explicit reference governor and demonstrate its effectiveness for constrained control.
comment: 8 pages, 6 figures, Accepted to 63rd IEEE Conference on Decision and Control (CDC 2024)
A Plug and Play Distributed Secondary Controller for Microgrids with Grid-Forming Inverters
A distributed controller for secondary control problems in microgrids with grid-forming (GFM) inverter-based resources (IBRs) is developed. The controller is based on distributed optimization and is synthesized and implemented distributively enabling each GFM IBR to utilize decentralized measurements and the neighborhood information in the communication network. We present a convergence analysis establishing voltage regulation and reactive power sharing properties. A controller-hardware-in-the-loop experiment is conducted to evaluate the performance of the proposed controller. The experimental results corroborate the efficacy of the proposed distributed controller for secondary control.
comment: 7 pages, 3 figures
A Distributed Malicious Agent Detection Scheme for Resilient Power Apportioning in Microgrids
We consider the framework of distributed aggregation of Distributed Energy Resources (DERs) in power networks to provide ancillary services to the power grid. Existing aggregation schemes work under the assumption of trust and honest behavior of the DERs and can suffer when that is not the case. In this article, we develop a distributed detection scheme that allows the DERs to detect and isolate the maliciously behaving DERs. We propose a model for the maliciously behaving DERs and show that the proposed distributed scheme leads to the detection of the malicious DERs. Further, augmented with the distributed power apportioning algorithm the proposed scheme provides a framework for resilient distributed power apportioning for ancillary service dispatch in power networks. A controller-hardware-in-the-loop (CHIL) experimental setup is developed to evaluate the performance of the proposed resilient distributed power apportioning scheme on an 8-commercial building distribution network (Central Core) connected to a 55 bus distribution network (External Power Network) based on the University of Minnesota Campus. A diversity of DERs and loads are included in the network to generalize the applicability of the framework. The experimental results corroborate the efficacy of the proposed resilient distributed power apportioning for ancillary service dispatch in power networks.
comment: 7 pages, 3 figures
Discrete Distributionally Robust Optimal Control with Explicitly Constrained Optimization
Distributionally robust optimal control (DROC) is gaining interest. This study presents a reformulation method for discrete DROC (DDROC) problems to design optimal control policies under a worst-case distributional uncertainty. The reformulation of DDROC problems impacts both the utility of tractable improvements in continuous DROC problems and the inherent discretization modeling of DROC problems. DROC is believed to have tractability issues; namely, infinite inequalities emerge over the distribution space. Therefore, investigating tractable reformulation methods for these DROC problems is crucial. One such method utilizes the strong dualities of the worst-case expectations. However, previous studies demonstrated that certain non-trivial inequalities remain after the reformulation. To enhance the tractability of DDROC, the proposed method reformulates DDROC problems into one-layer smooth convex programming with only a few trivial inequalities. The proposed method is applied to a DDROC version of a patrol-agent design problem.
comment: 7 pages, 1 figure, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Utilizing Priors in Sampling-based Cost Minimization
We consider an autonomous vehicle (AV) agent performing a long-term cost-minimization problem in the elapsed time $T$ over sequences of states $s_{1:T}$ and actions $a_{1:T}$ for some fixed, known (though potentially learned) cost function $C(s_t,a_t)$, approximate system dynamics $P$, and distribution over initial states $d_0$. The goal is to minimize the expected cost-to-go of the driving trajectory $\tau = s_1, a_1, ..., s_T, a_T$ from the initial state.
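As a rough illustration of the objective described above, the expected cost of a trajectory can be estimated by averaging Monte Carlo rollouts. This is a minimal sketch and not the paper's method: the 1-D dynamics `step`, the quadratic `cost`, the proportional `policy`, and the uniform initial-state sampler are all hypothetical stand-ins for $P$, $C(s_t,a_t)$, and $d_0$.

```python
import random


def cost(s, a):
    # Hypothetical stage cost C(s_t, a_t): distance from the origin plus control effort.
    return s * s + 0.1 * a * a


def step(s, a, rng):
    # Hypothetical approximate dynamics P: a noisy single integrator.
    return s + a + rng.gauss(0.0, 0.05)


def policy(s):
    # Hypothetical proportional policy that pushes the state toward zero.
    return -0.5 * s


def expected_cost(policy_fn, T, n_rollouts, seed=0):
    # Average the summed cost over n_rollouts trajectories of length T.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        s = rng.uniform(-1.0, 1.0)  # sample from the hypothetical d_0
        for _ in range(T):
            a = policy_fn(s)
            total += cost(s, a)
            s = step(s, a, rng)
    return total / n_rollouts
```

Calling `expected_cost(policy, T=10, n_rollouts=200)` and comparing it against a do-nothing policy such as `lambda s: 0.0` gives a quick sanity check that the estimate ranks policies sensibly.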
Decentralized Input and State Estimation for Multi-agent System with Dynamic Topology and Heterogeneous Sensor Network
A crucial challenge in decentralized systems is state estimation in the presence of unknown inputs, particularly within heterogeneous sensor networks with dynamic topologies. While numerous consensus algorithms have been introduced, they often require extensive information exchange or multiple communication iterations to ensure estimation accuracy. This paper proposes an efficient algorithm that achieves an unbiased and optimal solution comparable to filters with full information about other agents. This is accomplished through the use of information filter decomposition and the fusion of inputs via covariance intersection. Our method requires only a single communication iteration for exchanging individual estimates between agents, instead of multiple rounds of information exchange, thus preserving agents' privacy by avoiding the sharing of explicit observations and system equations. Furthermore, to address the challenges posed by dynamic communication topologies, we propose two practical strategies to handle issues arising from intermittent observations and incomplete state estimation, thereby enhancing the robustness and accuracy of the estimation process. Experiments and ablation studies conducted in both stationary and dynamic environments demonstrate the superiority of our algorithm over other baselines. Notably, it performs as well as, or even better than, algorithms that have a global view of all neighbors.
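For reference, the standard covariance intersection rule fuses two estimates $(\hat{x}_1, P_1)$ and $(\hat{x}_2, P_2)$ whose cross-correlation is unknown. The symbols below are generic textbook notation, not taken from the paper, which applies this kind of fusion to inputs within a decomposed information filter.

```latex
\begin{aligned}
P^{-1} &= \omega P_1^{-1} + (1-\omega) P_2^{-1}, \qquad \omega \in [0,1], \\
\hat{x} &= P \left( \omega P_1^{-1} \hat{x}_1 + (1-\omega) P_2^{-1} \hat{x}_2 \right).
\end{aligned}
```

The weight $\omega$ is commonly chosen to minimize the trace or determinant of the fused covariance $P$.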
Quantifying the Dunkelflaute: An analysis of variable renewable energy droughts in Europe
Variable renewable energy droughts, also referred to as "Dunkelflaute", emerge as a challenge for realizing climate-neutral energy systems based on variable wind and solar power. Using data on 38 historic weather years and an advanced identification method, we characterize European drought events for on- and offshore wind power, solar photovoltaics, and policy-relevant renewable technology portfolios. We show that drought characteristics heavily depend on the chosen threshold. Using single thresholds, as common in the literature, is thus not advisable. Applying a multi-threshold framework, we quantify how the complementarity of wind and solar power temporally and spatially alleviates drought frequency, duration, and severity within (portfolio effect) and across countries (balancing effect). We further identify the most extreme droughts and show how these drive major discharging periods of long-duration storage in a fully renewable European energy system. Such events comprise sequences of shorter, contiguous droughts of varying severity. In a perfectly interconnected Europe, the most extreme drought event occurred in winter 1996/97 and lasted 55 days. Yet, the average renewable portfolio availability during this event was still 47% of its long-run mean. As extreme droughts may span across the turn of years, single calendar year planning horizons are not suitable for modeling weather-resilient future energy scenarios.
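To make the threshold dependence concrete, the following sketch finds maximal runs of below-threshold availability in a toy series. It is a deliberately simplified stand-in for the paper's identification method, which the abstract does not specify; the function name `drought_events` and the sample `series` are hypothetical.

```python
def drought_events(availability, threshold):
    # Return (start_index, length) for each maximal run of values below threshold.
    events, start = [], None
    for i, value in enumerate(availability):
        if value < threshold:
            if start is None:
                start = i  # a new below-threshold run begins
        elif start is not None:
            events.append((start, i - start))
            start = None
    if start is not None:
        # The series ended while still inside a run.
        events.append((start, len(availability) - start))
    return events


# Hypothetical availability series, expressed as a fraction of the long-run mean.
series = [0.9, 0.3, 0.2, 0.6, 0.1, 0.4, 0.8]
```

On this toy series, a threshold of 0.25 yields two one-step events, 0.5 yields two two-step events, and 0.7 merges everything into a single five-step event, which illustrates why a single threshold can give a misleading picture.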
Koopman Operator in the Weighted Function Spaces and its Learning for the Estimation of Lyapunov and Zubov Functions
The mathematical properties and data-driven learning of the Koopman operator, which represents nonlinear dynamics as a linear mapping on a properly defined function space, have become key problems in nonlinear system identification and control. However, Koopman operators that are approximately learned from snapshot data may not always accurately predict the system evolution on long horizons. In this work, by defining the Koopman operator on a space of weighted continuous functions and learning it on a weighted reproducing kernel Hilbert space, the Koopman operator is guaranteed to be contractive and the accumulated learning error is bounded. The weighting function, assumed to be known a priori, has an exponential decay with the flow or decays exponentially when compensated by an exponential factor. Under such a construction, the Koopman operator learned from data is used to estimate (i) Lyapunov functions for globally asymptotically stable dynamics, and (ii) Zubov-Lyapunov functions that characterize the domain of attraction. For these estimations, probabilistic bounds on the errors are derived.
comment: 8 pages, 3 figures, submitted to 2025 American Control Conference
A Data-Driven Approach To Preserve Safety and Reference Tracking for Constrained Cyber-Physical Systems Under Network Attacks
This paper proposes a worst-case data-driven control architecture capable of ensuring the safety of constrained Cyber-Physical Systems under cyber-attacks while minimizing, whenever possible, potential degradation in tracking performance. To this end, a data-driven robust anomaly detector is designed to detect cyber-attack occurrences. Moreover, an add-on tracking supervisor module allows safe open-loop tracking control operations in case of unreliable measurements. On the plant side, a safety verification module and a local emergency controller are designed to manage severe attack scenarios that cannot be handled on the controller's side. These two modules resort to worst-case reachability and controllability data-driven arguments to detect potential unsafe scenarios and replace, whenever strictly needed, the tracking controller with emergency actions whose objective is to steer the plant's state trajectory into a predefined admissible and safe robust control invariant region until an attack-free scenario is restored. The effectiveness of the proposed solution has been shown through a simulation example.
comment: Preprint of a journal manuscript submitted to the IEEE Transactions on Automatic Control
Analysis of human steering behavior differences in human-in-control and autonomy-in-control driving
Steering models (such as the generalized two-point model) predict human steering behavior well when the human is in direct control of a vehicle. In vehicles under autonomous control, human control inputs are not used; rather, an autonomous controller applies steering and acceleration commands to the vehicle. For example, human steering input may be used for state estimation rather than direct control. We show that human steering behavior changes when the human no longer directly controls the vehicle and the two are instead working in a shared autonomy paradigm. Thus, when a vehicle is not under direct human control, steering models like the generalized two-point model do not predict human steering behavior. We also show that the error between predicted human steering behavior and actual human steering behavior reflects a fundamental difference when the human directly controls the vehicle compared to when the vehicle is autonomously controlled. Moreover, we show that a single distribution describes the error between predicted human steering behavior and actual human steering behavior when the human's steering inputs are used for state estimation and the vehicle is autonomously controlled, indicating there may be an underlying model for human steering behavior under this type of shared autonomous control. Future work includes determining this shared autonomous human steering model and demonstrating its performance.
comment: 6 pages, 10 figures, accepted for publication at the 5th IFAC Workshop on Cyber-Physical Human Systems
Constraint-Aware Refinement for Safety Verification of Neural Feedback Loops
Neural networks (NNs) are becoming increasingly popular in the design of control pipelines for autonomous systems. However, since the performance of NNs can degrade in the presence of out-of-distribution data or adversarial attacks, systems that have NNs in their control pipelines, i.e., neural feedback loops (NFLs), need safety assurances before they can be applied in safety-critical situations. Reachability analysis offers a solution to this problem by calculating reachable sets that bound the possible future states of an NFL and can be checked against dangerous regions of the state space to verify that the system does not violate safety constraints. Since exact reachable sets are generally intractable to calculate, reachable set over-approximations (RSOAs) are typically used. The problem with RSOAs is that they can be overly conservative, making it difficult to verify the satisfaction of safety constraints, especially over long time horizons or for highly nonlinear NN control policies. Refinement strategies such as partitioning or symbolic propagation are typically used to limit the conservativeness of RSOAs, but these approaches come with a high computational cost and often can only be used to verify safety for simple reachability problems. This paper presents Constraint-Aware Refinement for Verification (CARV): an efficient refinement strategy that reduces the conservativeness of RSOAs by explicitly using the safety constraints on the NFL to refine RSOAs only where necessary. We demonstrate that CARV can verify the safety of an NFL where other approaches either fail or take up to 60x longer and 40x the memory.
comment: 6 pages, 10 figures, submitted to L-CSS/ACC
Seasonal Performance Evaluation of a Hybrid PV-Wind-Battery Power System for a Mars Base
This work investigates a hybrid photovoltaic-wind-battery power system designed to sustain a Mars base under varying seasonal and climatic conditions. The Mars Climate Database was utilized to simulate the effects of seasonal changes, diurnal cycles, and dust storms on the system's power generation. The seasonal performance was analyzed across the Martian surface and at potential habitation sites proposed in the "First Landing Site/Exploration Zone Workshop for Human Missions to the Surface of Mars (FLSW)." Within the hybrid system, the photovoltaic arrays serve as the primary energy source, with wind turbines providing essential backup during nighttime and dust storms. A single $1\,000\,\mathrm{m}^2$ photovoltaic array, a $33.4\,\mathrm{m}$ diameter wind turbine, and a $312\,\mathrm{kWh}$ battery can support a six-person Mars base at $32.1\%$ of the Martian surface during the equinoxes and solstices, expanding to $51.7\%$ with three sets of arrays and turbines. Additionally, $24$ FLSW sites can be supported throughout the solstices and equinoxes by a single photovoltaic array, turbine, and battery, even during global dust storms. Among the $24$ sites, Hebrus Valles, Huygens Crater, and Noctis Labyrinthus had the highest energy production potential. These findings are expected to guide further research on hybrid renewable power systems for Mars exploration.
comment: The peer-reviewed paper will be presented at The 2024 International Conference on Electric Power and Energy Conversion Systems (EPECS). The data used in this work are available from https://github.com/AbdollahMasoud/EPECS-2024
PREPARE: PREdicting PAndemic's REcurring Waves Amidst Mutations, Vaccination, and Lockdowns
This study releases an adaptable framework that can provide insights to policymakers to predict the complex recurring waves of the pandemic in the medium postemergence of the virus spread, a phase marked by rapidly changing factors like virus mutations, lockdowns, and vaccinations, offering a way to forecast infection trends and stay ahead of future outbreaks even amidst uncertainty. The proposed model is validated on data from COVID-19 spread in Germany.
Stochastic Opinion Dynamics under Social Pressure in Arbitrary Networks
Social pressure is a key factor affecting the evolution of opinions on networks in many types of settings, pushing people to conform to their neighbors' opinions. To study this, the interacting Polya urn model was introduced by Jadbabaie et al., in which each agent has two kinds of opinion: inherent beliefs, which are hidden from the other agents and fixed; and declared opinions, which are randomly sampled at each step from a distribution which depends on the agent's inherent belief and her neighbors' past declared opinions (the social pressure component), and which is then communicated to her neighbors. Each agent also has a bias parameter denoting her level of resistance to social pressure. At every step, each agent updates her declared opinion (simultaneously with all other agents) according to her neighbors' aggregate past declared opinions, her inherent belief, and her bias parameter. We study the asymptotic behavior of this opinion dynamics model and show that the agents' declaration probabilities approach a set of equilibrium points of the expected dynamics using Lyapunov theory and stochastic approximation techniques. We also derive necessary and sufficient conditions for the agents to approach consensus on their declared opinions. Our work provides further insight into the difficulty of inferring the inherent beliefs of agents when they are under social pressure.
comment: Updated cited theorems (and proofs included)
Efficient Path Planning in Large Unknown Environments with Switchable System Models for Automated Vehicles
Large environments are challenging for path planning algorithms as the size of the configuration space increases. Furthermore, if the environment is mainly unexplored, large amounts of the path are planned through unknown areas. Hence, a complete replanning of the entire path occurs whenever the path collides with newly discovered obstacles. We propose a novel method that stops the path planning algorithm after a certain distance. It is used to navigate the algorithm in large environments and is not prone to problems of existing navigation approaches. Furthermore, we developed a method to detect significant environment changes to allow a more efficient replanning. Finally, we extend the path planner to be used in the U-Shift concept vehicle. It can switch to another system model and rotate around the center of its rear axis. The results show that the proposed methods generate nearly identical paths compared to the standard Hybrid A* while drastically reducing the execution time. Furthermore, we show that the extended path planning algorithm enables the efficient use of the maneuvering capabilities of the concept vehicle to plan concise paths in narrow environments.
Experimenting with Adaptive Bitrate Algorithms for Virtual Reality Streaming over Wi-Fi
Interactive Virtual Reality (VR) streaming over Wi-Fi networks encounters significant challenges due to bandwidth fluctuations caused by channel contention and user mobility. Adaptive BitRate (ABR) algorithms dynamically adjust the video encoding bitrate based on the available network capacity, aiming to maximize image quality while mitigating congestion and preserving the user's Quality of Experience (QoE). In this paper, we experiment with ABR algorithms for VR streaming using Air Light VR (ALVR), an open-source VR streaming solution. We extend ALVR with a comprehensive set of metrics that provide a robust characterization of the network's state, enabling more informed bitrate adjustments. To demonstrate the utility of these performance indicators, we develop and test the Network-aware Step-wise ABR algorithm for VR streaming (NeSt-VR). Results validate the accuracy of the newly implemented network performance metrics and demonstrate NeSt-VR's video bitrate adaptation capabilities.
A Hypergraph Approach to Distributed Broadcast
This paper explores the distributed broadcast problem within the context of network communications, a critical challenge in decentralized information dissemination. We put forth a novel hypergraph-based approach to address this issue, focusing on minimizing the number of broadcasts to ensure comprehensive data sharing among all network users. The key contributions of this work include the establishment of a general lower bound for the problem using the min-cut capacity of hypergraphs, and a distributed broadcast for quasi-trees (DBQT) algorithm tailored for the unique structure of quasi-trees, which is proven to be optimal. This paper advances both network communication strategies and hypergraph theory, with implications for a wide range of real-world applications, from vehicular and sensor networks to distributed storage systems.
Machine Learning for Equitable Load Shedding: Real-time Solution via Learning Binding Constraints
Timely and effective load shedding in power systems is critical for maintaining supply-demand balance and preventing cascading blackouts. To eliminate load shedding bias against specific regions in the system, optimization-based methods are uniquely positioned to help balance between economical and equity considerations. However, the resulting optimization problem involves complex constraints, which can be time-consuming to solve and thus cannot meet the real-time requirements of load shedding. To tackle this challenge, in this paper we present an efficient machine learning algorithm to enable millisecond-level computation for the optimization-based load shedding problem. Numerical studies on both a 3-bus toy example and a realistic RTS-GMLC system have demonstrated the validity and efficiency of the proposed algorithm for delivering equitable and real-time load shedding decisions.
RL + Model-based Control: Using On-demand Optimal Control to Learn Versatile Legged Locomotion
This paper presents a control framework that combines model-based optimal control and reinforcement learning (RL) to achieve versatile and robust legged locomotion. Our approach enhances the RL training process by incorporating on-demand reference motions generated through finite-horizon optimal control, covering a broad range of velocities and gaits. These reference motions serve as targets for the RL policy to imitate, leading to the development of robust control policies that can be learned with reliability. Furthermore, by utilizing realistic simulation data that captures whole-body dynamics, RL effectively overcomes the inherent limitations in reference motions imposed by modeling simplifications. We validate the robustness and controllability of the RL training process within our framework through a series of experiments. In these experiments, our method showcases its capability to generalize reference motions and effectively handle more complex locomotion tasks that may pose challenges for the simplified model, thanks to RL's flexibility. Additionally, our framework effortlessly supports the training of control policies for robots with diverse dimensions, eliminating the necessity for robot-specific adjustments in the reward function and hyperparameters.
comment: The paper has been accepted for publication in IEEE Robotics and Automation Letters (RA-L). You can find the copyright information on the front page of the paper. The supplementary video is available in https://www.youtube.com/watch?v=qPttVfzGS84
Safety Control of Uncertain MIMO Systems Using Dynamic Output Feedback Barrier Pairs
Safety control of dynamical systems using barrier functions relies on knowing the full state information. This paper introduces a novel approach for safety control in uncertain MIMO systems with partial state information. The proposed method combines the synthesis of a vector norm barrier function and a dynamic output feedback safety controller to ensure robust safety enforcement. The safety controller guarantees the invariance of the barrier function under uncertain dynamics and disturbances. To address the challenges associated with safety verification using partial state information, a barrier function estimator is developed. This estimator employs an identifier-based state estimator to obtain a state estimate that is affine in the uncertain model parameters of the system. By incorporating a priori knowledge of the limits of the uncertain model parameters and disturbances, the state estimate provides a robust upper bound for the barrier function. Comparative analysis with existing control barrier function based methods shows the advantage of the proposed approach in enforcing safety constraints under tight input constraints and the utilization of estimated state information.
Market Implications of Alternative Operating Reserve Modeling in Wholesale Electricity Markets
Pricing and settlement mechanisms are crucial for efficient resource allocation, investment incentives, market competition, and regulatory oversight. In the United States, Regional Transmission Operators (RTOs) adopt a uniform pricing scheme that hinges on the marginal costs of supplying additional electricity. This study investigates the pricing and settlement impacts of alternative reserve constraint modeling, highlighting how even slight variations in the modeling of constraints can drastically alter market clearing prices, reserve quantities, and revenue outcomes. Focusing on the diverse market designs and assumptions in ancillary services by U.S. RTOs, particularly in relation to capacity sharing and reserve substitutions, the research examines four distinct models that combine these elements based on large-scale synthetic power system test data. Our study provides critical insight into the economic implications and the underlying factors of these alternative reserve constraints through market simulations and data analysis.
Systems and Control (EESS)
Continuously Improving Mobile Manipulation with Autonomous Real-World RL
We present a fully autonomous real-world RL framework for mobile manipulation that can learn policies without extensive instrumentation or human supervision. This is enabled by 1) task-relevant autonomy, which guides exploration towards object interactions and prevents stagnation near goal states, 2) efficient policy learning by leveraging basic task knowledge in behavior priors, and 3) formulating generic rewards that combine human-interpretable semantic information with low-level, fine-grained observations. We demonstrate that our approach allows Spot robots to continually improve their performance on a set of four challenging mobile manipulation tasks, obtaining an average success rate of 80% across tasks, a 3-4x improvement over existing approaches. Videos can be found at https://continual-mobile-manip.github.io/
comment: CoRL 2024. Website at https://continual-mobile-manip.github.io/
Visual collective behaviors on spherical robots
Implementations of collective motion traditionally disregard the limited sensing capabilities of an individual, instead assuming an omniscient perception of the environment. This study implements a visual flocking model in a "robot-in-the-loop" approach to reproduce these behaviors with a flock composed of 10 independent spherical robots. The model achieves robotic collective motion by only using panoramic visual information of each robot, such as retinal position, optical size and optic flow of the neighboring robots. We introduce a virtual anchor to confine the collective robotic movements so as to avoid wall interactions. For the first time, a simple visual robot-in-the-loop approach succeeds in reproducing several collective motion phases, in particular swarming and milling. Another milestone achieved by this model is bridging the gap between simulation and physical experiments by demonstrating nearly identical behaviors in both environments with the same visual model. To conclude, we show that our minimal visual collective motion model is sufficient to recreate most collective behaviors on a robot-in-the-loop system that is scalable, behaves as numerical simulations predict and is easily comparable to traditional models.
comment: 26 pages, 16 figures, journal bioinspired and biomimetics
Formally Verified Physics-Informed Neural Control Lyapunov Functions
Control Lyapunov functions are a central tool in the design and analysis of stabilizing controllers for nonlinear systems. Constructing such functions, however, remains a significant challenge. In this paper, we investigate physics-informed learning and formal verification of neural network control Lyapunov functions. These neural networks solve a transformed Hamilton-Jacobi-Bellman equation, augmented by data generated using Pontryagin's maximum principle. Similar to how Zubov's equation characterizes the domain of attraction for autonomous systems, this equation characterizes the null-controllability set of a controlled system. This principled learning of neural network control Lyapunov functions outperforms alternative approaches, such as sum-of-squares and rational control Lyapunov functions, as demonstrated by numerical examples. As an intermediate step, we also present results on the formal verification of quadratic control Lyapunov functions, which, aided by satisfiability modulo theories solvers, can perform surprisingly well compared to more sophisticated approaches and efficiently produce global certificates of null-controllability.
Quantifying Metrics for Wildfire Ignition Risk from Geographic Data in Power Shutoff Decision-Making
Faults on power lines and other electric equipment are known to cause wildfire ignitions. To mitigate the threat of wildfire ignitions from electric power infrastructure, many utilities preemptively de-energize power lines, which may result in power shutoffs. Data regarding wildfire ignition risks are key inputs for effective planning of power line de-energizations. However, there are multiple ways to formulate risk metrics that spatially aggregate wildfire risk map data, and there are different ways of leveraging this data to make decisions. The key contribution of this paper is to define and compare the results of employing six metrics for quantifying the wildfire ignition risks of power lines from risk maps, considering both threshold- and optimization-based methods for planning power line de-energizations. The numeric results use the California Test System (CATS), a large-scale synthetic grid model with power line corridors accurately representing California infrastructure, in combination with real Wildland Fire Potential Index data for a full year. This is the first application of optimal power shutoff planning on such a large and realistic test case. Our results show that the choice of risk metric significantly impacts the lines that are de-energized and the resulting load shed. We find that the optimization-based method results in significantly less load shed than the threshold-based method while achieving the same risk reduction.
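The sketch below illustrates how the choice of spatial aggregation can change a threshold-based shutoff decision. The three metrics (`max`, `mean`, `sum`), the function names, and the toy line data are hypothetical; the abstract does not name the paper's six metrics, and this is not its optimization-based method.

```python
def line_risk_metrics(cell_risks):
    # cell_risks: risk-map values of the grid cells that a power line crosses.
    return {
        "max": max(cell_risks),
        "mean": sum(cell_risks) / len(cell_risks),
        "sum": sum(cell_risks),
    }


def threshold_shutoff(lines, metric, threshold):
    # Names of the lines whose aggregated risk meets or exceeds the threshold.
    return sorted(
        name
        for name, cells in lines.items()
        if line_risk_metrics(cells)[metric] >= threshold
    )


# Hypothetical data: a short line with one high-risk cell and a long,
# uniformly moderate-risk line.
lines = {
    "A": [0.2, 0.9, 0.1],
    "B": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}
```

With these toy values, a `max` threshold of 0.8 selects only line A, while a `sum` threshold of 2.0 selects only line B, so the two metrics de-energize different lines for the same risk map.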
A simple controller design to achieve iso-damping robustness: Non-iterative data-driven approach based on fractional-order reference model
This study proposes a simple controller design approach to achieve a class of robustness, the so-called iso-damping property. The proposed approach can be executed using only one-shot input/output data. An accurate mathematical model of a controlled plant is not required. The model-reference control problem is defined to achieve the desired closed-loop specifications, including the iso-damping, and the reference model is designed on the basis of fractional-order calculus. The optimization problem for the model-reference control is formulated using the one-shot input/output data while considering the bounded-input bounded-output (BIBO) stability from a bounded reference input to a bounded output. The iso-damping robust controller is obtained by solving the optimization problem. The representative advantages of the proposed approach over the conventional methods are the simplicity, practicality, and reliability from the viewpoint of the unnecessity of the plant model and explicit consideration of the BIBO stability from a bounded reference input to a bounded output. Numerical examples demonstrate the validity of the proposed approach.
Design, manufacturing, and inverse dynamic modeling of soft parallel robots actuated by dielectric elastomer actuators
Soft parallel robots, with their manipulation safety and low commercial cost, show a promising future for delicate operations and safe human-robot interactions. However, promoting the use of electroactive polymers (EAPs) is still challenging due to the under-improving quality of the product and the dynamic modelling of the collaborations between multiple actuators. This article presents the design, fabrication, modelling and control of a parallel kinematics Delta robot actuated by dielectric elastomer actuators (DEAs). The trade-off between the actuation force and stroke is retaken by an angular stroke amplification mechanism, and the weight of the robot frame is reduced by utilizing 3D puzzling strip structures. A generic way of constructing a high-stability conductive paint on a silicon-based film has been achieved by laser scanning the DE-film and then sandwiching a conductive particle-based electrode with a paint mixed from the particles and photosensitive resin. Compared to the widely used carbon grease, the fabricated electrode shows a higher consistency in its dynamic behaviour before and after the on-stand test. Finally, to predict the output force and inverse motion of the robot end effector, we constructed the inverse dynamic model by introducing an expanded Bergstrom-Boyce model to the constitutive behavior of the dielectric film. The experimental results show a prediction of robot output force with an RMSE of 12.4% when the end effector remains stationary, and a well-followed trajectory with an RMSE of less than 2.5%.
comment: 17 pages, 12 figures
Controlling sharpness, SNR and SAR for 3D FSE at 7T by end-to-end learning
Purpose: To non-heuristically identify dedicated variable flip angle (VFA) schemes optimized for the point-spread function (PSF) and signal-to-noise ratio (SNR) of multiple tissues in 3D FSE sequences with very long echo trains at 7T. Methods: The proposed optimization considers predefined SAR constraints and target contrast using an end-to-end learning framework. The cost function integrates components for contrast fidelity (SNR) and a penalty term to minimize image blurring (PSF) for multiple tissues. By adjusting the weights of PSF/SNR cost-function components, PSF- and SNR-optimized VFAs were derived and tested in vivo using both the open-source Pulseq standard on two volunteers as well as vendor protocols on a 7T MRI system with parallel transmit extension on three volunteers. Results: PSF-optimized VFAs resulted in significantly reduced image blurring compared to standard VFAs for T2w while maintaining contrast fidelity. Small white and gray matter structures, as well as blood vessels, are more visible with PSF-optimized VFAs. Quantitative analysis shows that the optimized VFA yields 50% less deviation from a sinc-like reference PSF than the standard VFA. The SNR-optimized VFAs yielded images with significantly improved SNR in a white and gray matter region relative to standard ($81.2 \pm 18.4$ vs. $41.2 \pm 11.5$, respectively) as trade-off for elevated image blurring. Conclusion: This study demonstrates the potential of end-to-end learning frameworks to optimize VFA schemes in very long echo trains for 3D FSE acquisition at 7T in terms of PSF and SNR. It paves the way for fast and flexible adjustment of the trade-off between PSF and SNR for 3D FSE.
comment: Submitted to Magnetic Resonance in Medicine for peer-review
Resource Allocation for Stable LLM Training in Mobile Edge Computing
As mobile devices increasingly become focal points for advanced applications, edge computing presents a viable solution to their inherent computational limitations, particularly in deploying large language models (LLMs). However, despite the advancements in edge computing, significant challenges remain in efficiently training and deploying LLMs due to the computational demands and data privacy concerns associated with these models. This paper explores a collaborative training framework that integrates mobile users with edge servers to optimize resource allocation, thereby enhancing both performance and efficiency. Our approach leverages parameter-efficient fine-tuning (PEFT) methods, allowing mobile users to adjust the initial layers of the LLM while edge servers handle the more demanding latter layers. Specifically, we formulate a multi-objective optimization problem to minimize the total energy consumption and delay during training. We also address the common issue of instability in model performance by incorporating stability enhancements into our objective function. Through a novel fractional programming technique, we achieve a stationary point for the formulated problem. Simulations demonstrate that our method reduces the energy consumption as well as the latency, and increases the reliability of LLMs across various mobile settings.
comment: This paper appears in the 2024 International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing (MobiHoc)
Design and validation of a fuzzy logic controller for multi-section continuum robots
The rise of multi-section continuum robots (CRs) has captivated researchers and practitioners across diverse industries and medical fields. Accurate modeling of these dexterous manipulators continues to be a significant challenge. This complexity stems primarily from many nonlinearities that plague their behavior, including hysteresis and cable elongation. Researchers have devised a spectrum of model-based and learning-based strategies to navigate this intricate landscape, aiming to conquer the modeling problem and elevate control performance. Despite the advancements in these approaches, they encounter challenges stemming from their complex design and intricate learning processes, impairing versatility and hindering robust closed-loop control. This paper introduces a simple-structured, model-less fuzzy logic controller for the closed-loop control of continuum robots. Unlike traditional methods relying on complex models and numerous sensors, this controller boasts a built-in shape reconstruction algorithm. This algorithm allows it to achieve robust control using only the feedback of end position and orientation, significantly reducing sensor dependence. It efficiently adapts to various nonlinearities like hysteresis, cable elongation, and unexpected external disturbances. The experimental results conclusively demonstrate the accuracy and robustness of the proposed fuzzy controller. On a three-section, six-degree-of-freedom continuum robot, it achieved a minuscule trajectory tracking Root Mean Square Error (RMSE) from 0.28 to 0.54 mm, representing just 0.17 to 0.32% of the robot's length. Additionally, the controller demonstrates robustness by successfully handling an unexpected external disturbance of 100 g during the trajectory tracking.
Advanced Resilience Planning for Distribution Systems
Climate change has led to an increase in the frequency and severity of extreme weather events, posing significant challenges for power distribution systems. In response, this work presents a planning approach in order to enhance the resilience of distribution systems against climatic hazards. The framework systematically addresses uncertainties during extreme events, including weather variability and line damage. Key strategies include line hardening, backup diesel generators, and sectionalizers to strengthen resilience. We model spatio-temporal dynamics and costs through a hybrid model integrating stochastic processes with deterministic elements. A two-stage stochastic mixed-integer linear approach is developed to optimize resilience investments against load loss, generator operations, and repairs. Case studies on the IEEE 15-bus benchmark system and a realistic distribution grid model in Riyadh, Saudi Arabia demonstrate enhanced system robustness as well as cost efficiency of 10% and 15%, respectively.
comment: CIRED Chicago Workshop 2024: Resilience of Electric Distribution Systems
A Parallel-in-Time Newton's Method for Nonlinear Model Predictive Control
Model predictive control (MPC) is a powerful framework for optimal control of dynamical systems. However, MPC solvers suffer from a high computational burden that restricts their application to systems with low sampling frequency. This issue is further amplified in nonlinear and constrained systems that require nesting MPC solvers within iterative procedures. In this paper, we address these issues by developing parallel-in-time algorithms for constrained nonlinear optimization problems that take advantage of massively parallel hardware to achieve logarithmic computational time scaling over the planning horizon. We develop time-parallel second-order solvers based on interior point methods and the alternating direction method of multipliers, leveraging fast convergence and lower computational cost per iteration. The parallelization is based on a reformulation of the subproblems in terms of associative operations that can be parallelized using the associative scan algorithm. We validate our approach on numerical examples of nonlinear and constrained dynamical systems.
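As a rough illustration of the associative-scan idea named in the abstract (not the paper's solver), consider a scalar affine recursion $x_{t+1} = a_t x_t + b_t$. Composing affine maps is associative, so all prefix compositions can be obtained with a scan whose depth grows logarithmically in the horizon. The names `combine`, `scan_prefixes`, `rollout_sequential` and `rollout_scan` are hypothetical, and this pure-Python sketch runs sequentially; it only mirrors the data flow a parallel implementation would use.

```python
def combine(first, second):
    """Compose two affine maps (a, b), meaning x -> a*x + b: apply `first`, then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def scan_prefixes(maps):
    """Inclusive prefix scan by recursive doubling (Hillis-Steele style).

    Each pass of the while loop is one round; within a round, every
    position could be updated independently on parallel hardware.
    """
    out = list(maps)
    step = 1
    while step < len(out):
        out = [
            out[i] if i < step else combine(out[i - step], out[i])
            for i in range(len(out))
        ]
        step *= 2
    return out


def rollout_sequential(x0, maps):
    """Reference rollout: apply the maps one after another."""
    states, x = [], x0
    for a, b in maps:
        x = a * x + b
        states.append(x)
    return states


def rollout_scan(x0, maps):
    """Same trajectory, obtained by applying each prefix map to x0."""
    return [a * x0 + b for a, b in scan_prefixes(maps)]
```

The number of rounds is about $\log_2 T$ for a horizon of length $T$, which is where the logarithmic time scaling comes from when each round is executed in parallel.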
Optimal Infinite-Horizon Mixed $\mathit{H}_2/\mathit{H}_\infty$ Control
We study the problem of mixed $\mathit{H}_2/\mathit{H}_\infty$ control in the infinite-horizon setting. We identify the optimal causal controller that minimizes the $\mathit{H}_2$ cost of the closed-loop system subject to an $\mathit{H}_\infty$ constraint. Megretski proved that the optimal mixed $\mathit{H}_2/\mathit{H}_\infty$ controller is non-rational whenever the constraint is active without giving an explicit construction of the controller. In this work, we provide the first exact closed-form solution to the infinite-horizon mixed $\mathit{H}_2/\mathit{H}_\infty$ control in the frequency domain. While the optimal controller is non-rational, our formulation provides a finite-dimensional parameterization of the optimal controller. Leveraging this fact, we introduce an efficient iterative algorithm that finds the optimal causal controller in the frequency domain. We show that this algorithm is convergent when the system is scalar and present numerical evidence for exponential convergence of the proposed algorithm. Finally, we show how to find the best (in $\mathit{H}_\infty$ norm) fixed-order rational approximations of the optimal mixed $\mathit{H}_2/\mathit{H}_\infty$ controller and study its performance.
comment: Accepted for presentation at the 60th Annual Allerton Conference on Communication, Control, and Computing (Allerton) 2024
Numerically Robust Fixed-Point Smoothing Without State Augmentation
Practical implementations of Gaussian smoothing algorithms have received a great deal of attention in the last 60 years. However, almost all work focuses on estimating complete time series ("fixed-interval smoothing", $\mathcal{O}(K)$ memory) through variations of the Rauch--Tung--Striebel smoother, rarely on estimating the initial states ("fixed-point smoothing", $\mathcal{O}(1)$ memory). Since fixed-point smoothing is a crucial component of algorithms for dynamical systems with unknown initial conditions, we close this gap by introducing a new formulation of a Gaussian fixed-point smoother. In contrast to prior approaches, our perspective admits a numerically robust Cholesky-based form (without downdates) and avoids state augmentation, which would needlessly inflate the state-space model and reduce the numerical practicality of any fixed-point smoother code. The experiments demonstrate how a JAX implementation of our algorithm matches the runtime of the fastest methods and the robustness of the most robust techniques while existing implementations must always sacrifice one for the other.
Analysis and Modeling of the Hybrid Vessel's Electrical Power System
With the maritime industry poised on the cusp of a hybrid revolution, the design and analysis of advanced vessel systems have become paramount for engineers. This paper presents AC and DC electrical hybrid power system models in ETAP, the simulation software that can be adapted to engineer future hybrid vessels. These models are also a step towards a digital twin model that can help in troubleshooting and preventing issues, reducing risk and engineering time. The testing of the models is focused on time domain analysis, short-circuit currents, and protection & coordination. The models are based on actual vessels and manufacturer parameters are used where available.
A Screening Method for Power System Inertia Zones Identification
The heterogeneous distribution of frequency support from dispersed renewable generation sources results in varying inertia within the system. The effects of disturbances exhibit non-uniform variations contingent upon the disturbance's location and the affected region's topology and inertia. A screening method for inertia-zone identification is proposed considering the combination of network structure and generator inertia distribution that will aid in comprehending the response of nodes to disturbances. The nodes' dynamic nodal weight (DNW) is defined using maximal entropy random walk that defines each node's spreading power dynamics. Further, a modified weighted k-means++ clustering technique is proposed using DNW to obtain the equivalent spatial points of each zone and the system to parameterize the inertia status of each zone. The impact of the proposed scheme is justified by simulating a modified IEEE 39 bus system with doubly-fed induction generator (DFIG) integration in the real-time digital simulator.
Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges
The advancement of Large Language Models (LLMs) has significantly impacted various domains, including Web search, healthcare, and software development. However, as these models scale, they become more vulnerable to cybersecurity risks, particularly backdoor attacks. By exploiting the potent memorization capacity of LLMs, adversaries can easily inject backdoors into LLMs by manipulating a small portion of training data, leading to malicious behaviors in downstream applications whenever the hidden backdoor is activated by the pre-defined triggers. Moreover, emerging learning paradigms like instruction tuning and reinforcement learning from human feedback (RLHF) exacerbate these risks as they rely heavily on crowdsourced data and human feedback, which are not fully controlled. In this paper, we present a comprehensive survey of emerging backdoor threats to LLMs that appear during LLM development or inference, and cover recent advancements in both defense and detection strategies for mitigating backdoor threats to LLMs. We also outline key challenges in addressing these threats, highlighting areas for future research.
comment: The 60th Annual Allerton Conference (Invited Paper). The arXiv version is a pre-IEEE Press publication version
Spacecraft Attitude Control Under Reaction Wheel Constraints Using Control Lyapunov and Control Barrier Functions
This paper introduces a novel control strategy for agile spacecraft attitude control, addressing reaction wheel-related input and state constraints. An optimal-decay control Lyapunov function quadratic program stabilizes the system and mitigates chattering at low sampling frequencies, while control barrier functions enforce hard state constraints. Numerical simulations validate the method's practicality and efficiency for real-time agile spacecraft attitude control.
Tannenbaum's gain-margin optimization meets Polyak's heavy-ball algorithm
The paper highlights a relatively unknown link between algorithm design in optimization and control synthesis in robust control. Specifically, quadratic optimization can be recast as a regulation problem within the framework of $\mathcal{H}_\infty$ control. From this vantage point, the optimality of Polyak's fastest heavy-ball algorithm can be ascertained as a solution to a gain margin optimization problem. The approach is independent of Polyak's original and brilliant argument, yet simpler, and relies on the foundational work by Tannenbaum that introduced and solved the gain margin optimization via Nevanlinna--Pick interpolation theory. The link between first-order optimization methods and robust control theory sheds new light into limits of algorithmic performance for such methods, and suggests a new framework where similar computational problems can be systematically studied and algorithms optimized. In particular, it raises the question as to whether periodically scheduled algorithms can achieve faster rates for quadratic optimization, in a manner analogous to periodic control that extends gain margin beyond that of time-invariant control. This turns out not to be the case, due to the analytic obstruction of a transmission zero that is inherent in causal optimization algorithms. Interestingly, this obstruction can be removed with implicit algorithms, cast in a similar manner as feedback regulation problems with causal, but not strictly causal dynamics, thereby devoid of the transmission zero at infinity and able to achieve superior convergence rates. The confluence of the fields of optimization algorithms and control provides a frame to tackle questions pertaining to speed, accuracy, distributed computation, and so forth, and to delineate respective limits to performance and tradeoffs in a systematic manner, utilizing the formalism of robust control.
comment: 25 pages, 8 figures
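For readers unfamiliar with the heavy-ball method discussed in the abstract above, here is a minimal sketch on a diagonal quadratic $f(x) = \tfrac{1}{2}\sum_i h_i x_i^2$, using the classical tuning $\alpha = 4/(\sqrt{L}+\sqrt{\mu})^2$ and $\beta = \big((\sqrt{L}-\sqrt{\mu})/(\sqrt{L}+\sqrt{\mu})\big)^2$, where $\mu$ and $L$ are the smallest and largest curvatures. The function names and the diagonal test problem are assumptions made for illustration; the paper's control-theoretic derivation is not reproduced here.

```python
import math


def heavy_ball(hessian_diag, x0, steps):
    """Polyak's heavy-ball iteration on f(x) = 0.5 * sum(h_i * x_i**2)."""
    mu, big_l = min(hessian_diag), max(hessian_diag)
    root_mu, root_l = math.sqrt(mu), math.sqrt(big_l)
    alpha = 4.0 / (root_l + root_mu) ** 2                     # step size
    beta = ((root_l - root_mu) / (root_l + root_mu)) ** 2     # momentum
    prev, cur = list(x0), list(x0)
    for _ in range(steps):
        # gradient of the diagonal quadratic is h_i * x_i
        nxt = [
            c - alpha * h * c + beta * (c - p)
            for h, c, p in zip(hessian_diag, cur, prev)
        ]
        prev, cur = cur, nxt
    return cur


def gradient_descent(hessian_diag, x0, steps):
    """Plain gradient descent with the step 2 / (L + mu), for comparison."""
    mu, big_l = min(hessian_diag), max(hessian_diag)
    alpha = 2.0 / (big_l + mu)
    x = list(x0)
    for _ in range(steps):
        x = [xi - alpha * h * xi for h, xi in zip(hessian_diag, x)]
    return x
```

With condition number $\kappa = L/\mu$, the heavy-ball contraction factor is $(\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)$, compared with $(\kappa-1)/(\kappa+1)$ for gradient descent, so calling both functions on the same ill-conditioned problem makes the gap visible after a few hundred steps.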
Estimation of Constraint Admissible Invariant Set with Neural Lyapunov Function
Constraint admissible positively invariant (CAPI) sets play a pivotal role in ensuring safety in control and planning applications, such as the recursive feasibility guarantee of explicit reference governor and model predictive control. However, existing methods for finding CAPI sets for nonlinear systems are often limited to single equilibria or specific system dynamics. This limitation underscores the necessity for a method to construct a CAPI set for general reference tracking control and a broader range of systems. In this work, we leverage recent advancements in learning-based methods to derive Lyapunov functions, particularly focusing on those with piecewise-affine activation functions. Previous attempts to find an invariant set with the piecewise-affine neural Lyapunov function have focused on the estimation of the region of attraction with mixed integer programs. We propose a methodology to determine the maximal CAPI set for any reference with the neural Lyapunov function by transforming the problem into multiple linear programs. Additionally, to enhance applicability in real-time control scenarios, we introduce a learning-based approach to train the estimator, which infers the CAPI set from a given reference. The proposed approach is validated with multiple simulations to show that it can generate a valid CAPI set with the given neural Lyapunov functions for any reference. We also employ the proposed CAPI set estimation method in the explicit reference governor and demonstrate its effectiveness for constrained control.
comment: 8 pages, 6 figures, Accepted to 63rd IEEE Conference on Decision and Control (CDC 2024)
A Plug and Play Distributed Secondary Controller for Microgrids with Grid-Forming Inverters
A distributed controller for secondary control problems in microgrids with grid-forming (GFM) inverter-based resources (IBRs) is developed. The controller is based on distributed optimization and is synthesized and implemented distributively enabling each GFM IBR to utilize decentralized measurements and the neighborhood information in the communication network. We present a convergence analysis establishing voltage regulation and reactive power sharing properties. A controller-hardware-in-the-loop experiment is conducted to evaluate the performance of the proposed controller. The experimental results corroborate the efficacy of the proposed distributed controller for secondary control.
comment: 7 pages, 3 figures
A Distributed Malicious Agent Detection Scheme for Resilient Power Apportioning in Microgrids
We consider the framework of distributed aggregation of Distributed Energy Resources (DERs) in power networks to provide ancillary services to the power grid. Existing aggregation schemes work under the assumption of trust and honest behavior of the DERs and can suffer when that is not the case. In this article, we develop a distributed detection scheme that allows the DERs to detect and isolate the maliciously behaving DERs. We propose a model for the maliciously behaving DERs and show that the proposed distributed scheme leads to the detection of the malicious DERs. Further, augmented with the distributed power apportioning algorithm the proposed scheme provides a framework for resilient distributed power apportioning for ancillary service dispatch in power networks. A controller-hardware-in-the-loop (CHIL) experimental setup is developed to evaluate the performance of the proposed resilient distributed power apportioning scheme on an 8-commercial building distribution network (Central Core) connected to a 55 bus distribution network (External Power Network) based on the University of Minnesota Campus. A diversity of DERs and loads are included in the network to generalize the applicability of the framework. The experimental results corroborate the efficacy of the proposed resilient distributed power apportioning for ancillary service dispatch in power networks.
comment: 7 pages, 3 figures
Discrete Distributionally Robust Optimal Control with Explicitly Constrained Optimization
Distributionally robust optimal control (DROC) is gaining interest. This study presents a reformulation method for discrete DROC (DDROC) problems to design optimal control policies under a worst-case distributional uncertainty. The reformulation of DDROC problems impacts both the utility of tractable improvements in continuous DROC problems and the inherent discretization modeling of DROC problems. DROC is believed to have tractability issues; namely, infinite inequalities emerge over the distribution space. Therefore, investigating tractable reformulation methods for these DROC problems is crucial. One such method utilizes the strong dualities of the worst-case expectations. However, previous studies demonstrated that certain non-trivial inequalities remain after the reformulation. To enhance the tractability of DDROC, the proposed method reformulates DDROC problems into one-layer smooth convex programming with only a few trivial inequalities. The proposed method is applied to a DDROC version of a patrol-agent design problem.
comment: 7 pages, 1 figure, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Utilizing Priors in Sampling-based Cost Minimization
We consider an autonomous vehicle (AV) agent performing a long-term cost-minimization problem in the elapsed time $T$ over sequences of states $s_{1:T}$ and actions $a_{1:T}$ for some fixed, known (though potentially learned) cost function $C(s_t,a_t)$, approximate system dynamics $P$, and distribution over initial states $d_0$. The goal is to minimize the expected cost-to-go of the driving trajectory $\tau = s_1, a_1, ..., s_T, a_T$ from the initial state.
Decentralized Input and State Estimation for Multi-agent System with Dynamic Topology and Heterogeneous Sensor Network
A crucial challenge in decentralized systems is state estimation in the presence of unknown inputs, particularly within heterogeneous sensor networks with dynamic topologies. While numerous consensus algorithms have been introduced, they often require extensive information exchange or multiple communication iterations to ensure estimation accuracy. This paper proposes an efficient algorithm that achieves an unbiased and optimal solution comparable to filters with full information about other agents. This is accomplished through the use of information filter decomposition and the fusion of inputs via covariance intersection. Our method requires only a single communication iteration for exchanging individual estimates between agents, instead of multiple rounds of information exchange, thus preserving agents' privacy by avoiding the sharing of explicit observations and system equations. Furthermore, to address the challenges posed by dynamic communication topologies, we propose two practical strategies to handle issues arising from intermittent observations and incomplete state estimation, thereby enhancing the robustness and accuracy of the estimation process. Experiments and ablation studies conducted in both stationary and dynamic environments demonstrate the superiority of our algorithm over other baselines. Notably, it performs as well as, or even better than, algorithms that have a global view of all neighbors.
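The covariance-intersection step mentioned in the abstract can be sketched in its standard textbook form: two estimates with unknown cross-correlation are fused as $P^{-1} = w P_1^{-1} + (1-w) P_2^{-1}$ and $\hat{x} = P\,(w P_1^{-1} x_1 + (1-w) P_2^{-1} x_2)$ with $w \in [0, 1]$. The sketch below assumes diagonal covariances (given as lists of variances) to stay dependency-free, and picks $w$ by a grid search on the trace; both simplifications and the function names are assumptions, not the paper's algorithm.

```python
def covariance_intersection_diag(x1, p1, x2, p2, w):
    """Fuse two estimates with diagonal covariances using weight w in [0, 1].

    x1, x2: lists of means; p1, p2: lists of variances (diagonal entries).
    Returns (fused_means, fused_variances).
    """
    fused_x, fused_p = [], []
    for m1, v1, m2, v2 in zip(x1, p1, x2, p2):
        information = w / v1 + (1.0 - w) / v2   # weighted inverse variances
        variance = 1.0 / information
        fused_p.append(variance)
        fused_x.append(variance * (w * m1 / v1 + (1.0 - w) * m2 / v2))
    return fused_x, fused_p


def best_weight(p1, p2, grid=101):
    """Grid-search the weight that minimizes the trace of the fused covariance."""
    candidates = [i / (grid - 1) for i in range(grid)]

    def fused_trace(w):
        return sum(1.0 / (w / a + (1.0 - w) / b) for a, b in zip(p1, p2))

    return min(candidates, key=fused_trace)
```

For example, `best_weight([1.0, 4.0], [4.0, 1.0])` balances two agents that are each accurate in a different coordinate, and the resulting weight can be passed to `covariance_intersection_diag` together with the two agents' means.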
Quantifying the Dunkelflaute: An analysis of variable renewable energy droughts in Europe
Variable renewable energy droughts, also referred to as "Dunkelflaute", emerge as a challenge for realizing climate-neutral energy systems based on variable wind and solar power. Using data on 38 historic weather years and an advanced identification method, we characterize European drought events for on- and offshore wind power, solar photovoltaics, and policy-relevant renewable technology portfolios. We show that drought characteristics heavily depend on the chosen threshold. Using single thresholds, as common in the literature, is thus not advisable. Applying a multi-threshold framework, we quantify how the complementarity of wind and solar power temporally and spatially alleviates drought frequency, duration, and severity within (portfolio effect) and across countries (balancing effect). We further identify the most extreme droughts and show how these drive major discharging periods of long-duration storage in a fully renewable European energy system. Such events comprise sequences of shorter, contiguous droughts of varying severity. In a perfectly interconnected Europe, the most extreme drought event occurred in winter 1996/97 and lasted 55 days. Yet, the average renewable portfolio availability during this event was still 47% of its long-run mean. As extreme droughts may span across the turn of years, single calendar year planning horizons are not suitable for modeling weather-resilient future energy scenarios.
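To make the threshold dependence concrete, the sketch below flags a "drought" as a maximal run of consecutive time steps whose availability lies below a chosen fraction of the series mean. This simple run-length rule is an assumption made for illustration and is not the identification method used in the paper; `find_droughts` and the toy series in the usage note are hypothetical.

```python
def find_droughts(availability, threshold_fraction):
    """Return (start_index, duration) for each run below threshold_fraction * mean.

    availability: sequence of non-negative availability values (e.g. per hour).
    threshold_fraction: e.g. 0.5 flags steps below 50% of the long-run mean.
    """
    mean = sum(availability) / len(availability)
    limit = threshold_fraction * mean
    events = []
    start = None
    for t, value in enumerate(availability):
        if value < limit:
            if start is None:
                start = t                      # a new drought begins
        elif start is not None:
            events.append((start, t - start))  # drought ended at step t - 1
            start = None
    if start is not None:                      # series ends during a drought
        events.append((start, len(availability) - start))
    return events
```

Running the function on the same series with several values of `threshold_fraction`, for instance `[find_droughts(series, f) for f in (0.25, 0.5, 0.75, 1.0)]`, shows how events merge, lengthen, or disappear as the threshold moves, which is the sensitivity the abstract warns about.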
Koopman Operator in the Weighted Function Spaces and its Learning for the Estimation of Lyapunov and Zubov Functions
The mathematical properties and data-driven learning of the Koopman operator, which represents nonlinear dynamics as a linear mapping on a properly defined function space, have become key problems in nonlinear system identification and control. However, Koopman operators that are approximately learned from snapshot data may not always accurately predict the system evolution on long horizons. In this work, by defining the Koopman operator on a space of weighted continuous functions and learning it on a weighted reproducing kernel Hilbert space, the Koopman operator is guaranteed to be contractive and the accumulated learning error is bounded. The weighting function, assumed to be known a priori, has an exponential decay with the flow or decays exponentially when compensated by an exponential factor. Under such a construction, the Koopman operator learned from data is used to estimate (i) Lyapunov functions for globally asymptotically stable dynamics, and (ii) Zubov-Lyapunov functions that characterize the domain of attraction. For these estimations, probabilistic bounds on the errors are derived.
comment: 8 pages, 3 figures, submitted to 2025 American Control Conference
A Data-Driven Approach To Preserve Safety and Reference Tracking for Constrained Cyber-Physical Systems Under Network Attacks
This paper proposes a worst-case data-driven control architecture capable of ensuring the safety of constrained Cyber-Physical Systems under cyber-attacks while minimizing, whenever possible, potential degradation in tracking performance. To this end, a data-driven robust anomaly detector is designed to detect cyber-attack occurrences. Moreover, an add-on tracking supervisor module allows safe open-loop tracking control operations in case of unreliable measurements. On the plant side, a safety verification module and a local emergency controller are designed to manage severe attack scenarios that cannot be handled on the controller's side. These two modules resort to worst-case reachability and controllability data-driven arguments to detect potential unsafe scenarios and replace, whenever strictly needed, the tracking controller with emergency actions whose objective is to steer the plant's state trajectory in a predefined set of admissible and safe robust control invariant region until an attack-free scenario is restored. The effectiveness of the proposed solution has been shown through a simulation example.
comment: Preprint of a journal manuscript submitted to the IEEE Transactions on Automatic Control
Analysis of human steering behavior differences in human-in-control and autonomy-in-control driving
Steering models (such as the generalized two-point model) predict human steering behavior well when the human is in direct control of a vehicle. In vehicles under autonomous control, human control inputs are not used; rather, an autonomous controller applies steering and acceleration commands to the vehicle. For example, human steering input may be used for state estimation rather than direct control. We show that human steering behavior changes when the human no longer directly controls the vehicle and the two are instead working in a shared autonomy paradigm. Thus, when a vehicle is not under direct human control, steering models like the generalized two-point model do not predict human steering behavior. We also show that the error between predicted human steering behavior and actual human steering behavior reflects a fundamental difference when the human directly controls the vehicle compared to when the vehicle is autonomously controlled. Moreover, we show that a single distribution describes the error between predicted human steering behavior and actual human steering behavior when the human's steering inputs are used for state estimation and the vehicle is autonomously controlled, indicating there may be an underlying model for human steering behavior under this type of shared autonomous control. Future work includes determining this shared autonomous human steering model and demonstrating its performance.
comment: 6 pages, 10 figures, accepted for publication at the 5th IFAC Workshop on Cyber-Physical Human Systems
Constraint-Aware Refinement for Safety Verification of Neural Feedback Loops
Neural networks (NNs) are becoming increasingly popular in the design of control pipelines for autonomous systems. However, since the performance of NNs can degrade in the presence of out-of-distribution data or adversarial attacks, systems that have NNs in their control pipelines, i.e., neural feedback loops (NFLs), need safety assurances before they can be applied in safety-critical situations. Reachability analysis offers a solution to this problem by calculating reachable sets that bound the possible future states of an NFL and can be checked against dangerous regions of the state space to verify that the system does not violate safety constraints. Since exact reachable sets are generally intractable to calculate, reachable set over-approximations (RSOAs) are typically used. The problem with RSOAs is that they can be overly conservative, making it difficult to verify the satisfaction of safety constraints, especially over long time horizons or for highly nonlinear NN control policies. Refinement strategies such as partitioning or symbolic propagation are typically used to limit the conservativeness of RSOAs, but these approaches come with a high computational cost and often can only be used to verify safety for simple reachability problems. This paper presents Constraint-Aware Refinement for Verification (CARV): an efficient refinement strategy that reduces the conservativeness of RSOAs by explicitly using the safety constraints on the NFL to refine RSOAs only where necessary. We demonstrate that CARV can verify the safety of an NFL where other approaches either fail or take up to 60x longer and 40x the memory.
comment: 6 pages, 10 figures, submitted to L-CSS/ACC
Seasonal Performance Evaluation of a Hybrid PV-Wind-Battery Power System for a Mars Base
This work investigates a hybrid photovoltaic-wind-battery power system designed to sustain a Mars base under varying seasonal and climatic conditions. The Mars Climate Database was utilized to simulate the effects of seasonal changes, diurnal cycles, and dust storms on the system's power generation. The seasonal performance was analyzed across the Martian surface and at potential habitation sites proposed in the "First Landing Site/Exploration Zone Workshop for Human Missions to the Surface of Mars (FLSW)." Within the hybrid system, the photovoltaic arrays serve as the primary energy source, with wind turbines providing essential backup during nighttime and dust storms. A single $1\,000\,\mathrm{m}^2$ photovoltaic array, a $33.4\,\mathrm{m}$ diameter wind turbine, and a $312\,\mathrm{kWh}$ battery can support a six-person Mars base at $32.1\%$ of the Martian surface during the equinoxes and solstices, expanding to $51.7\%$ with three sets of arrays and turbines. Additionally, $24$ FLSW sites can be supported throughout the solstices and equinoxes by a single photovoltaic array, turbine, and battery, even during global dust storms. Among the $24$ sites, Hebrus Valles, Huygens Crater, and Noctis Labyrinthus had the highest energy production potential. These findings are expected to guide further research on hybrid renewable power systems for Mars exploration.
comment: The peer-reviewed paper will be presented at The 2024 International Conference on Electric Power and Energy Conversion Systems (EPECS). The data used in this work are available from https://github.com/AbdollahMasoud/EPECS-2024
PREPARE: PREdicting PAndemic's REcurring Waves Amidst Mutations, Vaccination, and Lockdowns
This study releases an adaptable framework that can help policymakers predict the complex recurring waves of a pandemic in the medium term after the emergence of the virus, a phase marked by rapidly changing factors like virus mutations, lockdowns, and vaccinations. The framework offers a way to forecast infection trends and stay ahead of future outbreaks even amidst uncertainty. The proposed model is validated on data from COVID-19 spread in Germany.
Stochastic Opinion Dynamics under Social Pressure in Arbitrary Networks
Social pressure is a key factor affecting the evolution of opinions on networks in many types of settings, pushing people to conform to their neighbors' opinions. To study this, the interacting Polya urn model was introduced by Jadbabaie et al., in which each agent has two kinds of opinion: inherent beliefs, which are hidden from the other agents and fixed; and declared opinions, which are randomly sampled at each step from a distribution which depends on the agent's inherent belief and her neighbors' past declared opinions (the social pressure component), and which is then communicated to her neighbors. Each agent also has a bias parameter denoting her level of resistance to social pressure. At every step, each agent updates her declared opinion (simultaneously with all other agents) according to her neighbors' aggregate past declared opinions, her inherent belief, and her bias parameter. We study the asymptotic behavior of this opinion dynamics model and show that the agents' declaration probabilities approach a set of equilibrium points of the expected dynamics using Lyapunov theory and stochastic approximation techniques. We also derive necessary and sufficient conditions for the agents to approach consensus on their declared opinions. Our work provides further insight into the difficulty of inferring the inherent beliefs of agents when they are under social pressure.
comment: Updated cited theorems (and proofs included)
Efficient Path Planning in Large Unknown Environments with Switchable System Models for Automated Vehicles
Large environments are challenging for path planning algorithms as the size of the configuration space increases. Furthermore, if the environment is mainly unexplored, large parts of the path are planned through unknown areas. Hence, a complete replanning of the entire path occurs whenever the path collides with newly discovered obstacles. We propose a novel method that stops the path planning algorithm after a certain distance. It is used to navigate the algorithm in large environments and is not prone to the problems of existing navigation approaches. Furthermore, we developed a method to detect significant environment changes to allow more efficient replanning. Finally, we extend the path planner to be used in the U-Shift concept vehicle, which can switch to another system model and rotate around the center of its rear axle. The results show that the proposed methods generate nearly identical paths compared to the standard Hybrid A* while drastically reducing the execution time. Furthermore, we show that the extended path planning algorithm enables the efficient use of the maneuvering capabilities of the concept vehicle to plan concise paths in narrow environments.
Experimenting with Adaptive Bitrate Algorithms for Virtual Reality Streaming over Wi-Fi
Interactive Virtual Reality (VR) streaming over Wi-Fi networks encounters significant challenges due to bandwidth fluctuations caused by channel contention and user mobility. Adaptive BitRate (ABR) algorithms dynamically adjust the video encoding bitrate based on the available network capacity, aiming to maximize image quality while mitigating congestion and preserving the user's Quality of Experience (QoE). In this paper, we experiment with ABR algorithms for VR streaming using Air Light VR (ALVR), an open-source VR streaming solution. We extend ALVR with a comprehensive set of metrics that provide a robust characterization of the network's state, enabling more informed bitrate adjustments. To demonstrate the utility of these performance indicators, we develop and test the Network-aware Step-wise ABR algorithm for VR streaming (NeSt-VR). Results validate the accuracy of the newly implemented network performance metrics and demonstrate NeSt-VR's video bitrate adaptation capabilities.
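As a rough illustration of what a step-wise ABR decision can look like, the sketch below moves the encoder bitrate by at most one rung of a fixed ladder per decision, based on an estimate of available network capacity. This is not NeSt-VR's actual rule, which the abstract does not specify; the function name `next_bitrate`, the `headroom` safety factor, and the ladder values are all hypothetical.

```python
def next_bitrate(current_mbps, capacity_mbps, ladder, headroom=0.8):
    """Pick the next bitrate, moving at most one rung on the ladder.

    ladder: ascending list of allowed bitrates (hypothetical values).
    headroom: fraction of the estimated capacity the stream may use (assumption).
    """
    idx = ladder.index(current_mbps)
    budget = headroom * capacity_mbps
    if current_mbps > budget and idx > 0:
        return ladder[idx - 1]  # step down: current bitrate exceeds the budget
    if idx + 1 < len(ladder) and ladder[idx + 1] <= budget:
        return ladder[idx + 1]  # step up: the next rung still fits the budget
    return current_mbps         # otherwise hold the current bitrate


ladder = [10, 20, 30, 50, 80]  # Mbps, illustrative only
up = next_bitrate(30, 100, ladder)
down = next_bitrate(50, 40, ladder)
hold = next_bitrate(30, 40, ladder)
```

Limiting each decision to a single rung trades reaction speed for stability: the bitrate cannot oscillate wildly on one noisy capacity estimate.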
A Hypergraph Approach to Distributed Broadcast
This paper explores the distributed broadcast problem within the context of network communications, a critical challenge in decentralized information dissemination. We put forth a novel hypergraph-based approach to address this issue, focusing on minimizing the number of broadcasts to ensure comprehensive data sharing among all network users. The key contributions of this work include the establishment of a general lower bound for the problem using the min-cut capacity of hypergraphs, and a distributed broadcast for quasi-trees (DBQT) algorithm tailored for the unique structure of quasi-trees, which is proven to be optimal. This paper advances both network communication strategies and hypergraph theory, with implications for a wide range of real-world applications, from vehicular and sensor networks to distributed storage systems.
Machine Learning for Equitable Load Shedding: Real-time Solution via Learning Binding Constraints
Timely and effective load shedding in power systems is critical for maintaining supply-demand balance and preventing cascading blackouts. To eliminate load shedding bias against specific regions in the system, optimization-based methods are uniquely positioned to help balance between economic and equity considerations. However, the resulting optimization problem involves complex constraints, which can be time-consuming to solve and thus cannot meet the real-time requirements of load shedding. To tackle this challenge, in this paper we present an efficient machine learning algorithm to enable millisecond-level computation for the optimization-based load shedding problem. Numerical studies on both a 3-bus toy example and a realistic RTS-GMLC system have demonstrated the validity and efficiency of the proposed algorithm for delivering equitable and real-time load shedding decisions.
RL + Model-based Control: Using On-demand Optimal Control to Learn Versatile Legged Locomotion
This paper presents a control framework that combines model-based optimal control and reinforcement learning (RL) to achieve versatile and robust legged locomotion. Our approach enhances the RL training process by incorporating on-demand reference motions generated through finite-horizon optimal control, covering a broad range of velocities and gaits. These reference motions serve as targets for the RL policy to imitate, leading to the development of robust control policies that can be learned with reliability. Furthermore, by utilizing realistic simulation data that captures whole-body dynamics, RL effectively overcomes the inherent limitations in reference motions imposed by modeling simplifications. We validate the robustness and controllability of the RL training process within our framework through a series of experiments. In these experiments, our method showcases its capability to generalize reference motions and effectively handle more complex locomotion tasks that may pose challenges for the simplified model, thanks to RL's flexibility. Additionally, our framework effortlessly supports the training of control policies for robots with diverse dimensions, eliminating the necessity for robot-specific adjustments in the reward function and hyperparameters.
comment: The paper has been accepted for publication in IEEE Robotics and Automation Letters (RA-L). You can find the copyright information on the front page of the paper. The supplementary video is available in https://www.youtube.com/watch?v=qPttVfzGS84
Safety Control of Uncertain MIMO Systems Using Dynamic Output Feedback Barrier Pairs
Safety control of dynamical systems using barrier functions relies on knowing the full state information. This paper introduces a novel approach for safety control in uncertain MIMO systems with partial state information. The proposed method combines the synthesis of a vector norm barrier function and a dynamic output feedback safety controller to ensure robust safety enforcement. The safety controller guarantees the invariance of the barrier function under uncertain dynamics and disturbances. To address the challenges associated with safety verification using partial state information, a barrier function estimator is developed. This estimator employs an identifier-based state estimator to obtain a state estimate that is affine in the uncertain model parameters of the system. By incorporating a priori knowledge of the limits of the uncertain model parameters and disturbances, the state estimate provides a robust upper bound for the barrier function. Comparative analysis with existing control barrier function based methods shows the advantage of the proposed approach in enforcing safety constraints under tight input constraints and the utilization of estimated state information.
Market Implications of Alternative Operating Reserve Modeling in Wholesale Electricity Markets
Pricing and settlement mechanisms are crucial for efficient resource allocation, investment incentives, market competition, and regulatory oversight. In the United States, Regional Transmission Operators (RTOs) adopt a uniform pricing scheme that hinges on the marginal costs of supplying additional electricity. This study investigates the pricing and settlement impacts of alternative reserve constraint modeling, highlighting how even slight variations in the modeling of constraints can drastically alter market clearing prices, reserve quantities, and revenue outcomes. Focusing on the diverse market designs and assumptions in ancillary services by U.S. RTOs, particularly in relation to capacity sharing and reserve substitutions, the research examines four distinct models that combine these elements based on large-scale synthetic power system test data. Our study provides critical insight into the economic implications and the underlying factors of these alternative reserve constraints through market simulations and data analysis.
Robotics
Generalizability of Graph Neural Networks for Decentralized Unlabeled Motion Planning ICRA 2025
Unlabeled motion planning involves assigning a set of robots to target locations while ensuring collision avoidance, aiming to minimize the total distance traveled. The problem forms an essential building block for multi-robot systems in applications such as exploration, surveillance, and transportation. We address this problem in a decentralized setting where each robot knows only the positions of its $k$-nearest robots and $k$-nearest targets. This scenario combines elements of combinatorial assignment and continuous-space motion planning, posing significant scalability challenges for traditional centralized approaches. To overcome these challenges, we propose a decentralized policy learned via a Graph Neural Network (GNN). The GNN enables robots to determine (1) what information to communicate to neighbors and (2) how to integrate received information with local observations for decision-making. We train the GNN using imitation learning with the centralized Hungarian algorithm as the expert policy, and further fine-tune it using reinforcement learning to avoid collisions and enhance performance. Extensive empirical evaluations demonstrate the scalability and effectiveness of our approach. The GNN policy trained on 100 robots generalizes to scenarios with up to 500 robots, outperforming state-of-the-art solutions by 8.6\% on average and significantly surpassing greedy decentralized methods. This work lays the foundation for solving multi-robot coordination problems in settings where scalability is important.
comment: 6 pages, 6 figures, submitted to ICRA 2025
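To make the centralized expert concrete, the sketch below computes the robot-to-target assignment that minimizes total Euclidean distance. It uses brute force over permutations rather than the Hungarian algorithm named in the abstract, so it only scales to tiny teams, and it ignores collision avoidance. The function name `optimal_assignment` and the coordinates are hypothetical, and it assumes equal numbers of robots and targets.

```python
from itertools import permutations
from math import dist


def optimal_assignment(robots, targets):
    """Return (total_distance, assignment) minimizing summed Euclidean distance.

    assignment[i] is the index of the target given to robot i.
    Brute force over all permutations: O(n!), a stand-in for the
    Hungarian algorithm, which solves the same problem in polynomial time.
    """
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(targets))):
        cost = sum(dist(robots[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, list(best_perm)


# Illustrative positions only (not from the paper).
robots = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
targets = [(4.0, 1.0), (0.0, 4.0), (1.0, 0.0)]
cost, assignment = optimal_assignment(robots, targets)
```

In this toy instance every robot has a target exactly one unit away, so the optimal assignment pairs each robot with that target.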
Grounded Curriculum Learning
The high cost of real-world data for robotics Reinforcement Learning (RL) leads to the wide usage of simulators. Despite extensive work on building better dynamics models for simulators to match with the real world, there is another, often-overlooked mismatch between simulations and the real world, namely the distribution of available training tasks. Such a mismatch is further exacerbated by existing curriculum learning techniques, which automatically vary the simulation task distribution without considering its relevance to the real world. Considering these challenges, we posit that curriculum learning for robotics RL needs to be grounded in real-world task distributions. To this end, we propose Grounded Curriculum Learning (GCL), which aligns the simulated task distribution in the curriculum with the real world, as well as explicitly considers what tasks have been given to the robot and how the robot has performed in the past. We validate GCL using the BARN dataset on complex navigation tasks, achieving a 6.8% and 6.5% higher success rate compared to a state-of-the-art CL method and a curriculum designed by human experts, respectively. These results show that GCL can enhance learning efficiency and navigation performance by grounding the simulation task distribution in the real world within an adaptive curriculum.
comment: 8 pages, 4 figures
The Duke Humanoid: Design and Control For Energy Efficient Bipedal Locomotion Using Passive Dynamics ICRA 2025
We present the Duke Humanoid, an open-source 10-degrees-of-freedom humanoid, as an extensible platform for locomotion research. The design mimics human physiology, with minimized leg distances and symmetrical body alignment in the frontal plane to maintain static balance with straight knees. We develop a reinforcement learning policy that can be deployed zero-shot on the hardware for velocity-tracking walking tasks. Additionally, to enhance energy efficiency in locomotion, we propose an end-to-end reinforcement learning algorithm that encourages the robot to leverage passive dynamics. Our experimental results show that our passive policy reduces the cost of transport by up to $50\%$ in simulation and $31\%$ in real-world testing. Our website is http://generalroboticslab.com/DukeHumanoidv1/ .
comment: submitted to ICRA 2025
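For readers unfamiliar with the metric, cost of transport is commonly defined as the dimensionless ratio of energy used to weight times distance travelled, E / (m g d). The sketch below computes it and a percentage reduction between two policies. The abstract does not state the exact definition the authors use, and every number here is hypothetical, chosen only to make the arithmetic easy to follow.

```python
G = 9.81  # gravitational acceleration in m/s^2


def cost_of_transport(energy_j, mass_kg, distance_m):
    """Dimensionless cost of transport: energy / (mass * g * distance)."""
    return energy_j / (mass_kg * G * distance_m)


def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical measurements for the same robot over the same distance.
baseline = cost_of_transport(energy_j=2000.0, mass_kg=30.0, distance_m=10.0)
passive = cost_of_transport(energy_j=1400.0, mass_kg=30.0, distance_m=10.0)
reduction = percent_reduction(baseline, passive)
```

Because mass and distance are identical in both runs, the reduction in cost of transport equals the reduction in energy, 30 percent in this made-up case.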
4D Metric-Semantic Mapping for Persistent Orchard Monitoring: Method and Dataset
Automated persistent and fine-grained monitoring of orchards at the individual tree or fruit level helps maximize crop yield and optimize resources such as water, fertilizers, and pesticides while preventing agricultural waste. Towards this goal, we present a 4D spatio-temporal metric-semantic mapping method that fuses data from multiple sensors, including LiDAR, RGB camera, and IMU, to monitor the fruits in an orchard across their growth season. A LiDAR-RGB fusion module is designed for 3D fruit tracking and localization, which first segments fruits using a deep neural network and then tracks them using the Hungarian Assignment algorithm. Additionally, the 4D data association module aligns data from different growth stages into a common reference frame and tracks fruits spatio-temporally, providing information such as fruit counts, sizes, and positions. We demonstrate our method's accuracy in 4D metric-semantic mapping using data collected from a real orchard under natural, uncontrolled conditions with seasonal variations. We achieve a 3.1 percent error in total fruit count estimation for over 1790 fruits across 60 apple trees, along with accurate size estimation results with a mean error of 1.1 cm. The datasets, consisting of LiDAR, RGB, and IMU data of five fruit species captured across their growth seasons, along with corresponding ground truth data, will be made publicly available at: https://4d-metric-semantic-mapping.org/
Lessons Learned from Developing a Human-Centered Guide Dog Robot for Mobility Assistance
While guide dogs offer essential mobility assistance, their high cost, limited availability, and care requirements make them inaccessible to most blind or low vision (BLV) individuals. Recent advances in quadruped robots provide a scalable solution for mobility assistance, but many current designs fail to meet real-world needs due to a lack of understanding of handler and guide dog interactions. In this paper, we share lessons learned from developing a human-centered guide dog robot, addressing challenges such as optimal hardware design, robust navigation, and informative scene description for user adoption. By conducting semi-structured interviews and human experiments with BLV individuals, guide-dog handlers, and trainers, we identified key design principles to improve safety, trust, and usability in robotic mobility aids. Our findings lay the building blocks for future development of guide dog robots, ultimately enhancing independence and quality of life for BLV individuals.
Learning Wheelchair Tennis Navigation from Broadcast Videos with Domain Knowledge Transfer and Diffusion Motion Planning ICRA
In this paper, we propose a novel and generalizable zero-shot knowledge transfer framework that distills expert sports navigation strategies from web videos into robotic systems with adversarial constraints and out-of-distribution image trajectories. Our pipeline enables diffusion-based imitation learning by reconstructing the full 3D task space from multiple partial views, warping it into 2D image space, closing the planning loop within this 2D space, and transferring the constrained motion of interest back to task space. Additionally, we demonstrate that the learned policy can serve as a local planner in conjunction with position control. We apply this framework to the wheelchair tennis navigation problem to guide the wheelchair into the ball-hitting region. Our pipeline achieves a navigation success rate of 97.67% in reaching real-world recorded tennis ball trajectories with a physical robot wheelchair, and achieves a success rate of 68.49% in a real-world, real-time experiment on a full-sized tennis court.
comment: This manuscript has been submitted to 2025 IEEE International Conference on Robotics & Automation (ICRA)
GelSlim 4.0: Focusing on Touch and Reproducibility ICRA 2025
Tactile sensing provides robots with rich feedback during manipulation, enabling a host of perception and control capabilities. Here, we present a new open-source, vision-based tactile sensor designed to promote reproducibility and accessibility across research and hobbyist communities. Building upon the GelSlim 3.0 sensor, our design features two key improvements: a simplified, modifiable finger structure and easily manufacturable lenses. To complement the hardware, we provide an open-source perception library that includes depth and shear field estimation algorithms to enable in-hand pose estimation, slip detection, and other manipulation tasks. Our sensor is accompanied by comprehensive manufacturing documentation, ensuring the design can be readily produced by users with varying levels of expertise. We validate the sensor's reproducibility through extensive human usability testing. For documentation, code, and data, please visit the project website: https://www.mmintlab.com/research/gelslim-4-0/
comment: Submitted to ICRA 2025. For documentation, code, and data, please visit the project website: https://www.mmintlab.com/research/gelslim-4-0/
Learning Robust Policies via Interpretable Hamilton-Jacobi Reachability-Guided Disturbances
Deep Reinforcement Learning (RL) has shown remarkable success in robotics with complex and heterogeneous dynamics. However, its vulnerability to unknown disturbances and adversarial attacks remains a significant challenge. In this paper, we propose a robust policy training framework that integrates model-based control principles with adversarial RL training to improve robustness without the need for external black-box adversaries. Our approach introduces a novel Hamilton-Jacobi reachability-guided disturbance for adversarial RL training, where we use interpretable worst-case or near-worst-case disturbances as adversaries against the robust policy. We evaluated its effectiveness across three distinct tasks: a reach-avoid game in both simulation and real-world settings, and a highly dynamic quadrotor stabilization task in simulation. We validate that our learned critic network is consistent with the ground-truth HJ value function, while the policy network shows comparable performance with other learning-based methods.
Obstacle-Aware Quadrupedal Locomotion With Resilient Multi-Modal Reinforcement Learning
Quadrupedal robots hold promising potential for applications in navigating cluttered environments with resilience akin to their animal counterparts. However, their floating base configuration makes them vulnerable to real-world uncertainties, yielding substantial challenges in their locomotion control. Deep reinforcement learning has become one of the plausible alternatives for realizing a robust locomotion controller. However, approaches that rely solely on proprioception sacrifice collision-free locomotion because they require front-feet contact to detect the presence of stairs and adapt the locomotion gait. Meanwhile, incorporating exteroception necessitates a precisely modeled map observed by exteroceptive sensors over a period of time. Therefore, this work proposes a novel method to fuse proprioception and exteroception using resilient multi-modal reinforcement learning. The proposed method yields a controller that showcases agile locomotion performance on a quadrupedal robot over a myriad of real-world courses, including rough terrains, steep slopes, and high-rise stairs, while retaining its robustness against out-of-distribution situations.
comment: Under review. Project site is available at https://dreamwaqpp.github.io
Fine-Tuning Hybrid Physics-Informed Neural Networks for Vehicle Dynamics Model Estimation
Accurate dynamic modeling is critical for autonomous racing vehicles, especially during high-speed and agile maneuvers where precise motion prediction is essential for safety. Traditional parameter estimation methods face limitations such as reliance on initial guesses, labor-intensive fitting procedures, and complex testing setups. On the other hand, purely data-driven machine learning methods struggle to capture inherent physical constraints and typically require large datasets for optimal performance. To address these challenges, this paper introduces the Fine-Tuning Hybrid Dynamics (FTHD) method, which integrates supervised and unsupervised Physics-Informed Neural Networks (PINNs), combining physics-based modeling with data-driven techniques. FTHD fine-tunes a pre-trained Deep Dynamics Model (DDM) using a smaller training dataset, delivering superior performance compared to state-of-the-art methods such as the Deep Pacejka Model (DPM) and outperforming the original DDM. Furthermore, an Extended Kalman Filter (EKF) is embedded within FTHD (EKF-FTHD) to effectively manage noisy real-world data, ensuring accurate denoising while preserving the vehicle's essential physical characteristics. The proposed FTHD framework is validated through scaled simulations using the BayesRace Physics-based Simulator and full-scale real-world experiments from the Indy Autonomous Challenge. Results demonstrate that the hybrid approach significantly improves parameter estimation accuracy, even with reduced data, and outperforms existing models. EKF-FTHD enhances robustness by denoising real-world data while maintaining physical insights, representing a notable advancement in vehicle dynamics modeling for high-speed autonomous racing.
LiRA: Light-Robust Adversary for Model-based Reinforcement Learning in Real World
Model-based reinforcement learning has attracted much attention due to its high sample efficiency and is expected to be applied to real-world robotic applications. In the real world, as unobservable disturbances can lead to unexpected situations, robot policies should improve not only control performance but also robustness. Adversarial learning is an effective way to improve robustness, but an excessively strong adversary would increase the risk of malfunction and make the control performance too conservative. Therefore, this study proposes a new adversarial learning framework that makes reinforcement learning moderately robust without being overly conservative. To this end, the adversarial learning is first rederived with variational inference. In addition, light robustness, which allows for maximizing robustness within an acceptable performance degradation, is utilized as a constraint. As a result, the proposed framework, called LiRA, can automatically adjust the adversary level, balancing robustness and conservativeness. The expected behaviors of LiRA are confirmed in numerical simulations. In addition, LiRA succeeds in learning force-reactive gait control of a quadrupedal robot using only real-world data collected in less than two hours.
comment: 18 pages, 15 figures
CELLmap: Enhancing LiDAR SLAM through Elastic and Lightweight Spherical Map Representation
SLAM is a fundamental capability of unmanned systems, with LiDAR-based SLAM gaining widespread adoption due to its high precision. Current SLAM systems can achieve centimeter-level accuracy within a short period. However, there are still several challenges when dealing with large-scale mapping tasks, including significant storage requirements and the difficulty of reusing the constructed maps. To address this, we first design an elastic and lightweight map representation called CELLmap, composed of several CELLs, each representing the local map at the corresponding location. Then, we design a general backend including a CELL-based bidirectional registration module and a loop closure detection module to improve global map consistency. Our experiments have demonstrated that CELLmap can represent the precise geometric structure of large-scale maps of the KITTI dataset using only about 60 MB. Additionally, our general backend achieves up to a 26.88% improvement over various LiDAR odometry methods.
comment: 7 pages, 5 figures
RoboNurse-VLA: Robotic Scrub Nurse System based on Vision-Language-Action Model
In modern healthcare, the demand for autonomous robotic assistants has grown significantly, particularly in the operating room, where surgical tasks require precision and reliability. Robotic scrub nurses have emerged as a promising solution to improve efficiency and reduce human error during surgery. However, challenges remain in terms of accurately grasping and handing over surgical instruments, especially when dealing with complex or difficult objects in dynamic environments. In this work, we introduce a novel robotic scrub nurse system, RoboNurse-VLA, built on a Vision-Language-Action (VLA) model by integrating the Segment Anything Model 2 (SAM 2) and the Llama 2 language model. The proposed RoboNurse-VLA system enables highly precise grasping and handover of surgical instruments in real-time based on voice commands from the surgeon. Leveraging state-of-the-art vision and language models, the system can address key challenges for object detection, pose optimization, and the handling of complex and difficult-to-grasp instruments. Through extensive evaluations, RoboNurse-VLA demonstrates superior performance compared to existing models, achieving high success rates in surgical instrument handovers, even with unseen tools and challenging items. This work presents a significant step forward in autonomous surgical assistance, showcasing the potential of integrating VLA models for real-world medical applications. More details can be found at https://robonurse-vla.github.io.
Leveraging Surgical Activity Grammar for Primary Intention Prediction in Laparoscopy Procedures ICRA 2025
Surgical procedures are inherently complex and dynamic, with intricate dependencies and various execution paths. Accurate identification of the intentions behind critical actions, referred to as Primary Intentions (PIs), is crucial to understanding and planning the procedure. This paper presents a novel framework that advances PI recognition in instructional videos by combining top-down grammatical structure with bottom-up visual cues. The grammatical structure is based on a rich corpus of surgical procedures, offering a hierarchical perspective on surgical activities. A grammar parser, utilizing the surgical activity grammar, processes visual data obtained from laparoscopic images through surgical action detectors, ensuring a more precise interpretation of the visual information. Experimental results on the benchmark dataset demonstrate that our method outperforms existing surgical activity detectors that rely solely on visual features. Our research provides a promising foundation for developing advanced robotic surgical systems with enhanced planning and automation capabilities.
comment: Submitted to ICRA 2025
Fast-Convergent and Communication-Alleviated Heterogeneous Hierarchical Federated Learning in Autonomous Driving
Street Scene Semantic Understanding (denoted as TriSU) is a complex task for autonomous driving (AD). However, an inference model trained on data from a particular geographical region generalizes poorly when applied in other regions due to inter-city data domain shift. Hierarchical Federated Learning (HFL) offers a potential solution for improving TriSU model generalization through collaborative privacy-preserving training over distributed datasets from different cities. Unfortunately, it suffers from slow convergence because data from different cities have disparate statistical properties. Going beyond existing HFL methods, we propose a Gaussian heterogeneous HFL algorithm (FedGau) to address inter-city data heterogeneity so that convergence can be accelerated. In the proposed FedGau algorithm, both a single RGB image and an RGB dataset are modelled as Gaussian distributions for aggregation weight design. This approach not only differentiates each RGB image by its statistical distribution, but also exploits the statistics of the dataset from each city in addition to the conventionally considered data volume. With the proposed approach, convergence is accelerated by 35.5\%-40.6\% compared to existing state-of-the-art (SOTA) HFL methods. In addition, to reduce the communication resources involved, we further introduce a novel performance-aware adaptive resource scheduling (AdapRS) policy. Unlike the traditional static resource scheduling policy that exchanges a fixed number of models between two adjacent aggregations, AdapRS adjusts the number of model aggregations at different levels of HFL so that unnecessary communications are minimized. Extensive experiments demonstrate that AdapRS saves 29.65\% communication overhead compared to the conventional static resource scheduling policy while maintaining almost the same performance.
comment: 16 pages
Multi-Query Shortest-Path Problem in Graphs of Convex Sets
The Shortest-Path Problem in Graph of Convex Sets (SPP in GCS) is a recently developed optimization framework that blends discrete and continuous decision making. Many relevant problems in robotics, such as collision-free motion planning, can be cast and solved as an SPP in GCS, yielding lower-cost solutions and faster runtimes than state-of-the-art algorithms. In this paper, we are motivated by motion planning of robot arms that must operate swiftly in static environments. We consider a multi-query extension of the SPP in GCS, where the goal is to efficiently precompute optimal paths between given sets of initial and target conditions. Our solution consists of two stages. Offline, we use semidefinite programming to compute a coarse lower bound on the problem's cost-to-go function. Then, online, this lower bound is used to incrementally generate feasible paths by solving short-horizon convex programs. For a robot arm with seven joints, our method designs higher quality trajectories up to two orders of magnitude faster than existing motion planners.
comment: To appear in: The International Workshop on the Algorithmic Foundations of Robotics, WAFR 2024
FoAM: Foresight-Augmented Multi-Task Imitation Policy for Robotic Manipulation
Multi-task imitation learning (MTIL) has shown significant potential in robotic manipulation by enabling agents to perform various tasks using a unified policy. This simplifies the policy deployment and enhances the agent's adaptability across different contexts. However, key challenges remain, such as maintaining action reliability (e.g., avoiding abnormal action sequences that deviate from nominal task trajectories), distinguishing between similar tasks, and generalizing to unseen scenarios. To address these challenges, we introduce the Foresight-Augmented Manipulation Policy (FoAM), an innovative MTIL framework. FoAM not only learns to mimic expert actions but also predicts the visual outcomes of those actions to enhance decision-making. Additionally, it integrates multi-modal goal inputs, such as visual and language prompts, overcoming the limitations of single-conditioned policies. We evaluated FoAM across over 100 tasks in both simulation and real-world settings, demonstrating that it significantly improves IL policy performance, outperforming current state-of-the-art IL baselines by up to 41% in success rate. Furthermore, we released a simulation benchmark for robotic manipulation, featuring 10 task suites and over 80 challenging tasks designed for multi-task policy training and evaluation. See project homepage https://projFoAM.github.io/ for project details.
comment: 8 pages, 4 figures
Fast-UMI: A Scalable and Hardware-Independent Universal Manipulation Interface
Collecting real-world manipulation trajectory data involving robotic arms is essential for developing general-purpose action policies in robotic manipulation, yet such data remains scarce. Existing methods face limitations such as high costs, labor intensity, hardware dependencies, and complex setup requirements involving SLAM algorithms. In this work, we introduce Fast-UMI, an interface-mediated manipulation system comprising two key components: a handheld device operated by humans for data collection and a robot-mounted device used during policy inference. Our approach employs a decoupled design compatible with a wide range of grippers while maintaining consistent observation perspectives, allowing models trained on handheld-collected data to be directly applied to real robots. By directly obtaining the end-effector pose using existing commercial hardware products, we eliminate the need for complex SLAM deployment and calibration, streamlining data processing. Fast-UMI provides supporting software tools for efficient robot learning data collection and conversion, facilitating rapid, plug-and-play functionality. This system offers an efficient and user-friendly tool for robotic learning data acquisition.
OptiGrasp: Optimized Grasp Pose Detection Using RGB Images for Warehouse Picking Robots
In warehouse environments, robots require robust picking capabilities to manage a wide variety of objects. Effective deployment demands minimal hardware, strong generalization to new products, and resilience in diverse settings. Current methods often rely on depth sensors for structural information, which suffer from high costs, complex setups, and technical limitations. Inspired by recent advancements in computer vision, we propose an innovative approach that leverages foundation models to enhance suction grasping using only RGB images. Trained solely on a synthetic dataset, our method generalizes its grasp prediction capabilities to real-world robots and a diverse range of novel objects not included in the training set. Our network achieves an 82.3% success rate in real-world applications. The project website with code and data will be available at http://optigrasp.github.io.
comment: 8 pages, 6 figures
KineDepth: Utilizing Robot Kinematics for Online Metric Depth Estimation
Depth perception is essential for a robot's spatial and geometric understanding of its environment, with many tasks traditionally relying on hardware-based depth sensors like RGB-D or stereo cameras. However, these sensors face practical limitations, including issues with transparent and reflective objects, high costs, calibration complexity, spatial and energy constraints, and increased failure rates in compound systems. While monocular depth estimation methods offer a cost-effective and simpler alternative, their adoption in robotics is limited due to their output of relative rather than metric depth, which is crucial for robotics applications. In this paper, we propose a method that utilizes a single calibrated camera, enabling the robot to act as a "measuring stick" to convert relative depth estimates into metric depth in real time as tasks are performed. Our approach employs an LSTM-based metric depth regressor, trained online and refined through probabilistic filtering, to accurately restore the metric depth across the monocular depth map, particularly in areas proximal to the robot's motion. Experiments with real robots demonstrate that our method significantly outperforms current state-of-the-art monocular metric depth estimation techniques, achieving a 22.1% reduction in depth error and a 52% increase in success rate for a downstream task.
comment: 8 pages, 5 figures
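The "measuring stick" idea in the KineDepth abstract can be illustrated with a small sketch. The abstract describes an LSTM-based regressor refined by probabilistic filtering; the code below swaps in a plain least-squares affine fit (scale and shift) instead, so it is a simplified stand-in and not the authors' method. The function names and the sample values are hypothetical.

```python
def fit_scale_shift(relative, metric):
    """Least-squares fit of metric ~= a * relative + b (assumed affine model)."""
    n = len(relative)
    if n < 2 or n != len(metric):
        raise ValueError("need at least two paired samples")
    mean_r = sum(relative) / n
    mean_m = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    if var_r == 0:
        raise ValueError("relative depths must not all be equal")
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(relative, metric))
    a = cov / var_r
    b = mean_m - a * mean_r
    return a, b


def to_metric(relative_map, a, b):
    """Apply the fitted scale and shift to every pixel of a relative depth map."""
    return [[a * v + b for v in row] for row in relative_map]


# Relative depths sampled at pixels where the robot is visible (hypothetical values),
# paired with metric depths that would come from kinematics and the calibrated camera.
rel_at_robot = [0.10, 0.25, 0.40, 0.55]
metric_at_robot = [0.5 + 2.0 * r for r in rel_at_robot]

a, b = fit_scale_shift(rel_at_robot, metric_at_robot)
metric_map = to_metric([[0.2, 0.3], [0.5, 0.6]], a, b)
```

A global affine fit is only reasonable if the relative depth is consistent across the image; a learned, online-updated regressor as described in the abstract would presumably handle cases where that assumption fails.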
MotionScript: Natural Language Descriptions for Expressive 3D Human Motions
This paper proposes MotionScript, a motion-to-text conversion algorithm and natural language representation for human body motions. MotionScript provides more detailed and accurate descriptions of human body movements compared to previous natural language methods. Most motion datasets focus on basic, well-defined actions, with limited variation in expression (e.g., sitting, walking, dribbling a ball). But for expressive actions that contain a diversity of movements in the class (e.g., being sad, dancing), or for actions outside the domain of standard motion capture datasets (e.g., stylistic walking, sign-language, interactions with animals), more specific and granular natural language descriptions are needed. Our proposed MotionScript descriptions differ from existing natural language representations in that they provide detailed descriptions in natural language rather than simple action labels or generalized captions. To the best of our knowledge, this is the first attempt at translating 3D motions to natural language descriptions without requiring training data. Our experiments demonstrate that MotionScript descriptions, when applied to text-to-motion tasks, enable large language models to generate complex, previously unseen motions. Additional examples, dataset, and code can be accessed at https://pjyazdian.github.io/MotionScript
comment: Project webpage: https://pjyazdian.github.io/MotionScript
SpiRobs: Logarithmic Spiral-shaped Robots for Versatile Grasping Across Scales
Realizing a soft manipulator with biologically comparable flexibility and versatility often requires careful selection of materials and actuation, as well as attentive design of its structure, perception, and control. Here, we report a new class of soft robots (SpiRobs) that morphologically replicates the logarithmic spiral pattern observed in natural appendages (e.g., octopus arms, elephant trunks, etc.). This allows for a common design principle across different scales and a speedy and inexpensive fabrication process. We further present a grasping strategy inspired by the octopus that can automatically adapt to a target object's size and shape. We illustrate the dexterity of SpiRobs and the ability to tightly grasp objects that vary in size by more than two orders of magnitude and up to 260 times self-weight. We demonstrate scalability via three additional variants: a miniaturized gripper (mm), a one-meter-long manipulator, and an array of SpiRobs that can tangle up various objects.
comment: 17 pages, 6 figures
Federated Multi-Agent Mapping for Planetary Exploration
Multi-agent robotic exploration stands to play an important role in space exploration as the next generation of spacecraft robotic systems venture to more extreme and far-flung environments. A key challenge in this new paradigm will be to effectively share and utilize the vast amount of data generated on-board while operating in bandwidth-constrained regimes such as those often found in space missions. Federated learning (FL) is a promising tool for bridging this gap for a host of tasks studied across proposed mission concepts. Drawing inspiration from the upcoming CADRE Lunar rover mission, we study the task of federated multi-agent mapping and propose an approach to jointly train a centralized map model across agents without the need to share raw data. Our approach leverages implicit neural mapping to generate parsimonious and adaptable representations. We further enhance this approach with meta-initialization on Earth datasets, pre-training the network to quickly adapt to extreme and rugged terrain. We demonstrate the efficacy of our proposed federated mapping approach using Martian terrains and glacier datasets and show how it outperforms benchmarks on map reconstruction losses as well as downstream path planning tasks.
comment: 7 pages, 5 figures
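As a rough illustration of training a shared map model without exchanging raw data, the sketch below shows one aggregation step in the style of federated averaging (FedAvg). The abstract does not state which aggregation rule is used, so this choice is an assumption, and the function name and toy weights are hypothetical.

```python
def federated_average(agent_weights, sample_counts):
    """Average per-agent parameter vectors, weighted by local sample counts."""
    if not agent_weights or len(agent_weights) != len(sample_counts):
        raise ValueError("one sample count is required per agent")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(agent_weights[0])
    averaged = [0.0] * dim
    for weights, count in zip(agent_weights, sample_counts):
        if len(weights) != dim:
            raise ValueError("all agents must share the same model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * count / total
    return averaged


# Two rovers share only their locally trained map-network weights (toy values),
# never the raw sensor data those weights were trained on.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only the parameter vectors cross the communication link here, which is the property that matters in a bandwidth-constrained setting.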
Affordance-Guided Reinforcement Learning via Visual Prompting
Robots equipped with reinforcement learning (RL) have the potential to learn a wide range of skills solely from a reward signal. However, obtaining a robust and dense reward signal for general manipulation tasks remains a challenge. Existing learning-based approaches require significant data, such as human demonstrations of success and failure, to learn task-specific reward functions. Recently, there is also a growing adoption of large multi-modal foundation models for robotics that can perform visual reasoning in physical contexts and generate coarse robot motions for manipulation tasks. Motivated by this range of capability, in this work, we present Keypoint-based Affordance Guidance for Improvements (KAGI), a method leveraging rewards shaped by vision-language models (VLMs) for autonomous RL. State-of-the-art VLMs have demonstrated impressive reasoning about affordances through keypoints in zero-shot, and we use these to define dense rewards that guide autonomous robotic learning. On real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 20K online fine-tuning steps. Additionally, we demonstrate the robustness of KAGI to reductions in the number of in-domain demonstrations used for pre-training, reaching similar performance in 35K online fine-tuning steps. Project website: https://sites.google.com/view/affordance-guided-rl
comment: 8 pages, 6 figures. Robotics: Science and Systems (RSS) 2024, Task Specification for General-Purpose Intelligent Robots & Lifelong Robot Learning Workshops
STAMP: Differentiable Task and Motion Planning via Stein Variational Gradient Descent
Planning for sequential robotics tasks often requires integrated symbolic and geometric reasoning. TAMP algorithms typically solve these problems by performing a tree search over high-level task sequences while checking for kinematic and dynamic feasibility. This can be inefficient because, typically, candidate task plans resulting from the tree search ignore geometric information. This often leads to motion planning failures that require expensive backtracking steps to find alternative task plans. We propose a novel approach to TAMP called Stein Task and Motion Planning (STAMP) that relaxes the hybrid optimization problem into a continuous domain. This allows us to leverage gradients from differentiable physics simulation to fully optimize discrete and continuous plan parameters for TAMP. In particular, we solve the optimization problem using a gradient-based variational inference algorithm called Stein Variational Gradient Descent. This allows us to find a distribution of solutions within a single optimization run. Furthermore, we use an off-the-shelf differentiable physics simulator that is parallelized on the GPU to run parallelized inference over diverse plan parameters. We demonstrate our method on a variety of problems and show that it can find multiple diverse plans in a single optimization run while also being significantly faster than existing approaches.
comment: 14 pages, 9 figures, Learning Effective Abstractions for Planning (LEAP) Workshop at CoRL 2023
Diffusion Models for Offline Multi-agent Reinforcement Learning with Safety Constraints
In recent advancements in Multi-agent Reinforcement Learning (MARL), its application has extended to various safety-critical scenarios. However, most methods focus on online learning, which presents substantial risks when deployed in real-world settings. Addressing this challenge, we introduce an innovative framework integrating diffusion models within the MARL paradigm. This approach notably enhances the safety of actions taken by multiple agents through risk mitigation while modeling coordinated action. Our framework is grounded in the Centralized Training with Decentralized Execution (CTDE) architecture, augmented by a Diffusion Model for prediction trajectory generation. Additionally, we incorporate a specialized algorithm to further ensure operational safety. We evaluate our model against baselines on the DSRL benchmark. Experiment results demonstrate that our model not only adheres to stringent safety constraints but also achieves superior performance compared to existing methodologies. This underscores the potential of our approach in advancing the safety and efficacy of MARL in real-world applications.
comment: The experiment and method plan are abolished and need to be redesigned
Autonomous Constellation Fault Monitoring with Inter-satellite Links: A Rigidity-Based Approach
To address the need for robust positioning, navigation, and timing services in lunar environments, this paper proposes a novel fault detection framework for satellite constellations using inter-satellite ranging (ISR). Traditionally, navigation satellites can depend on a robust network of ground-based stations for fault monitoring. However, due to cost constraints, a comprehensive ground segment on the lunar surface is impractical for lunar constellations. Our approach leverages vertex redundantly rigid graphs to detect faults without relying on precise ephemeris. We model satellite constellations as graphs where satellites are vertices and inter-satellite links are edges. We identify faults through the singular values of the geometric-centered Euclidean distance matrix (GCEDM) of 2-vertex redundantly rigid sub-graphs. The proposed method is validated through simulations of constellations around the Moon, demonstrating its effectiveness in various configurations. This research contributes to the reliable operation of satellite constellations for future lunar exploration missions.
comment: Submitted to ION GNSS+ 2024 Conference
Robot Task Planning and Situation Handling in Open Worlds
Automated task planning algorithms have been developed to help robots complete complex tasks that require multiple actions. Most of those algorithms have been developed for "closed worlds" assuming complete world knowledge is provided. However, the real world is generally open, and the robots frequently encounter unforeseen situations that can potentially break the planner's completeness. This paper introduces a novel algorithm (COWP) for open-world task planning and situation handling that dynamically augments the robot's action knowledge with task-oriented common sense. In particular, common sense is extracted from Large Language Models based on the current task at hand and robot skills. For systematic evaluations, we collected a dataset that includes 561 execution-time situations in a dining domain, where each situation corresponds to a state instance of a robot being potentially unable to complete a task using a solution that normally works. Experimental results show that our approach significantly outperforms competitive baselines from the literature in the success rate of service tasks. Additionally, we have demonstrated COWP using a mobile manipulator. The project website is available at: https://cowplanning.github.io/, where a more detailed version can also be found. This version has been accepted for publication in Autonomous Robots.
Multiagent Systems
DiffCP: Ultra-Low Bit Collaborative Perception via Diffusion Model
Collaborative perception (CP) is emerging as a promising solution to the inherent limitations of stand-alone intelligence. However, current wireless communication systems are unable to support feature-level and raw-level collaborative algorithms due to their enormous bandwidth demands. In this paper, we propose DiffCP, a novel CP paradigm that utilizes a specialized diffusion model to efficiently compress the sensing information of collaborators. By incorporating both geometric and semantic conditions into the generative model, DiffCP enables feature-level collaboration with an ultra-low communication cost, advancing the practical implementation of CP systems. This paradigm can be seamlessly integrated into existing CP algorithms to enhance a wide range of downstream tasks. Through extensive experimentation, we investigate the trade-offs between communication, computation, and performance. Numerical results demonstrate that DiffCP can significantly reduce communication costs by 14.5-fold while maintaining the same performance as the state-of-the-art algorithm.
comment: 7 pages, 4 figures
Variance-Reduced Gradient Estimator for Nonconvex Zeroth-Order Distributed Optimization
This paper investigates distributed zeroth-order optimization for smooth nonconvex problems. We propose a novel variance-reduced gradient estimator, which randomly renovates one orthogonal direction of the true gradient in each iteration while leveraging historical snapshots for variance correction. By integrating this estimator with a gradient tracking mechanism, we address the trade-off between convergence rate and sampling cost per zeroth-order gradient estimation that exists in current zeroth-order distributed optimization algorithms, which rely on either the 2-point or $2d$-point gradient estimators. We derive a convergence rate of $\mathcal{O}(d^{\frac{5}{2}}/m)$ for smooth nonconvex functions in terms of sampling number $m$ and problem dimension $d$. Numerical simulations comparing our algorithm with existing methods confirm the effectiveness and efficiency of the proposed gradient estimator.
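For context, the two baseline estimators named in the abstract can be sketched as follows. This shows the standard 2-point (random direction) and $2d$-point (coordinate-wise) zeroth-order estimators in their common textbook form, not the paper's variance-reduced estimator; the function names, the smoothing radius `delta`, and the test function are illustrative assumptions.

```python
import math
import random


def two_point_estimate(f, x, delta, rng):
    """2-point estimator: two function evaluations along one random unit direction."""
    d = len(x)
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in u))
    u = [v / norm for v in u]
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = d * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u]


def coordinate_estimate(f, x, delta):
    """2d-point estimator: a central difference along each of the d coordinates."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += delta
        x_minus[i] -= delta
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * delta))
    return grad


def quadratic(x):
    return sum(v * v for v in x)


x0 = [1.0, -2.0, 0.5]
full_grad = coordinate_estimate(quadratic, x0, 1e-3)

# Averaging many cheap 2-point estimates approaches the true gradient [2, -4, 1].
rng = random.Random(0)
samples = [two_point_estimate(quadratic, x0, 1e-3, rng) for _ in range(20000)]
mean_grad = [sum(s[i] for s in samples) / len(samples) for i in range(3)]
```

The sketch makes the trade-off concrete: the coordinate-wise estimator costs $2d$ evaluations per gradient but is accurate, while the 2-point estimator costs two evaluations and is noisy on any single draw.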
Graph Neural Networks with Model-based Reinforcement Learning for Multi-agent Systems NeurIPS 2024
Multi-agent systems (MAS) play a significant role in exploring machine intelligence and advanced applications. To investigate the complicated interactions within MAS scenarios, we propose the "GNN for MBRL" model, which combines a state-spaced Graph Neural Network with Model-based Reinforcement Learning to address specific MAS missions (e.g., Billiard-Avoidance, Autonomous Driving Cars). In detail, we first use the GNN model to predict the future states and trajectories of multiple agents, then apply Model Predictive Control optimized with the Cross-Entropy Method (CEM) to help the ego-agent plan actions and accomplish certain MAS tasks.
comment: The paper abstract has been accepted by NeurIPS 2024 WiML Workshop (https://www.wiml.org/events/wiml-workshop-%40-neurips-2024).
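The Cross-Entropy Method mentioned in the abstract can be sketched in its generic form as below. This is a one-dimensional toy version with a hand-written cost; in the described system the cost would presumably come from rolling out the learned GNN dynamics model over a planning horizon, which is not reproduced here. All names and parameter values are illustrative assumptions.

```python
import random
import statistics


def cross_entropy_method(cost, mean, std, rng,
                         iterations=30, population=64, elite=8):
    """Iteratively refit a Gaussian to the lowest-cost samples."""
    for _ in range(iterations):
        samples = [rng.gauss(mean, std) for _ in range(population)]
        elites = sorted(samples, key=cost)[:elite]
        mean = statistics.mean(elites)
        # Floor on the spread so the sampler never collapses to a point.
        std = max(statistics.pstdev(elites), 1e-6)
    return mean


def toy_cost(action):
    """Stand-in for a model-predicted cost; minimized at action = 3.0."""
    return (action - 3.0) ** 2


rng = random.Random(0)
best_action = cross_entropy_method(toy_cost, mean=0.0, std=2.0, rng=rng)
```

In a Model Predictive Control loop, only the first action of the optimized sequence would typically be executed before re-planning from the new state.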
Federated Multi-Agent Mapping for Planetary Exploration
Multi-agent robotic exploration stands to play an important role in space exploration as the next generation of spacecraft robotic systems venture to more extreme and far-flung environments. A key challenge in this new paradigm will be to effectively share and utilize the vast amount of data generated on-board while operating in bandwidth-constrained regimes such as those often found in space missions. Federated learning (FL) is a promising tool for bridging this gap for a host of tasks studied across proposed mission concepts. Drawing inspiration from the upcoming CADRE Lunar rover mission, we study the task of federated multi-agent mapping and propose an approach to jointly train a centralized map model across agents without the need to share raw data. Our approach leverages implicit neural mapping to generate parsimonious and adaptable representations. We further enhance this approach with meta-initialization on Earth datasets, pre-training the network to quickly adapt to extreme and rugged terrain. We demonstrate the efficacy of our proposed federated mapping approach using Martian terrains and glacier datasets and show how it outperforms benchmarks on map reconstruction losses as well as downstream path planning tasks.
comment: 7 pages, 5 figures
CEDAS: A Compressed Decentralized Stochastic Gradient Method with Improved Convergence
In this paper, we consider solving the distributed optimization problem over a multi-agent network under the communication-restricted setting. We study a compressed decentralized stochastic gradient method, termed "compressed exact diffusion with adaptive stepsizes (CEDAS)", and show that the method asymptotically achieves a convergence rate comparable to that of centralized stochastic gradient descent (SGD) for both smooth strongly convex objective functions and smooth nonconvex objective functions under unbiased compression operators. To our knowledge, CEDAS enjoys the shortest transient time so far (with respect to the graph specifics) for achieving the convergence rate of centralized SGD, which behaves as $\mathcal{O}(n{C^3}/(1-\lambda_2)^{2})$ under smooth strongly convex objective functions, and $\mathcal{O}(n^3{C^6}/(1-\lambda_2)^4)$ under smooth nonconvex objective functions, where $(1-\lambda_2)$ denotes the spectral gap of the mixing matrix, and $C>0$ is the compression-related parameter. In particular, CEDAS exhibits the shortest transient times when $C < \mathcal{O}(1/(1 - \lambda_2)^2)$, which is common in practice. Numerical experiments further demonstrate the effectiveness of the proposed algorithm.
comment: 16 pages, 8 figures
Systems and Control (CS)
Generalizability of Graph Neural Networks for Decentralized Unlabeled Motion Planning ICRA 2025
Unlabeled motion planning involves assigning a set of robots to target locations while ensuring collision avoidance, aiming to minimize the total distance traveled. The problem forms an essential building block for multi-robot systems in applications such as exploration, surveillance, and transportation. We address this problem in a decentralized setting where each robot knows only the positions of its $k$-nearest robots and $k$-nearest targets. This scenario combines elements of combinatorial assignment and continuous-space motion planning, posing significant scalability challenges for traditional centralized approaches. To overcome these challenges, we propose a decentralized policy learned via a Graph Neural Network (GNN). The GNN enables robots to determine (1) what information to communicate to neighbors and (2) how to integrate received information with local observations for decision-making. We train the GNN using imitation learning with the centralized Hungarian algorithm as the expert policy, and further fine-tune it using reinforcement learning to avoid collisions and enhance performance. Extensive empirical evaluations demonstrate the scalability and effectiveness of our approach. The GNN policy trained on 100 robots generalizes to scenarios with up to 500 robots, outperforming state-of-the-art solutions by 8.6% on average and significantly surpassing greedy decentralized methods. This work lays the foundation for solving multi-robot coordination problems in settings where scalability is important.
comment: 6 pages, 6 figures, submitted to ICRA 2025
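The assignment problem that the centralized expert solves can be made concrete with a small sketch. The abstract names the Hungarian algorithm; the code below instead enumerates all permutations, which returns the same minimum-total-distance assignment but is only practical for a handful of robots. The function name and coordinates are hypothetical, and collision avoidance is not modeled.

```python
import itertools
import math


def optimal_assignment(robots, targets):
    """Return (assignment, cost), where assignment[i] is the target index for robot i."""
    if len(robots) != len(targets):
        raise ValueError("expected as many targets as robots")
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(targets))):
        cost = sum(math.dist(robots[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost


robots = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
targets = [(4.0, 1.0), (0.0, 4.0), (1.0, 0.0)]
assignment, total = optimal_assignment(robots, targets)
```

In this toy layout each robot has a target exactly one unit away, so the optimal assignment pairs each robot with its nearest target; in general the optimum need not coincide with a greedy nearest-target choice.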
Energy Saving and Traffic Steering Use Case and Testing by O-RAN RIC xApp/rApp Multi-vendor Interoperability
This paper discusses the use case of energy saving and traffic steering in O-RAN, the multi-vendor interoperability mechanism that makes it work, and its test methodology.
comment: 6 pages, 8 figures
Adaptive Event-triggered Reinforcement Learning Control for Complex Nonlinear Systems
In this paper, we propose an adaptive event-triggered reinforcement learning control for continuous-time nonlinear systems, subject to bounded uncertainties, characterized by complex interactions. Specifically, the proposed method is capable of jointly learning both the control policy and the communication policy, thereby reducing the number of parameters and the computational overhead compared with learning them separately or learning only one of them. By augmenting the state space with accrued rewards that represent the performance over the entire trajectory, we show that accurate and efficient determination of triggering conditions is possible without the need for explicitly learning triggering conditions, thereby leading to an adaptive non-stationary policy. Finally, we provide several numerical examples to demonstrate the effectiveness of the proposed approach.
Parameter Estimation in Optimal Tolling for Traffic Networks Under the Markovian Traffic Equilibrium
Tolling, or congestion pricing, has emerged as an effective tool for preventing gridlock in traffic systems. However, tolls are currently mostly designed on route-based traffic assignment models (TAM), which may be unrealistic and computationally expensive. Existing approaches also impractically assume that the central tolling authority can access latency function parameters that characterize the time required to traverse each network arc (edge), as well as the entropy parameter $\beta$ that characterizes commuters' stochastic arc-selection decisions on the network. To address these issues, this work formulates an online learning algorithm that simultaneously refines estimates of linear arc latency functions and entropy parameters in an arc-based TAM, while implementing tolls on each arc to induce equilibrium flows that minimize overall congestion on the network. We prove that our algorithm incurs regret upper bounded by $O(\sqrt{T} \ln(T) |\arcsMod| \max\{|\nodesMod| \ln(|\arcsMod|/|\nodesMod|), B \})$, where $T$ denotes the total iteration count, $|\arcsMod|$ and $|\nodesMod|$ denote the total number of arcs and nodes in the network, respectively, and $B$ describes the number of arcs required to construct an estimate of $\beta$ (usually $\ll |I|$). Finally, we present numerical results on simulated traffic networks that validate our theoretical contributions.
Constrained Reinforcement Learning for Safe Heat Pump Control
Constrained Reinforcement Learning (RL) has emerged as a significant research area within RL, where integrating constraints with rewards is crucial for enhancing safety and performance across diverse control tasks. In the context of heating systems in buildings, optimizing energy efficiency while maintaining the residents' thermal comfort can be intuitively formulated as a constrained optimization problem. However, solving it with RL may require a large amount of data. Therefore, an accurate and versatile simulator is favored. In this paper, we propose a novel building simulator, I4B, which provides interfaces for different usages, and apply a model-free constrained RL algorithm named constrained Soft Actor-Critic with Linear Smoothed Log Barrier function (CSAC-LB) to the heating optimization problem. Benchmarking against baseline algorithms demonstrates CSAC-LB's efficiency in data exploration, constraint satisfaction and performance.
Generating peak-aware pseudo-measurements for low-voltage feeders using metadata of distribution system operators
Distribution system operators (DSOs) must cope with new challenges such as the reconstruction of distribution grids along climate neutrality pathways or the ability to manage and control consumption and generation in the grid. In order to meet the challenges, measurements within the distribution grid often form the basis for DSOs. Hence, it is an urgent problem that measurement devices are not installed in many low-voltage (LV) grids. In order to overcome this problem, we present an approach to estimate pseudo-measurements for non-measured LV feeders based on the metadata of the respective feeder using regression models. The feeder metadata comprise information about the number of grid connection points, the installed power of consumers and producers, and billing data in the downstream LV grid. Additionally, we use weather data, calendar data and timestamp information as model features. The existing measurements are used as model target. We extensively evaluate the estimated pseudo-measurements on a large real-world dataset with 2,323 LV feeders characterized by both consumption and feed-in. For this purpose, we introduce peak metrics inspired by the BigDEAL challenge for the peak magnitude, timing and shape for both consumption and feed-in. As regression models, we use XGBoost, a multilayer perceptron (MLP) and a linear regression (LR). We observe that XGBoost and MLP outperform the LR. Furthermore, the results show that the approach adapts to different weather, calendar and timestamp conditions and produces realistic load curves based on the feeder metadata. In the future, the approach can be adapted to other grid levels like substation transformers and can supplement research fields like load modeling, state estimation and LV load forecasting.
comment: 17 pages, 9 figures, 8 tables
Obstacle-Aware Quadrupedal Locomotion With Resilient Multi-Modal Reinforcement Learning
Quadrupedal robots hold promising potential for applications in navigating cluttered environments with resilience akin to their animal counterparts. However, their floating base configuration makes them vulnerable to real-world uncertainties, yielding substantial challenges in their locomotion control. Deep reinforcement learning has become one of the plausible alternatives for realizing a robust locomotion controller. However, the approaches that rely solely on proprioception sacrifice collision-free locomotion because they require front-feet contact to detect the presence of stairs to adapt the locomotion gait. Meanwhile, incorporating exteroception necessitates a precisely modeled map observed by exteroceptive sensors over a period of time. Therefore, this work proposes a novel method to fuse proprioception and exteroception featuring a resilient multi-modal reinforcement learning. The proposed method yields a controller that showcases agile locomotion performance on a quadrupedal robot over a myriad of real-world courses, including rough terrains, steep slopes, and high-rise stairs, while retaining its robustness against out-of-distribution situations.
comment: Under review. Project site is available at https://dreamwaqpp.github.io
Fine-Tuning Hybrid Physics-Informed Neural Networks for Vehicle Dynamics Model Estimation
Accurate dynamic modeling is critical for autonomous racing vehicles, especially during high-speed and agile maneuvers where precise motion prediction is essential for safety. Traditional parameter estimation methods face limitations such as reliance on initial guesses, labor-intensive fitting procedures, and complex testing setups. On the other hand, purely data-driven machine learning methods struggle to capture inherent physical constraints and typically require large datasets for optimal performance. To address these challenges, this paper introduces the Fine-Tuning Hybrid Dynamics (FTHD) method, which integrates supervised and unsupervised Physics-Informed Neural Networks (PINNs), combining physics-based modeling with data-driven techniques. FTHD fine-tunes a pre-trained Deep Dynamics Model (DDM) using a smaller training dataset, delivering superior performance compared to state-of-the-art methods such as the Deep Pacejka Model (DPM) and outperforming the original DDM. Furthermore, an Extended Kalman Filter (EKF) is embedded within FTHD (EKF-FTHD) to effectively manage noisy real-world data, ensuring accurate denoising while preserving the vehicle's essential physical characteristics. The proposed FTHD framework is validated through scaled simulations using the BayesRace Physics-based Simulator and full-scale real-world experiments from the Indy Autonomous Challenge. Results demonstrate that the hybrid approach significantly improves parameter estimation accuracy, even with reduced data, and outperforms existing models. EKF-FTHD enhances robustness by denoising real-world data while maintaining physical insights, representing a notable advancement in vehicle dynamics modeling for high-speed autonomous racing.
An Enhanced Semidefinite Relaxation Model Combined with Clique Graph Merging Strategy for Efficient AC Optimal Power Flow Solution
Semidefinite programming (SDP) is widely acknowledged as one of the most effective methods for deriving the tightest lower bounds of the optimal power flow (OPF) problems. In this paper, an enhanced semidefinite relaxation model that integrates tighter $\lambda$-based quadratic convex relaxation, valid inequalities, and optimality-based bound tightening algorithms derived in accordance with the branch thermal limit boundary surface into the SDP framework is presented to further tighten the lower bounds of the feasible region of OPF problems, effectively combining the advantages of these recent advancements. Additionally, the utilization of chordal decomposition in the complex matrix formulation of SDP can significantly accelerate the solution time. Notably, for the same SDP problem, different chordal decompositions can result in varying solution times. To address this problem, this paper proposes a clique graph merging strategy within the complex matrix SDP framework, which assesses clique sizes and the computational burden on interior-point solvers, while reducing the need for hyperparameter tuning and further enhancing the solution efficiency. Finally, the proposed hybrid relaxation model is evaluated using MATPOWER and PGLib-OPF test cases, demonstrating its effectiveness in reducing the optimality gap and validating its computational performance on test cases with up to 13659 nodes.
Methods for Mitigating Uncertainty in Real-Time Operations of a Connected Microgrid
In this paper, we compare the effectiveness of a two-stage control strategy for the energy management system (EMS) of a grid-connected microgrid under uncertain solar irradiance and load demand using a real-world dataset from an island in Southeast Asia (SEA). The first stage computes a day-ahead commitment for power profile exchanged with the main grid, while the second stage focuses on real-time controls to minimize the system operating cost. Given the challenges in accurately forecasting solar irradiance for a long time horizon, scenario-based stochastic programming (SP) is considered for the first stage. For the second stage, as the most recent weather conditions can be used, several methodologies to handle the uncertainties are investigated, including: (1) the rule-based method historically deployed on EMS, (2) model predictive controller (MPC) using either an explicit forecast or scenario-based stochastic forecast, and (3) Deep Reinforcement Learning (DRL) computing its own implicit forecast through a distribution of costs. Performances of these methodologies are compared in terms of precision with a reference control assuming perfect forecast, i.e., representing the minimal achievable operation cost in theory. Obtained results show that MPC with a stochastic forecast outperforms MPC with a simple deterministic prediction. This suggests that using an explicit forecast, even within a short time window, is challenging. Using weather conditions can, however, be more efficient, as demonstrated by DRL (with implicit forecast), outperforming MPC with stochastic forecast by 1.3%.
comment: Published in Sustainable Energy, Grids and Networks 2024
Variance-Reduced Gradient Estimator for Nonconvex Zeroth-Order Distributed Optimization
This paper investigates distributed zeroth-order optimization for smooth nonconvex problems. We propose a novel variance-reduced gradient estimator, which randomly updates one orthogonal direction of the true gradient in each iteration while leveraging historical snapshots for variance correction. By integrating this estimator with a gradient tracking mechanism, we address the trade-off between convergence rate and sampling cost per zeroth-order gradient estimation that exists in current zeroth-order distributed optimization algorithms, which rely on either the 2-point or the $2d$-point gradient estimator. We derive a convergence rate of $\mathcal{O}(d^{\frac{5}{2}}/m)$ for smooth nonconvex functions in terms of the sampling number $m$ and the problem dimension $d$. Numerical simulations comparing our algorithm with existing methods confirm the effectiveness and efficiency of the proposed gradient estimator.
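To make the trade-off in the abstract above concrete, the following is a minimal sketch of the two baseline estimators it mentions: a 2-point random-direction estimator (cheap per call, high variance) and a $2d$-point coordinate-wise estimator (accurate, but $2d$ function queries per call). This is not the paper's variance-reduced estimator; the function names, the smoothing radius `mu`, and the test function `quadratic` are assumptions made for illustration.

```python
import math
import random


def two_point_estimate(f, x, mu=1e-4, rng=random):
    """2-point estimator: one random unit direction, two function queries."""
    d = len(x)
    # A normalized Gaussian vector is uniformly distributed on the unit sphere.
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
    # Scaling by d compensates for E[u u^T] = I / d on the unit sphere.
    return [d * slope * ui for ui in u]


def coordinate_estimate(f, x, mu=1e-4):
    """2d-point estimator: central differences along every coordinate axis."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += mu
        x_minus[i] -= mu
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * mu))
    return grad


def quadratic(x):
    """Illustrative smooth test function; its true gradient is 2x."""
    return sum(xi * xi for xi in x)


x0 = [1.0, -2.0, 0.5]
accurate = coordinate_estimate(quadratic, x0)      # 2 * 3 = 6 queries
noisy = two_point_estimate(quadratic, x0)          # 2 queries, single sample
```

A single 2-point sample is a noisy estimate that only matches the gradient on average, which is the sampling-cost versus accuracy tension the abstract refers to.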
Joint Trajectory Replanning for Mars Ascent Vehicle under Propulsion System Faults: A Suboptimal Learning-Based Warm-Start Approach
During Mars ascent vehicle (MAV) launch missions, when a thrust-drop fault occurs in the propulsion system, general trajectory replanning methods that rely on step-by-step judgments may fail to make timely decisions, potentially leading to mission failure. This paper proposes a suboptimal joint trajectory replanning (SJTR) method, which formulates the joint optimization problem of target orbit and flight trajectory after a fault within a convex optimization framework. By incorporating penalty coefficients for terminal constraints, the optimization solution adheres to the orbit redecision principle, thereby avoiding complex decision-making processes and yielding a concise and rapid solution to the replanning problem. A learning-based warm-start scheme is proposed in conjunction with the designed SJTR method. Offline, a deep neural network (DNN) is trained using a dataset generated by the SJTR method. Online, the DNN provides initial guesses for the time optimization variables based on the current fault situation, improving the solution efficiency and reliability of the algorithm. Numerical simulations of the MAV flight scenario under thrust-drop faults are performed, and Monte Carlo experiments and case studies across all orbit types demonstrate the effectiveness of the proposed method.
Active Inverse Learning in Stackelberg Trajectory Games
Game-theoretic inverse learning is the problem of inferring a player's objectives from their actions. We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system. We propose an active inverse learning method for the leader to infer which hypothesis among a finite set of candidates best describes the follower's objective function. Instead of using passively observed trajectories like existing methods, we actively maximize the differences in the follower's trajectories under different hypotheses by optimizing the leader's control inputs. Compared with uniformly random inputs, the optimized inputs accelerate the convergence of the estimated probability of different hypotheses conditioned on the follower's trajectory. We demonstrate the proposed method in a receding-horizon repeated trajectory game and simulate the results using virtual TurtleBots in Gazebo.
comment: 7 pages, 3 figures, submitted to ACC 2025. Updated previous version with new experiments and figures
Fast Robust Monitoring for Signal Temporal Logic with Value Freezing Operators (STL*)
Researchers have previously proposed augmenting Signal Temporal Logic (STL) with the value freezing operator in order to express engineering properties that cannot be expressed in STL. This augmented logic is known as STL*. Previous algorithms for STL* monitoring were intractable and did not scale to formulae with nested freeze variables. We present offline discrete-time monitoring algorithms with an acceleration heuristic, for both Boolean monitoring and quantitative robustness monitoring. The acceleration heuristic operates over time intervals where subformulae hold true, rather than over the original trace sample points. We validate our algorithms experimentally; the results show that they can monitor long traces for formulae with two or three nested freeze variables. Our work is the first with monitoring algorithm implementations for STL* formulae with nested freeze variables.
comment: Full version of MEMOCODE 2024 paper
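For readers unfamiliar with the two monitoring modes named in the STL* abstract above, here is a minimal sketch of offline discrete-time robustness monitoring for plain STL (no freeze operators, and no acceleration heuristic). It follows the standard quantitative semantics: a predicate's robustness is its signed margin, "always" takes a minimum over the window, and "eventually" takes a maximum. Truncating windows at the end of the trace is an assumption of this sketch, and all function names are hypothetical.

```python
def rob_predicate(trace, threshold):
    """Robustness of the predicate x > threshold at every sample."""
    return [x - threshold for x in trace]


def rob_always(rho, lo, hi):
    """Robustness of G_[lo,hi] phi: minimum of phi's robustness over the window."""
    out = []
    for t in range(len(rho)):
        window = rho[t + lo : t + hi + 1]  # slicing truncates at the trace end
        out.append(min(window, default=float("inf")))
    return out


def rob_eventually(rho, lo, hi):
    """Robustness of F_[lo,hi] phi: maximum of phi's robustness over the window."""
    out = []
    for t in range(len(rho)):
        window = rho[t + lo : t + hi + 1]
        out.append(max(window, default=float("-inf")))
    return out


def boolean_verdict(rho):
    """Boolean monitoring derived from robustness: satisfied where rho > 0."""
    return [r > 0 for r in rho]


trace = [1.0, 3.0, 2.0, 0.5]
rho = rob_predicate(trace, 1.0)          # [0.0, 2.0, 1.0, -0.5]
always_01 = rob_always(rho, 0, 1)        # [0.0, 1.0, -0.5, -0.5]
eventually_02 = rob_eventually(rho, 0, 2)  # [2.0, 2.0, 1.0, -0.5]
```

This naive version rescans every window sample by sample; the abstract's heuristic instead works over intervals where subformulae hold, which this sketch does not attempt.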
Relax, Estimate, and Track: a Simple Battery State-of-charge and State-of-health Estimation Method
Battery management is a critical component of ubiquitous battery-powered energy systems, in which battery state-of-charge (SOC) and state-of-health (SOH) estimations are of crucial importance. Conventional SOC and SOH estimation methods, especially model-based methods, often lack accurate modeling of the open circuit voltage (OCV), have relatively high computational complexity, and lack theoretical analysis. This study introduces a simple SOC and SOH estimation method that overcomes all these weaknesses. The key idea of the proposed method is to momentarily set the cell's current to zero for a few minutes during charging, perform SOC and SOH estimation based on the measured data, and continue tracking the cell's SOC afterward. The method is based on rigorous theoretical analysis, requires no hyperparameter fine-tuning, and is hundreds of times faster than conventional model-based methods. The method is validated on six batteries charged at different C rates and temperatures, realizing fast and accurate estimations under various conditions, with an SOH root mean square error (RMSE) of around 3% and an SOC RMSE of around 1.5%.
comment: Minor changes to texts and figures
Systems and Control (EESS)
Generalizability of Graph Neural Networks for Decentralized Unlabeled Motion Planning ICRA 2025
Unlabeled motion planning involves assigning a set of robots to target locations while ensuring collision avoidance, aiming to minimize the total distance traveled. The problem forms an essential building block for multi-robot systems in applications such as exploration, surveillance, and transportation. We address this problem in a decentralized setting where each robot knows only the positions of its $k$-nearest robots and $k$-nearest targets. This scenario combines elements of combinatorial assignment and continuous-space motion planning, posing significant scalability challenges for traditional centralized approaches. To overcome these challenges, we propose a decentralized policy learned via a Graph Neural Network (GNN). The GNN enables robots to determine (1) what information to communicate to neighbors and (2) how to integrate received information with local observations for decision-making. We train the GNN using imitation learning with the centralized Hungarian algorithm as the expert policy, and further fine-tune it using reinforcement learning to avoid collisions and enhance performance. Extensive empirical evaluations demonstrate the scalability and effectiveness of our approach. The GNN policy trained on 100 robots generalizes to scenarios with up to 500 robots, outperforming state-of-the-art solutions by 8.6% on average and significantly surpassing greedy decentralized methods. This work lays the foundation for solving multi-robot coordination problems in settings where scalability is important.
comment: 6 pages, 6 figures, submitted to ICRA 2025
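The centralized expert and the local observations described in the unlabeled motion planning abstract above can be illustrated with a small sketch. The assignment step below uses exhaustive search over permutations as a stand-in for the Hungarian algorithm: both return a minimum-total-distance assignment, but exhaustive search is only practical for a handful of robots. Collision avoidance is ignored, and the function names and coordinates are assumptions made for illustration.

```python
import math
from itertools import permutations


def total_distance(robots, targets, assignment):
    """Sum of robot-to-target distances; assignment[i] is robot i's target index."""
    return sum(math.dist(robots[i], targets[j]) for i, j in enumerate(assignment))


def optimal_assignment(robots, targets):
    """Exhaustive search over all assignments (stand-in for the Hungarian algorithm)."""
    n = len(robots)
    return min(
        permutations(range(n)),
        key=lambda p: total_distance(robots, targets, p),
    )


def k_nearest(point, others, k):
    """Indices of the k entries of `others` closest to `point`."""
    order = sorted(range(len(others)), key=lambda j: math.dist(point, others[j]))
    return order[:k]


robots = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
targets = [(4.0, 1.0), (0.0, 4.0), (1.0, 0.0)]

expert = optimal_assignment(robots, targets)        # (2, 0, 1)
cost = total_distance(robots, targets, expert)      # 3.0
local_view = k_nearest(robots[0], targets, 2)       # [2, 1]
```

In the decentralized setting each robot would only see something like `local_view`, while the expert used for imitation sees every robot and target at once.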
Energy Saving and Traffic Steering Use Case and Testing by O-RAN RIC xApp/rApp Multi-vendor Interoperability
This paper discusses the energy saving and traffic steering use case in O-RAN and the multi-vendor interoperability mechanism that makes it work, and describes its test methodology.
comment: 6 pages, 8 figures
Adaptive Event-triggered Reinforcement Learning Control for Complex Nonlinear Systems
In this paper, we propose an adaptive event-triggered reinforcement learning control method for continuous-time nonlinear systems that are subject to bounded uncertainties and characterized by complex interactions. Specifically, the proposed method jointly learns the control policy and the communication policy, thereby reducing the number of parameters and the computational overhead compared with learning them separately or learning only one of them. By augmenting the state space with accrued rewards that represent the performance over the entire trajectory, we show that triggering conditions can be determined accurately and efficiently without explicitly learning them, leading to an adaptive non-stationary policy. Finally, we provide several numerical examples to demonstrate the effectiveness of the proposed approach.
Parameter Estimation in Optimal Tolling for Traffic Networks Under the Markovian Traffic Equilibrium
Tolling, or congestion pricing, has emerged as an effective tool for preventing gridlock in traffic systems. However, tolls are currently mostly designed on route-based traffic assignment models (TAM), which may be unrealistic and computationally expensive. Existing approaches also impractically assume that the central tolling authority can access latency function parameters that characterize the time required to traverse each network arc (edge), as well as the entropy parameter $\beta$ that characterizes commuters' stochastic arc-selection decisions on the network. To address these issues, this work formulates an online learning algorithm that simultaneously refines estimates of linear arc latency functions and entropy parameters in an arc-based TAM, while implementing tolls on each arc to induce equilibrium flows that minimize overall congestion on the network. We prove that our algorithm incurs regret upper bounded by $O(\sqrt{T} \ln(T) |\arcsMod| \max\{|\nodesMod| \ln(|\arcsMod|/|\nodesMod|), B \})$, where $T$ denotes the total iteration count, $|\arcsMod|$ and $|\nodesMod|$ denote the total number of arcs and nodes in the network, respectively, and $B$ describes the number of arcs required to construct an estimate of $\beta$ (usually $\ll |I|$). Finally, we present numerical results on simulated traffic networks that validate our theoretical contributions.
Constrained Reinforcement Learning for Safe Heat Pump Control
Constrained Reinforcement Learning (RL) has emerged as a significant research area within RL, where integrating constraints with rewards is crucial for enhancing safety and performance across diverse control tasks. In the context of heating systems in buildings, optimizing energy efficiency while maintaining the residents' thermal comfort can be intuitively formulated as a constrained optimization problem. However, solving it with RL may require a large amount of data, so an accurate and versatile simulator is favored. In this paper, we propose a novel building simulator, I4B, which provides interfaces for different usages, and apply a model-free constrained RL algorithm named constrained Soft Actor-Critic with Linear Smoothed Log Barrier function (CSAC-LB) to the heating optimization problem. Benchmarking against baseline algorithms demonstrates CSAC-LB's efficiency in data exploration, constraint satisfaction, and performance.
Generating peak-aware pseudo-measurements for low-voltage feeders using metadata of distribution system operators
Distribution system operators (DSOs) must cope with new challenges such as the reconstruction of distribution grids along climate neutrality pathways or the ability to manage and control consumption and generation in the grid. In order to meet the challenges, measurements within the distribution grid often form the basis for DSOs. Hence, it is an urgent problem that measurement devices are not installed in many low-voltage (LV) grids. In order to overcome this problem, we present an approach to estimate pseudo-measurements for non-measured LV feeders based on the metadata of the respective feeder using regression models. The feeder metadata comprise information about the number of grid connection points, the installed power of consumers and producers, and billing data in the downstream LV grid. Additionally, we use weather data, calendar data and timestamp information as model features. The existing measurements are used as the model target. We extensively evaluate the estimated pseudo-measurements on a large real-world dataset with 2,323 LV feeders characterized by both consumption and feed-in. For this purpose, we introduce peak metrics inspired by the BigDEAL challenge for the peak magnitude, timing and shape for both consumption and feed-in. As regression models, we use XGBoost, a multilayer perceptron (MLP) and linear regression (LR). We observe that XGBoost and MLP outperform LR. Furthermore, the results show that the approach adapts to different weather, calendar and timestamp conditions and produces realistic load curves based on the feeder metadata. In the future, the approach can be adapted to other grid levels like substation transformers and can supplement research fields like load modeling, state estimation and LV load forecasting.
comment: 17 pages, 9 figures, 8 tables
Obstacle-Aware Quadrupedal Locomotion With Resilient Multi-Modal Reinforcement Learning
Quadrupedal robots hold promising potential for applications in navigating cluttered environments with resilience akin to their animal counterparts. However, their floating base configuration makes them vulnerable to real-world uncertainties, yielding substantial challenges in their locomotion control. Deep reinforcement learning has become one of the plausible alternatives for realizing a robust locomotion controller. However, approaches that rely solely on proprioception sacrifice collision-free locomotion because they require front-feet contact to detect the presence of stairs and adapt the locomotion gait. Meanwhile, incorporating exteroception necessitates a precisely modeled map observed by exteroceptive sensors over a period of time. Therefore, this work proposes a novel method that fuses proprioception and exteroception using resilient multi-modal reinforcement learning. The proposed method yields a controller that showcases agile locomotion performance on a quadrupedal robot over a myriad of real-world courses, including rough terrains, steep slopes, and high-rise stairs, while retaining its robustness against out-of-distribution situations.
comment: Under review. Project site is available at https://dreamwaqpp.github.io
Fine-Tuning Hybrid Physics-Informed Neural Networks for Vehicle Dynamics Model Estimation
Accurate dynamic modeling is critical for autonomous racing vehicles, especially during high-speed and agile maneuvers where precise motion prediction is essential for safety. Traditional parameter estimation methods face limitations such as reliance on initial guesses, labor-intensive fitting procedures, and complex testing setups. On the other hand, purely data-driven machine learning methods struggle to capture inherent physical constraints and typically require large datasets for optimal performance. To address these challenges, this paper introduces the Fine-Tuning Hybrid Dynamics (FTHD) method, which integrates supervised and unsupervised Physics-Informed Neural Networks (PINNs), combining physics-based modeling with data-driven techniques. FTHD fine-tunes a pre-trained Deep Dynamics Model (DDM) using a smaller training dataset, delivering superior performance compared to state-of-the-art methods such as the Deep Pacejka Model (DPM) and outperforming the original DDM. Furthermore, an Extended Kalman Filter (EKF) is embedded within FTHD (EKF-FTHD) to effectively manage noisy real-world data, ensuring accurate denoising while preserving the vehicle's essential physical characteristics. The proposed FTHD framework is validated through scaled simulations using the BayesRace Physics-based Simulator and full-scale real-world experiments from the Indy Autonomous Challenge. Results demonstrate that the hybrid approach significantly improves parameter estimation accuracy, even with reduced data, and outperforms existing models. EKF-FTHD enhances robustness by denoising real-world data while maintaining physical insights, representing a notable advancement in vehicle dynamics modeling for high-speed autonomous racing.
Robotics
SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models
Despite significant advancements in large language models (LLMs) that enhance robot agents' understanding and execution of natural language (NL) commands, ensuring the agents adhere to user-specified constraints remains challenging, particularly for complex commands and long-horizon tasks. To address this challenge, we present three key insights that significantly enhance LLM planners' capability in handling complex tasks: equivalence voting, constrained decoding, and domain-specific fine-tuning. Equivalence voting ensures consistency by generating and sampling multiple Linear Temporal Logic (LTL) formulas from NL commands, grouping equivalent LTL formulas, and selecting the majority group of formulas as the final LTL formula. Constrained decoding then uses the generated LTL formula to constrain the autoregressive inference of plans, ensuring the generated plans conform to the LTL formula. Domain-specific fine-tuning customizes LLMs to produce safe and efficient plans within specific task domains. Our approach, Safe Efficient LLM Planner (SELP), combines these insights to create LLM planners that generate plans adhering to user commands with high confidence. We demonstrate the effectiveness and generalizability of SELP across different robot agents and tasks, including drone navigation and robot manipulation. For drone navigation tasks, SELP outperforms state-of-the-art planners by 10.8% in safety rate (i.e., finishing tasks conforming to NL commands) and by 19.8% in plan efficiency. For robot manipulation tasks, SELP achieves a 20.4% improvement in safety rate. Our datasets for evaluating NL-to-LTL and robot task planning will be released at github.com/lt-asset/selp.
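The equivalence voting step described in the SELP abstract above (group equivalent formulas, pick the majority group) can be sketched as follows. Deciding equivalence of real LTL formulas requires an automata-based or model-checking tool, which is not shown here; as a labeled stand-in, the example compares two-variable propositional formulas by truth table. The names `equivalence_vote` and `truth_table_equal` and the candidate formulas are hypothetical and are not taken from the SELP code.

```python
from itertools import product


def equivalence_vote(candidates, equivalent):
    """Group candidates by equivalence and return a member of the largest group."""
    groups = []
    for cand in candidates:
        for group in groups:
            # Compare against one representative per group.
            if equivalent(group[0], cand):
                group.append(cand)
                break
        else:
            groups.append([cand])
    return max(groups, key=len)[0]


def truth_table_equal(f, g, n_vars=2):
    """Stand-in equivalence check: identical outputs on every Boolean input."""
    return all(
        f(*values) == g(*values)
        for values in product([False, True], repeat=n_vars)
    )


# Three sampled "translations" of the same command; the second one is an outlier.
candidates = [
    lambda a, b: (not a) or b,   # a implies b
    lambda a, b: a and b,        # different meaning
    lambda a, b: b or (not a),   # a implies b, written differently
]

winner = equivalence_vote(candidates, truth_table_equal)
```

Here the first and third candidates form the majority group, so `winner` is the first candidate and the outlier is discarded.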
Robot Guided Evacuation with Viewpoint Constraints IROS 2024
We present a viewpoint-based non-linear Model Predictive Control (MPC) for evacuation guiding robots. Specifically, the proposed MPC algorithm enables evacuation guiding robots to track and guide cooperative human targets in emergency scenarios. Our algorithm accounts for the environment layout as well as the distance between the robot and the human target and the distance to the goal location. A key challenge for an evacuation guiding robot is the trade-off between its planned motion for leading the target toward a goal position and staying in the target's viewpoint while maintaining line-of-sight for guiding. We illustrate the effectiveness of our proposed evacuation guiding algorithm in both simulated and real-world environments with an Unmanned Aerial Vehicle (UAV) guiding a human. Our results suggest that using contextual information from the environment for motion planning increases the visibility of the guiding UAV to the human while achieving a faster total evacuation time.
comment: In proceedings of IEEE/RSJ IROS 2024
Language-guided Robust Navigation for Mobile Robots in Dynamically-changing Environments
In this paper, we develop an embodied AI system for human-in-the-loop navigation with a wheeled mobile robot. We propose a direct yet effective method of monitoring the robot's current plan to detect changes in the environment that impact the intended trajectory of the robot significantly and then query a human for feedback. We also develop a means to parse human feedback expressed in natural language into local navigation waypoints and integrate it into a global planning system, by leveraging a map of semantic features and an aligned obstacle map. Extensive testing in simulation and physical hardware experiments with a resource-constrained wheeled robot tasked to navigate in a real-world environment validate the efficacy and robustness of our method. This work can support applications like precision agriculture and construction, where persistent monitoring of the environment provides a human with information about the environment state.
A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping ICRA 2025
The language-guided robot grasping task requires a robot agent to integrate multimodal information from both visual and linguistic inputs to predict actions for target-driven grasping. While recent approaches utilizing Multimodal Large Language Models (MLLMs) have shown promising results, their extensive computation and data demands limit the feasibility of local deployment and customization. To address this, we propose a novel CLIP-based multimodal parameter-efficient tuning (PET) framework designed for three language-guided object grounding and grasping tasks: (1) Referring Expression Segmentation (RES), (2) Referring Grasp Synthesis (RGS), and (3) Referring Grasp Affordance (RGA). Our approach introduces two key innovations: a bi-directional vision-language adapter that aligns multimodal inputs for pixel-level language understanding and a depth fusion branch that incorporates geometric cues to facilitate robot grasping predictions. Experiment results demonstrate superior performance in the RES object grounding task compared with existing CLIP-based full-model tuning or PET approaches. In the RGS and RGA tasks, our model not only effectively interprets object attributes based on simple language descriptions but also shows strong potential for comprehending complex spatial reasoning scenarios, such as multiple identical objects present in the workspace.
comment: This work has been submitted to ICRA 2025
The Importance of Adaptive Decision-Making for Autonomous Long-Range Planetary Surface Mobility
Long-distance driving is an important component of planetary surface exploration. Unforeseen events often require human operators to adjust mobility plans, but this approach does not scale and will be insufficient for future missions. Interest in self-reliant rovers is increasing; however, the research community has not yet given significant attention to autonomous, adaptive decision-making. In this paper, we look back at specific planetary mobility operations where human-guided adaptive planning played an important role in mission safety and productivity. Inspired by the abilities of human experts, we identify shortcomings of existing autonomous mobility algorithms for robots operating in off-road environments like planetary surfaces. We advocate for adaptive decision-making capabilities such as unassisted learning from past experiences and more reliance on stochastic world models. The aim of this work is to highlight promising research avenues to enhance ground planning tools and, ultimately, long-range autonomy algorithms on board planetary rovers.
comment: Accepted to the International Symposium on Artificial Intelligence, Robotics and Automation in Space (i-SAIRAS'24), Brisbane, Australia, Nov. 19-21, 2024
G3R: Gradient Guided Generalizable Reconstruction ECCV 2024
Large scale 3D scene reconstruction is important for applications such as virtual reality and simulation. Existing neural rendering approaches (e.g., NeRF, 3DGS) have achieved realistic reconstructions on large scenes, but optimize per scene, which is expensive and slow, and exhibit noticeable artifacts under large view changes due to overfitting. Generalizable approaches or large reconstruction models are fast, but primarily work for small scenes/objects and often produce lower quality rendering results. In this work, we introduce G3R, a generalizable reconstruction approach that can efficiently predict high-quality 3D scene representations for large scenes. We propose to learn a reconstruction network that takes the gradient feedback signals from differentiable rendering to iteratively update a 3D scene representation, combining the benefits of high photorealism from per-scene optimization with data-driven priors from fast feed-forward prediction methods. Experiments on urban-driving and drone datasets show that G3R generalizes across diverse large scenes and accelerates the reconstruction process by at least 10x while achieving comparable or better realism compared to 3DGS, and also being more robust to large view changes.
comment: ECCV 2024. Project page: https://waabi.ai/g3r
Steering Prediction via a Multi-Sensor System for Autonomous Racing
Autonomous racing has rapidly gained research attention. Traditionally, racing cars rely on 2D LiDAR as their primary visual system. In this work, we explore the integration of an event camera with the existing system to provide enhanced temporal information. Our goal is to fuse the 2D LiDAR data with event data in an end-to-end learning framework for steering prediction, which is crucial for autonomous racing. To the best of our knowledge, this is the first study addressing this challenging research topic. We start by creating a multisensor dataset specifically for steering prediction. Using this dataset, we establish a benchmark by evaluating various SOTA fusion methods. Our observations reveal that existing methods often incur substantial computational costs. To address this, we apply low-rank techniques to propose a novel, efficient, and effective fusion design. We introduce a new fusion learning policy to guide the fusion process, enhancing robustness against misalignment. Our fusion architecture provides better steering prediction than LiDAR alone, significantly reducing the RMSE from 7.72 to 1.28. Compared to the second-best fusion method, our work represents only 11% of the learnable parameters while achieving better accuracy. The source code, dataset, and benchmark will be released to promote future research.
Intelligent Fish Detection System with Similarity-Aware Transformer
Fish detection during water-land transfer contributes significantly to fisheries. However, manual fish detection through crowd collaboration is inefficient, expensive, and insufficiently accurate. To further enhance the water-land transfer efficiency, improve detection accuracy, and reduce labor costs, this work designs a new type of lightweight, plug-and-play edge intelligent vision system that automatically conducts fast fish detection with a high-speed camera. Moreover, a novel similarity-aware vision Transformer for fast fish detection (FishViT) is proposed to identify every single fish in a dense and similar group onboard. Specifically, a novel similarity-aware multi-level encoder is developed to enhance multi-scale features in parallel, thereby yielding discriminative representations for varying-size fish. Additionally, a new soft-threshold attention mechanism is introduced, which not only effectively eliminates background noise from images but also accurately recognizes both the edge details and overall features of different similar fish. To establish a benchmark, 85 challenging video sequences with high frame rate and high resolution are collected from real fish water-land transfer scenarios. Exhaustive evaluation on this benchmark demonstrates the robustness and effectiveness of FishViT, which runs at over 80 FPS. Real work scenario tests validate the practicality of the proposed method. The code and demo video are available at https://github.com/vision4robotics/FishViT.
Gesture Recognition for Feedback Based Mixed Reality and Robotic Fabrication: A Case Study of the UnLog Tower
Mixed Reality (MR) platforms enable users to interact with three-dimensional holographic instructions during the assembly and fabrication of highly custom and parametric architectural constructions without the necessity of two-dimensional drawings. Previous MR fabrication projects have primarily relied on digital menus and custom buttons as the interface for user interaction with the MR environment. Despite this approach being widely adopted, it is limited in its ability to allow for direct human interaction with physical objects to modify fabrication instructions within the MR environment. This research integrates user interactions with physical objects through real-time gesture recognition as input to modify, update or generate new digital information enabling reciprocal stimuli between the physical and the virtual environment. Consequently, the digital environment is generative of the user's provided interaction with physical objects to allow seamless feedback in the fabrication process. This research investigates gesture recognition for feedback-based MR workflows for robotic fabrication, human assembly, and quality control in the construction of the UnLog Tower.
comment: 16 pages, 16 figures. Published in the Proceedings of the International Conference on Computational Design and Robotic Fabrication (CDRF) 2023
Symmetry Preservation in Swarms of Oblivious Robots with Limited Visibility
In the general pattern formation (GPF) problem, a swarm of simple autonomous, disoriented robots must form a given pattern. The robots' simplicity implies a strong limitation: when the initial configuration is rotationally symmetric, only patterns with a similar symmetry can be formed [Yamashita, Suzuki; TCS 2010]. The only known algorithm to form large patterns with limited visibility and without memory requires the robots to start in a near-gathering (a swarm of constant diameter) [Hahn et al.; SAND 2024]. However, not only do we not know any near-gathering algorithm guaranteed to preserve symmetry, but most natural gathering strategies trivially increase symmetries [Castenow et al.; OPODIS 2022]. Thus, we study near-gathering without changing the swarm's rotational symmetry for disoriented, oblivious robots with limited visibility (the OBLOT-model, see [Flocchini et al.; 2019]). We introduce a technique based on the theory of dynamical systems to analyze how a given algorithm affects symmetry and provide sufficient conditions for symmetry preservation. Until now, it was unknown whether the considered OBLOT-model allows for any non-trivial algorithm that always preserves symmetry. Our first result shows that a variant of Go-to-the-Average always preserves symmetry but may sometimes lead to multiple, unconnected near-gathering clusters. Our second result is a symmetry-preserving near-gathering algorithm that works on swarms with a convex boundary (the outer boundary of the unit disc graph) and without holes (circles of diameter 1 inside the boundary without any robots).
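To make the Go-to-the-Average idea concrete, here is a minimal Python sketch of the plain textbook rule, not the specific variant analyzed in the paper. It assumes fully synchronous rounds, a viewing radius of 1, and robots represented as complex numbers; the helper names (`go_to_the_average_step`, `is_rotationally_symmetric`) are hypothetical.

```python
import cmath
import math

VISIBILITY = 1.0  # assumed viewing radius


def go_to_the_average_step(positions, visibility=VISIBILITY):
    """One synchronous round: every robot moves to the mean position of
    all robots it can see (itself included)."""
    new_positions = []
    for p in positions:
        visible = [q for q in positions if abs(q - p) <= visibility]
        new_positions.append(sum(visible) / len(visible))
    return new_positions


def is_rotationally_symmetric(positions, k, tol=1e-9):
    """Return True if rotating the swarm by 2*pi/k about its centroid puts
    every rotated robot on top of some robot of the original swarm."""
    centroid = sum(positions) / len(positions)
    rot = cmath.exp(2j * math.pi / k)
    rotated = [centroid + (p - centroid) * rot for p in positions]
    return all(min(abs(r - p) for p in positions) < tol for r in rotated)


# Square with side 0.9: each robot sees its two neighbours, but not the
# opposite corner (the diagonal is about 1.27, beyond the viewing radius).
swarm = [0.45 * complex(x, y) for x, y in [(1, 1), (-1, 1), (-1, -1), (1, -1)]]
next_swarm = go_to_the_average_step(swarm)
```

In this toy configuration the square contracts toward its centre while keeping its 4-fold rotational symmetry, which is the kind of behaviour the symmetry-preservation result is about; the sketch says nothing about the general case or about the cluster-splitting behaviour mentioned in the abstract.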
Fast and Accurate Task Planning using Neuro-Symbolic Language Models and Multi-level Goal Decomposition
In robotic task planning, symbolic planners using rule-based representations like PDDL are effective but struggle with long sequential tasks in complicated planning environments because the search space grows exponentially. Recently, Large Language Models (LLMs) based on artificial neural networks have emerged as promising alternatives for autonomous robot task planning, offering faster inference and leveraging commonsense knowledge. However, they typically suffer from lower success rates. In this paper, to address the limitations of current symbolic (slow speed) and LLM-based (low accuracy) approaches, we propose a novel neuro-symbolic task planner that decomposes complex tasks into subgoals using an LLM and carries out task planning for each subgoal using either symbolic or MCTS-based LLM planners, depending on the subgoal complexity. Generating subgoals helps reduce planning time and improve success rates by narrowing the overall search space and enabling LLMs to focus on smaller, more manageable tasks. Our method significantly reduces planning time while maintaining a competitive success rate, as demonstrated through experiments in different public task planning domains, as well as real-world and simulated robotics environments.
Learning to Bridge the Gap: Efficient Novelty Recovery with Planning and Reinforcement Learning
The real world is unpredictable. Therefore, to solve long-horizon decision-making problems with autonomous robots, we must construct agents that are capable of adapting to changes in the environment during deployment. Model-based planning approaches can enable robots to solve complex, long-horizon tasks in a variety of environments. However, such approaches tend to be brittle when deployed into an environment featuring a novel situation that their underlying model does not account for. In this work, we propose to learn a "bridge policy" via Reinforcement Learning (RL) to adapt to such novelties. We introduce a simple formulation for such learning, where the RL problem is constructed with a special "CallPlanner" action that terminates the bridge policy and hands control of the agent back to the planner. This allows the RL policy to learn the set of states in which querying the planner and following the returned plan will achieve the goal. We show that this formulation enables the agent to rapidly learn by leveraging the planner's knowledge to avoid challenging long-horizon exploration caused by sparse reward. In experiments across three different simulated domains of varying complexity, we demonstrate that our approach is able to learn policies that adapt to novelty more efficiently than several baselines, including a pure RL baseline. We also demonstrate that the learned bridge policy is generalizable in that it can be combined with the planner to enable the agent to solve more complex tasks with multiple instances of the encountered novelty.
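The CallPlanner formulation can be illustrated with a toy Python sketch. Everything below is hypothetical and chosen only for illustration: the 1-D corridor, the `open_door` action, the reward of 1 on success, and the helper names. It does no learning; it only shows how an action that terminates the bridge policy and hands control to the planner turns a sparse goal reward into a signal about when calling the planner is worthwhile.

```python
from dataclasses import dataclass

CALL_PLANNER = "CallPlanner"  # special action named in the abstract
OPEN_DOOR = "open_door"       # hypothetical novelty-handling action
RIGHT = "right"               # hypothetical primitive action


@dataclass
class Corridor:
    """Hypothetical 1-D world. The goal is cell `length`; a closed door at
    cell `door` is the novelty the planner's model does not know about."""
    length: int = 5
    door: int = 2
    agent: int = 0
    door_open: bool = False

    def move_right(self) -> bool:
        nxt = self.agent + 1
        if nxt > self.length or (nxt == self.door and not self.door_open):
            return False  # blocked by the wall or the closed door
        self.agent = nxt
        return True

    def at_goal(self) -> bool:
        return self.agent == self.length


def planner_rollout(world: Corridor) -> bool:
    """The planner's plan is simply 'move right until the goal'; it fails
    if the door it does not model is still closed."""
    while not world.at_goal():
        if not world.move_right():
            return False
    return True


def bridge_step(world: Corridor, action: str):
    """One step of the bridge-policy problem. Returns (reward, done)."""
    if action == CALL_PLANNER:
        # Terminates the bridge policy; reward depends on the planner's success.
        return (1.0 if planner_rollout(world) else 0.0), True
    if action == OPEN_DOOR:
        if world.agent == world.door - 1:
            world.door_open = True
        return 0.0, False
    if action == RIGHT:
        world.move_right()
        return 0.0, False
    raise ValueError(f"unknown action: {action}")


def run_episode(actions) -> float:
    """Apply a fixed action sequence and return the total reward."""
    world, total = Corridor(), 0.0
    for action in actions:
        reward, done = bridge_step(world, action)
        total += reward
        if done:
            break
    return total
```

Calling the planner immediately (`run_episode([CALL_PLANNER])`) earns nothing because the plan runs into the closed door, whereas `run_episode([RIGHT, OPEN_DOOR, CALL_PLANNER])` earns the goal reward after only three bridge-policy steps. An RL agent trained on such a problem would only need to explore the short prefix that resolves the novelty, not the whole route to the goal.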
RAIL: Reachability-Aided Imitation Learning for Safe Policy Execution
Imitation learning (IL) has shown great success in learning complex robot manipulation tasks. However, there remains a need for practical safety methods to justify widespread deployment. In particular, it is important to certify that a system obeys hard constraints on unsafe behavior in settings where it is unacceptable to design a tradeoff between performance and safety via tuning the policy (i.e., soft constraints). This leads to the question: how does enforcing hard constraints impact the performance (meaning safely completing tasks) of an IL policy? To answer this question, this paper builds a reachability-based safety filter to enforce hard constraints on IL, which we call Reachability-Aided Imitation Learning (RAIL). Through evaluations with state-of-the-art IL policies in mobile robots and manipulation tasks, we make two key findings. First, the highest-performing policies are sometimes only so because they frequently violate constraints, and significantly lose performance under hard constraints. Second, surprisingly, hard constraints on the lower-performing policies can occasionally increase their ability to perform tasks safely. Finally, hardware evaluation confirms the method can operate in real time.
comment: * denotes equal contribution
Complete and Near-Optimal Robotic Crack Coverage and Filling in Civil Infrastructure
We present a simultaneous sensor-based inspection and footprint coverage (SIFC) planning and control design with applications to autonomous robotic crack mapping and filling. The main challenge of the SIFC problem lies in the coupling of complete sensing (for mapping) and robotic footprint (for filling) coverage tasks. Initially, we assume known target information (e.g., cracks) and employ classic cell decomposition methods to achieve complete sensing coverage of the workspace and complete robotic footprint coverage using the least-cost route. Subsequently, we generalize the algorithm to handle unknown target information, allowing the robot to scan and incrementally construct the target map online while conducting robotic footprint coverage. The online polynomial-time SIFC planning algorithm minimizes the total robot traveling distance, guarantees complete sensing coverage of the entire workspace, and achieves near-optimal robotic footprint coverage, as demonstrated through experiments. For the demonstrated application, we design coordinated nozzle motion control with the planned robot trajectory to efficiently fill all cracks within the robot's footprint. Experimental results illustrate the algorithm's design, performance, and comparisons. The SIFC algorithm offers a high-efficiency motion planning solution for various robotic applications requiring simultaneous sensing and actuation coverage.
SplatSim: Zero-Shot Sim2Real Transfer of RGB Manipulation Policies Using Gaussian Splatting
Sim2Real transfer, particularly for manipulation policies relying on RGB images, remains a critical challenge in robotics due to the significant domain shift between synthetic and real-world visual data. In this paper, we propose SplatSim, a novel framework that leverages Gaussian Splatting as the primary rendering primitive to reduce the Sim2Real gap for RGB-based manipulation policies. By replacing traditional mesh representations with Gaussian Splats in simulators, SplatSim produces highly photorealistic synthetic data while maintaining the scalability and cost-efficiency of simulation. We demonstrate the effectiveness of our framework by training manipulation policies within SplatSim and deploying them in the real world in a zero-shot manner, achieving an average success rate of 86.25%, compared to 97.5% for policies trained on real-world data. Videos can be found on our project page: https://splatsim.github.io
KinScene: Model-Based Mobile Manipulation of Articulated Scenes
Sequentially interacting with articulated objects is crucial for a mobile manipulator to operate effectively in everyday environments. To enable long-horizon tasks involving articulated objects, this study explores building scene-level articulation models for indoor scenes through autonomous exploration. While previous research has studied mobile manipulation with articulated objects by considering object kinematic constraints, it primarily focuses on individual-object scenarios and lacks extension to a scene-level context for task-level planning. To manipulate multiple object parts sequentially, the robot needs to reason about the resultant motion of each part and anticipate its impact on future actions. We introduce KinScene, a full-stack approach for long-horizon manipulation tasks with articulated objects. The robot maps the scene, detects and physically interacts with articulated objects, collects observations, and infers the articulation properties. For sequential tasks, the robot plans a feasible series of object interactions based on the inferred articulation model. We demonstrate that our approach repeatably constructs accurate scene-level kinematic and geometric models, enabling long-horizon mobile manipulation in a real-world scene. Code and additional results are available at https://chengchunhsu.github.io/KinScene/
Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy
For reliable autonomous robot navigation in urban settings, the robot must have the ability to identify semantically traversable terrains in the image based on the semantic understanding of the scene. This reasoning ability is based on semantic traversability, which is frequently achieved using semantic segmentation models fine-tuned on the testing domain. This fine-tuning process often involves manual data collection with the target robot and annotation by human labelers which is prohibitively expensive and unscalable. In this work, we present an effective methodology for training a semantic traversability estimator using egocentric videos and an automated annotation process. Egocentric videos are collected from a camera mounted on a pedestrian's chest. The dataset for training the semantic traversability estimator is then automatically generated by extracting semantically traversable regions in each video frame using a recent foundation model in image segmentation and its prompting technique. Extensive experiments with videos taken across several countries and cities, covering diverse urban scenarios, demonstrate the high scalability and generalizability of the proposed annotation method. Furthermore, performance analysis and real-world deployment for autonomous robot navigation showcase that the trained semantic traversability estimator is highly accurate, able to handle diverse camera viewpoints, computationally light, and real-world applicable. The summary video is available at https://youtu.be/EUVoH-wA-lA.
comment: Accepted to IEEE Robotics and Automation Letters (RA-L) 2024, First two authors contributed equally
Certifiably Correct Range-Aided SLAM
We present the first algorithm to efficiently compute certifiably optimal solutions to range-aided simultaneous localization and mapping (RA-SLAM) problems. Robotic navigation systems increasingly incorporate point-to-point ranging sensors, leading to state estimation problems in the form of RA-SLAM. However, the RA-SLAM problem is significantly more difficult to solve than traditional pose-graph SLAM: ranging sensor models introduce non-convexity and single range measurements do not uniquely determine the transform between the involved sensors. As a result, RA-SLAM inference is sensitive to initial estimates yet lacks reliable initialization techniques. Our approach, certifiably correct RA-SLAM (CORA), leverages a novel quadratically constrained quadratic programming (QCQP) formulation of RA-SLAM to relax the RA-SLAM problem to a semidefinite program (SDP). CORA solves the SDP efficiently using the Riemannian Staircase methodology; the SDP solution provides both (i) a lower bound on the RA-SLAM problem's optimal value, and (ii) an approximate solution of the RA-SLAM problem, which can be subsequently refined using local optimization. CORA applies to problems with arbitrary pose-pose, pose-landmark, and ranging measurements and, due to using convex relaxation, is insensitive to initialization. We evaluate CORA on several real-world problems. In contrast to state-of-the-art approaches, CORA is able to obtain high-quality solutions on all problems despite being initialized with random values. Additionally, we study the tightness of the SDP relaxation with respect to important problem parameters: the number of (i) robots, (ii) landmarks, and (iii) range measurements. These experiments demonstrate that the SDP relaxation is often tight and reveal relationships between graph connectivity and the tightness of the SDP relaxation.
comment: Accepted to Transactions on Robotics (T-RO)
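As a rough illustration of the relaxation pipeline described in the abstract, the following LaTeX sketch shows one standard way to make a range term quadratic and then relax a generic QCQP to an SDP. The symbols ($t_i$, $\tilde r_{ij}$, $b_{ij}$, $Q$, $A_k$, $c_k$, $Z$) are notation assumed here for illustration, and whether this matches the paper's exact formulation is an assumption.

```latex
% Range term between positions t_i, t_j with measured range \tilde r_{ij} \ge 0,
% rewritten with an auxiliary unit vector b_{ij}:
\bigl(\lVert t_j - t_i\rVert - \tilde r_{ij}\bigr)^2
  \;=\; \min_{\lVert b_{ij}\rVert^2 = 1}\ \lVert t_j - t_i - \tilde r_{ij}\, b_{ij}\rVert^2 .

% Generic QCQP in the stacked variable x, and its SDP relaxation,
% where Z stands in for x x^\top and the rank-one requirement is dropped:
\min_{x}\ x^\top Q x \quad \text{s.t.}\quad x^\top A_k x = c_k,\ k = 1,\dots,m
\qquad\longrightarrow\qquad
\min_{Z \succeq 0}\ \operatorname{tr}(Q Z) \quad \text{s.t.}\quad \operatorname{tr}(A_k Z) = c_k,\ k = 1,\dots,m .
```

Every feasible $x$ yields a feasible $Z = x x^\top$ with the same cost, so the SDP optimum can never exceed the QCQP optimum, which is why it serves as a lower bound. When the optimal $Z$ happens to have rank one, the relaxation is tight and a QCQP solution can be read off from it.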
Generalizable whole-body global manipulation of deformable linear objects by dual-arm robot in 3-D constrained environments
Constrained environments are common in practical applications of manipulating deformable linear objects (DLOs), where movements of both DLOs and robots should be constrained. This task is high-dimensional and highly constrained owing to the highly deformable DLOs, dual-arm robots with high degrees of freedom, and 3-D complex environments, which render global planning challenging. Furthermore, accurate DLO models needed for planning are often unavailable owing to their strong nonlinearity and diversity, resulting in unreliable planned paths. This article focuses on the global moving and shaping of DLOs in constrained environments by dual-arm robots. The main objectives are 1) to efficiently and accurately accomplish this task, and 2) to achieve generalizable and robust manipulation of various DLOs. To this end, we propose a complementary framework with whole-body planning and control using appropriate DLO model representations. First, a global planner is proposed to efficiently find feasible solutions based on a simplified DLO energy model, which considers the full system states and all constraints to plan more reliable paths. Then, a closed-loop manipulation scheme is proposed to compensate for the modeling errors and enhance the robustness and accuracy, which incorporates a model predictive controller that adjusts the robot motion in real time based on an adaptive DLO motion model. The key novelty is that our framework can efficiently solve the high-dimensional problem subject to multiple constraints and generalize to various DLOs without elaborate model identifications. Experiments demonstrate that our framework can accomplish considerably more complicated tasks than existing works, with significantly higher efficiency, generalizability, and reliability.
comment: Accepted by IJRR. Project website: https://mingrui-yu.github.io/DLO_planning_2
Real-time Planning of Minimum-time Trajectories for Agile UAV Flight
We address the challenge of real-time planning of minimum-time trajectories over multiple waypoints, onboard multirotor UAVs. Previous works demonstrated that achieving a truly time-optimal trajectory is computationally too demanding to enable frequent replanning during agile flight, especially on less powerful flight computers. Our approach overcomes this stumbling block by utilizing a point-mass model with a novel iterative thrust decomposition algorithm, enabling the UAV to use all of its collective thrust, something previous point-mass approaches could not achieve. The approach enables gravity and drag modeling integration, significantly reducing tracking errors in high-speed trajectories, as shown in an ablation study. When combined with a new multi-waypoint optimization algorithm, which uses a gradient-based method to converge to optimal velocities in waypoints, the proposed method generates minimum-time multi-waypoint trajectories within milliseconds. The proposed approach, which we provide as an open-source package, is validated both in simulation and in the real world, using Nonlinear Model Predictive Control. With accelerations of up to 3.5g and speeds over 100 km/h, trajectories generated by the proposed method yield similar or even smaller tracking errors than the trajectories generated for a full multirotor model.
Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation NeurIPS 2024
Despite significant progress in robotics and embodied AI in recent years, deploying robots for long-horizon tasks remains a great challenge. The majority of prior work adheres to an open-loop philosophy and lacks real-time feedback, leading to error accumulation and poor robustness. A handful of approaches have endeavored to establish feedback mechanisms leveraging pixel-level differences or pre-trained visual representations, yet their efficacy and adaptability have been found to be constrained. Inspired by classic closed-loop control systems, we propose CLOVER, a closed-loop visuomotor control framework that incorporates feedback mechanisms to improve adaptive robotic control. CLOVER consists of a text-conditioned video diffusion model for generating visual plans as reference inputs, a measurable embedding space for accurate error quantification, and a feedback-driven controller that refines actions from feedback and initiates replans as needed. Our framework exhibits notable advancement in real-world robotic tasks and achieves state-of-the-art performance on the CALVIN benchmark, improving by 8% over previous open-loop counterparts. Code and checkpoints are maintained at https://github.com/OpenDriveLab/CLOVER.
comment: Accepted at NeurIPS 2024. Code and models: https://github.com/OpenDriveLab/CLOVER
A Central Motor System Inspired Pre-training Reinforcement Learning for Robotic Control
The development of intelligent robots requires control policies that can handle dynamic environments and evolving tasks. Pre-training reinforcement learning has emerged as an effective approach to address these demands by enabling robots to acquire reusable motor skills. However, existing methods often rely on large datasets or expert-designed goal spaces, limiting adaptability. Additionally, these methods struggle to generate dynamic and diverse skills in high-dimensional state spaces, reducing their effectiveness for downstream tasks. In this paper, we propose CMS-PRL, a pre-training reinforcement learning method inspired by the Central Motor System (CMS). First, we introduce a fusion reward mechanism that combines the basic motor reward with mutual information reward, promoting the discovery of dynamic skills during pre-training without reliance on external data. Second, we design a skill encoding method inspired by the motor program of the basal ganglia, providing rich and continuous skill instructions during pre-training. Finally, we propose a skill activity function to regulate motor skill activity, enabling the generation of skills with different activity levels, thereby enhancing the robot's flexibility in downstream tasks. We evaluate the model on four types of robots in a challenging set of sparse-reward tasks. Experimental results demonstrate that CMS-PRL generates diverse, reusable motor skills to solve various downstream tasks and outperforms baseline methods, particularly in high-degree-of-freedom robots and complex tasks.
comment: 12 pages; 9 figures
Deep Attention Driven Reinforcement Learning (DAD-RL) for Autonomous Decision-Making in Dynamic Environment
Autonomous Vehicle (AV) decision making in urban environments is inherently challenging due to the dynamic interactions with surrounding vehicles. For safe planning, an AV must understand the relative weight of various spatiotemporal interactions in a scene. Contemporary works use colossal transformer architectures to encode interactions mainly for trajectory prediction, resulting in increased computational complexity. To address this issue without compromising spatiotemporal understanding and performance, we propose the simple Deep Attention Driven Reinforcement Learning (DAD-RL) framework, which dynamically assigns and incorporates the significance of surrounding vehicles into the ego vehicle's RL-driven decision-making process. We introduce an AV-centric spatiotemporal attention encoding (STAE) mechanism for learning the dynamic interactions with different surrounding vehicles. To understand map and route context, we employ a context encoder to extract features from context maps. The spatiotemporal representations combined with contextual encoding provide a comprehensive state representation. The resulting model is trained using the Soft Actor Critic (SAC) algorithm. We evaluate the proposed framework on the SMARTS urban benchmarking scenarios without traffic signals to demonstrate that DAD-RL outperforms recent state-of-the-art methods. Furthermore, an ablation study underscores the importance of the context encoder and the spatiotemporal attention encoder in achieving superior performance.
comment: 6 pages, 3 figures
RPMArt: Towards Robust Perception and Manipulation for Articulated Objects IROS 2024
Articulated objects are commonly found in daily life. It is essential that robots can exhibit robust perception and manipulation skills for articulated objects in real-world robotic applications. However, existing methods for articulated objects insufficiently address noise in point clouds and struggle to bridge the gap between simulation and reality, thus limiting the practical deployment in real-world scenarios. To tackle these challenges, we propose a framework towards Robust Perception and Manipulation for Articulated Objects (RPMArt), which learns to estimate the articulation parameters and manipulate the articulation part from the noisy point cloud. Our primary contribution is a Robust Articulation Network (RoArtNet) that is able to predict both joint parameters and affordable points robustly by local feature learning and point tuple voting. Moreover, we introduce an articulation-aware classification scheme to enhance its ability for sim-to-real transfer. Finally, with the estimated affordable point and articulation joint constraint, the robot can generate robust actions to manipulate articulated objects. After learning only from synthetic data, RPMArt is able to transfer zero-shot to real-world articulated objects. Experimental results confirm our approach's effectiveness, with our framework achieving state-of-the-art performance in both noise-added simulation and real-world environments. Code, data and more results can be found on the project website at https://r-pmart.github.io.
comment: 8 pages, 7 figures, accepted by 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024), project website at https://r-pmart.github.io
Multiagent Systems
Learning Strategy Representation for Imitation Learning in Multi-Agent Games
The offline datasets for imitation learning (IL) in multi-agent games typically contain player trajectories exhibiting diverse strategies, which necessitate measures to prevent learning algorithms from acquiring undesirable behaviors. Learning representations for these trajectories is an effective approach to depicting the strategies employed by each demonstrator. However, existing learning strategies often require player identification or rely on strong assumptions, which are not appropriate for multi-agent games. Therefore, in this paper, we introduce the Strategy Representation for Imitation Learning (STRIL) framework, which (1) effectively learns strategy representations in multi-agent games, (2) estimates proposed indicators based on these representations, and (3) filters out sub-optimal data using the indicators. STRIL is a plug-in method that can be integrated into existing IL algorithms. We demonstrate the effectiveness of STRIL across competitive multi-agent scenarios, including Two-player Pong, Limit Texas Hold'em, and Connect Four. Our approach successfully acquires strategy representations and indicators, thereby identifying dominant trajectories and significantly enhancing existing IL performance across these environments.
comment: 13 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:2402.18617
Large Language Model-Driven Cross-Domain Orchestration Using Multi-Agent Workflow
We showcase an application that leverages multiple agents, powered by large language models and integrated tools, to collaboratively solve complex network operation tasks across various domains. The tasks include real-time topology retrieval, network optimization using physical models, and fiber switching facilitated by a robotic arm.
Sufficient Conditions on Bipartite Consensus of Weakly Connected Matrix-weighted Networks
Recent advancements in bipartite consensus, a scenario where agents are divided into two disjoint sets with agents in the same set agreeing on a certain value and those in different sets agreeing on opposite or specifically related values, have highlighted its potential applications across various fields. Traditional research typically relies on the presence of a positive-negative spanning tree, which limits the practical applicability of bipartite consensus. This study relaxes that assumption by allowing for weak connectivity within the network, where paths can be weighted by semidefinite matrices. By exploring the algebraic constraints imposed by positive-negative trees and semidefinite paths, we derive sufficient conditions for achieving bipartite consensus. Our theoretical findings are validated through numerical results.
comment: There is a misstatement in Section 3.2 about the condition of the main Theorem, as in "Assumption 2 is a necessary condition". In addition, example in Fig. 2 needs to be adjusted
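To make the notion of bipartite consensus concrete, the following is a minimal sketch under simplifying assumptions: it uses scalar signed edge weights rather than the matrix weights studied in the paper, and a structurally balanced, connected graph rather than the weakly connected case. The function name, the 4-agent graph, and the step size are all hypothetical choices for illustration.

```python
def simulate_bipartite_consensus(weights, x0, dt=0.01, steps=5000):
    """Euler-integrate x_i' = -sum_j |a_ij| * (x_i - sign(a_ij) * x_j).

    `weights` is a symmetric signed adjacency matrix (scalar weights only,
    which is a simplification of the matrix-weighted setting).
    """
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        dx = [0.0] * n
        for i in range(n):
            for j in range(n):
                a = weights[i][j]
                if a != 0:
                    sign = 1.0 if a > 0 else -1.0
                    dx[i] -= abs(a) * (x[i] - sign * x[j])
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


# Hypothetical signed graph: agents 0 and 1 cooperate, agents 2 and 3 cooperate,
# and the two camps are linked by antagonistic (negative) edges.
A = [
    [0, 1, -1, 0],
    [1, 0, 0, -1],
    [-1, 0, 0, 1],
    [0, -1, 1, 0],
]
x_final = simulate_bipartite_consensus(A, [1.0, 2.0, 3.0, -4.0])
```

In this example the two camps settle on values of equal magnitude and opposite sign, which is the behavior the abstract describes as agreeing "on opposite values".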
Tracking and managing deemed abilities
Information about the powers and abilities of acting entities is used to coordinate their actions in societies, either physical or digital. Yet, the commonsensical meaning of an acting entity being deemed able to do something is still missing from the existing specification languages for the web or for multi-agent systems. We advance a general purpose abstract logical account of evidence-based ability. A basic model can be thought of as the ongoing trace of a multi-agent system. Every state records systemic confirmations and disconfirmations of whether an acting entity is able to bring about something. Qualitative inductive reasoning is then used in order to infer what acting entities are deemed able to bring about in the multi-agent system. A temporalised modal language is used to talk about deemed ability, actual agency, and confirmation and disconfirmation of deemed ability. What constitutes a confirmation and a disconfirmation is left to the modeller as in general it depends on the application at hand. So to illustrate the methodology we propose two extended examples, one in practical philosophy, the other in system engineering. We first use a logic of agency and ability to obtain a version of Mele's general practical abilities. Then, we look at the management of abilities in a supervised system.
Geometric Structure and Polynomial-time Algorithm of Game Equilibria
Whether a PTAS (polynomial-time approximation scheme) exists for game equilibria has been an open question, and its absence has indications and consequences in three fields: the practicality of methods in algorithmic game theory, non-stationarity and curse of multiagency in MARL (multi-agent reinforcement learning), and the tractability of PPAD in computational complexity theory. In this paper, we introduce a geometric object called the equilibrium bundle, with which we, first, formalize perfect equilibria of dynamic games as the zero points of its canonical section; second, formalize a hybrid iteration of dynamic programming and the interior point method as a line search on it; and third, give existence and oddness theorems for it as an extension of those for Nash equilibria. The line search leads to any perfect equilibrium of any dynamic game: it achieves a weak approximation (approximating to an $\epsilon$-equilibrium) in fully polynomial time, and it achieves a strong approximation (approximating to an $\epsilon$-neighborhood of an actual equilibrium) with dependent time complexity. Our method is an FPTAS (fully PTAS) for the PPAD-complete weak approximation problem of game equilibria, implying PPAD=FP. As intermediate results, we introduce two concepts called the unbiased barrier problem and unbiased KKT conditions to make the interior point method approximate Nash equilibria, and introduce a concept called the policy cone to give the sufficient and necessary condition for dynamic programming to converge to perfect equilibria. In experiments, the line search process is animated, and the method is tested on 2000 randomly generated dynamic games, where it converges to a perfect equilibrium in every single case.
comment: 28 pages, 5 figures, code and animation are available at https://github.com/shb20tsinghua/PTAS_Game/tree/main
Onboard Ranging-based Relative Localization and Stability for Lightweight Aerial Swarms
Lightweight aerial swarms have potential applications in scenarios where larger drones fail to operate efficiently. The primary foundation for lightweight aerial swarms is efficient relative localization, which enables cooperation and collision avoidance. Computing the real-time position is challenging due to extreme resource constraints. This paper presents an autonomous relative localization technique for lightweight aerial swarms without infrastructure by fusing ultra-wideband wireless distance measurements and the shared state information (e.g., velocity, yaw rate, height) from neighbors. This is the first fully autonomous, tiny, fast, and accurate relative localization scheme implemented on a team of 13 lightweight (33 grams) and resource-constrained (168MHz MCU with 192 KB memory) aerial vehicles. The proposed resource-constrained swarm ranging protocol is scalable, and a surprising theoretical result is discovered: the unobservability poses no issues because the state drift leads to control actions that make the state observable again. By experiment, less than 0.2m position error is achieved at the frequency of 16Hz for as many as 13 drones. The code is open-sourced, and the proposed technique is relevant not only for tiny drones but can be readily applied to many other resource-restricted robots. Video and code can be found at https://shushuai3.github.io/autonomous-swarm/.
comment: Project link: https://shushuai3.github.io/autonomous-swarm/
Systems and Control (CS)
Construction of the Sparsest Maximally $r$-Robust Graphs
In recent years, the notion of r-robustness for the communication graph of the network has been introduced to address the challenge of achieving consensus in the presence of misbehaving agents. Higher r-robustness typically implies higher tolerance to malicious information towards achieving resilient consensus, but it also implies more edges for the communication graph. This in turn conflicts with the need to minimize communication due to limited resources in real-world applications (e.g., multi-robot networks). In this paper, our contributions are twofold. (a) We provide the necessary subgraph structures and tight lower bounds on the number of edges required for graphs with a given number of nodes to achieve maximum robustness. (b) We then use the results of (a) to introduce two classes of graphs that maintain maximum robustness with the least number of edges. Our work is validated through a series of simulations.
comment: Accepted and will appear at IEEE CDC 2024
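As an illustration of the r-robustness notion mentioned above, here is a brute-force sketch for small undirected graphs. It assumes the commonly used definition (a graph is r-robust if, for every pair of nonempty disjoint node subsets, at least one subset contains a node with at least r neighbors outside that subset). The function names are hypothetical, the search is exponential in the number of nodes, and it is not the construction proposed in the paper.

```python
from itertools import combinations


def is_r_reachable(adj, subset, r):
    """True if some node in `subset` has at least r neighbors outside `subset`."""
    return any(len(adj[i] - subset) >= r for i in subset)


def is_r_robust(adj, r):
    """Brute-force check over all pairs of nonempty disjoint node subsets."""
    nodes = list(adj)
    for size1 in range(1, len(nodes)):
        for s1 in combinations(nodes, size1):
            set1 = set(s1)
            rest = [v for v in nodes if v not in set1]
            for size2 in range(1, len(rest) + 1):
                for s2 in combinations(rest, size2):
                    set2 = set(s2)
                    if not (is_r_reachable(adj, set1, r)
                            or is_r_reachable(adj, set2, r)):
                        return False
    return True


def robustness(adj):
    """Largest r (capped at ceil(n/2)) for which the graph is r-robust."""
    max_r = (len(adj) + 1) // 2
    r = 0
    while r < max_r and is_r_robust(adj, r + 1):
        r += 1
    return r


# Complete graph on 5 nodes versus a 4-node cycle (both hypothetical examples).
complete5 = {i: {j for j in range(5) if j != i} for i in range(5)}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
r_complete = robustness(complete5)
r_cycle = robustness(cycle4)
```

The complete graph reaches the maximum value ceil(n/2) but uses every possible edge, which is the trade-off between robustness and edge count that the abstract sets out to resolve.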
Co-investment with Payoff Sharing Benefit Operators and Users in Network Design
Network-based complex systems are inherently interconnected, with the design and performance of subnetworks being interdependent. However, the decisions of self-interested operators may lead to suboptimal outcomes for users. In this paper, we consider the question of what cooperative mechanisms can benefit both operators and users simultaneously. We address this question in a game theoretical setting, integrating both non-cooperative and cooperative game theory. During the non-cooperative stage, subnetwork decision-makers strategically design their local networks. In the cooperative stage, the co-investment mechanism and the payoff-sharing mechanism are developed to enlarge collective benefits and fairly distribute them. A case study of the Sioux Falls network is conducted to demonstrate the efficiency of the proposed framework. The impact of this interactive network design on environmental sustainability, social welfare and economic efficiency is evaluated, along with an examination of scenarios involving regions with heterogeneous characteristics.
comment: 8 pages, 6 figures
Canonical Correlation Guided Deep Neural Network
Learning representations of two views of data such that the resulting representations are highly linearly correlated is appealing in machine learning. In this paper, we present a canonical correlation guided learning framework, which can be realized by deep neural networks (CCDNN), to learn such a correlated representation. It is also a novel merging of multivariate analysis (MVA) and machine learning, which can be viewed as transforming MVA into end-to-end architectures with the aid of neural networks. Unlike linear canonical correlation analysis (CCA), kernel CCA and deep CCA, the optimization formulation in the proposed method is not restricted to maximizing correlation; instead, we impose canonical correlation as a constraint, which preserves the correlated representation learning ability and focuses more on the engineering tasks endowed by the optimization formulation, such as reconstruction, classification and prediction. Furthermore, to reduce the redundancy induced by correlation, a redundancy filter is designed. We illustrate the performance of CCDNN on various tasks. In experiments on the MNIST dataset, the results show that CCDNN has better reconstruction performance in terms of mean squared error and mean absolute error than DCCA and DCCAE. Also, we present the application of the proposed network to industrial fault diagnosis and remaining useful life cases for the classification and prediction tasks, respectively. The proposed method demonstrates superior performance in both tasks when compared to existing methods. An extension of CCDNN to much deeper architectures with the aid of residual connections is also presented in the appendix.
comment: 11 pages, 13 figures
Analytical Construction of CBF-Based Safety Filters for Simultaneous State and Input Constraints
We revisit the problem explored in [1] of guaranteeing satisfaction of multiple simultaneous state constraints applied to a single-input, single-output plant consisting of a chain of n integrators subject to input limitations. For this problem setting, we derive an analytic, easy-to-implement safety filter which respects input limitations and ensures forward-invariance of all state constraints simultaneously. Additionally, we provide a straightforward extension to the multi-input, multi-output chained integrator setting, and provide an analytic safety filter guaranteeing satisfaction of arbitrarily many simultaneous hyperplane constraints on the output vector. Whereas the approach in [1] obtains maximal invariant sets, our approach trades off some degree of conservatism in exchange for a recursive safety filter which is analytic for any arbitrary n >= 1.
comment: To be submitted to the 2025 American Control Conference
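For readers unfamiliar with CBF-based safety filters, the sketch below shows the simplest textbook instance: a single integrator (the n = 1 case) with one upper state bound and a symmetric input limit. It is not the recursive filter derived in the paper; the barrier function h(x) = x_max - x, the gain alpha, and all numerical values are assumptions chosen for illustration.

```python
def safety_filter(x, u_nom, x_max, alpha, u_lim):
    """Analytic CBF filter for x' = u with constraint x <= x_max.

    With h(x) = x_max - x, the condition h' >= -alpha * h reduces to
    u <= alpha * (x_max - x). Inside the safe set this bound is nonnegative,
    so it is always compatible with the input limit |u| <= u_lim.
    """
    u_safe = min(u_nom, alpha * (x_max - x))
    return max(-u_lim, min(u_lim, u_safe))


def simulate(x0, u_nom, x_max=1.0, alpha=5.0, u_lim=1.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of the filtered single integrator."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = safety_filter(x, u_nom, x_max, alpha, u_lim)
        x += dt * u
        trajectory.append(x)
    return trajectory


# A nominal command that would drive the state past x_max if left unfiltered.
trajectory = simulate(x0=0.0, u_nom=2.0)
```

Here the filter first saturates the command at the input limit and then smoothly reduces it as the state approaches the bound, so the state converges toward x_max without crossing it.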
Distributed Optimization via Energy Conservation Laws in Dilated Coordinates
Optimizing problems in a distributed manner is critical for systems involving multiple agents with private data. Despite substantial interest, a unified method for analyzing the convergence rates of distributed optimization algorithms is lacking. This paper introduces an energy conservation approach for analyzing continuous-time dynamical systems in dilated coordinates. Instead of directly analyzing dynamics in the original coordinate system, we establish a conserved quantity, akin to physical energy, in the dilated coordinate system. Consequently, convergence rates can be explicitly expressed in terms of the inverse time-dilation factor. Leveraging this generalized approach, we formulate a novel second-order distributed accelerated gradient flow with a convergence rate of $O\left(1/t^{2-\epsilon}\right)$ in time $t$ for $\epsilon>0$. We then employ a semi second-order symplectic Euler discretization to derive a rate-matching algorithm with a convergence rate of $O\left(1/k^{2-\epsilon}\right)$ in $k$ iterations. To the best of our knowledge, this represents the most favorable convergence rate for any distributed optimization algorithm designed for smooth convex optimization. Its accelerated convergence behavior is benchmarked against various state-of-the-art distributed optimization algorithms on practical, large-scale problems.
comment: 10 pages; (Near) optimal convergence rate
Implicit Euler Discrete-Time Set-Valued Admittance Control for Impact-Contact Force Control
Admittance control is a commonly used strategy for regulating robotic systems, such as quadruped and humanoid robots, allowing them to respond compliantly to contact forces during interactions with their environments. However, it can lead to instability and unsafe behaviors like snapping back and overshooting due to torque saturation from impacts with unknown stiffness environments. This paper introduces a novel admittance controller that ensures stable force control after impacting unknown stiffness environments by leveraging the differentiability of impact-contact forces. The controller is mathematically represented by a differential algebraic inclusion (DAI) comprising two interdependent set-valued loops. The first loop employs set-valued first-order sliding mode control (SMC) to limit input torque post-impact. The second loop utilizes the multivariable super-twisting algorithm (MSTA) to mitigate unstable motion caused by impact forces when interacting with unknown stiffness environments. Implementing this proposed admittance control in digital settings presents challenges due to the interconnected structure of the two set-valued loops, unlike implicit Euler discretization methods for set-valued SMCs. To facilitate implementation, this paper offers a new algorithm for implicit Euler discretization of the DAI. Simulation and experimental results demonstrate that the proposed admittance controller outperforms state-of-the-art methods.
comment: 12 pages, 8 figures
Safe Delay-Adaptive Control of Strict-Feedback Nonlinear Systems with Application in Vehicle Platooning
This paper presents a safe delay-adaptive control for a strict-feedback nonlinear ODE with a delayed actuator, whose dynamics are also a strict-feedback nonlinear ODE and whose delay length is unknown. By formulating the delay as a transport PDE, the plant becomes a sandwich configuration consisting of nonlinear ODE-transport PDE-nonlinear ODE, where the transport speed in the PDE is unknown. We propose a predictor-based nonovershooting backstepping transformation to build the nominal safe delay-compensated control, guaranteeing that the output of the distal ODE safely tracks the target trajectory from one side without undershooting. To address the uncertainty in the delay, we incorporate recent delay-adaptive and safe adaptive technologies to build a safe adaptive-delay controller. The adaptive closed-loop system ensures 1) the exact identification of the unknown delay in finite time; 2) the output state stays in the safe region all the time, especially in the original safe region, instead of a subset, after a finite time; 3) all states are bounded, and moreover, they will converge to zero if the target trajectory is identically zero. In the simulation, the proposed control design is verified in the application of safe vehicle platooning. It regulates the spacing between adjacent vehicles to converge to a small distance and avoids collisions by ensuring they do not breach the safe distance at any time, even in the presence of large unknown delays and at a relatively high speed.
State estimation for parallel-connected batteries via inverse dynamic modeling
This paper examines the problem of estimating the states, including state of charge, of battery cells connected in parallel. Previous research highlights the importance of this problem, and presents multiple approaches for solving it. Algorithm scalability and observability analysis can both be challenging, particularly because the underlying pack dynamics are governed by differential algebraic equations. Our work addresses these challenges from a novel perspective that begins by inverting the causality of parallel pack dynamics, which breaks the pack model's underlying algebraic loop. This simplifies observability analysis and observer design significantly, leading to three novel contributions. First, the paper derives mathematical conditions for state observability that apply regardless of the number of battery cells and the order of their individual dynamics. Second, the paper presents an approach for grouping battery cells such that their lumped dynamics are observable. Finally, the paper presents a novel pack state estimator that achieves computational tractability by employing inverse dynamic modeling. We conclude by presenting a Monte Carlo simulation study of this estimator using experimentally-parameterized models of two battery chemistries. The simulation results highlight the computational benefits of both the clustering strategy and inverse dynamics approach for state estimation.
comment: 27 pages, 7 figures
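The algebraic coupling mentioned in the abstract can be illustrated with a deliberately simple sketch. It assumes each cell is modeled as an open-circuit voltage in series with a resistance (a zeroth-order model that is not taken from the paper), and solves for the shared terminal voltage and the per-cell currents from Kirchhoff's laws. The function name and the numerical values are hypothetical.

```python
def parallel_current_split(ocv, resistance, total_current):
    """Solve the algebraic constraint of a parallel pack.

    Each cell i obeys v = ocv[i] - resistance[i] * i_i (discharge positive),
    all cells share the terminal voltage v, and the currents sum to the
    total pack current.
    """
    conductance_sum = sum(1.0 / r for r in resistance)
    weighted_ocv = sum(e / r for e, r in zip(ocv, resistance))
    v_terminal = (weighted_ocv - total_current) / conductance_sum
    currents = [(e - v_terminal) / r for e, r in zip(ocv, resistance)]
    return v_terminal, currents


# Two cells with equal resistance but different open-circuit voltages.
v_terminal, currents = parallel_current_split(
    ocv=[3.7, 3.6], resistance=[0.05, 0.05], total_current=2.0
)
```

Even in this small case the cell currents are not inputs but outputs of an algebraic solve, which is the loop that the paper's inverse dynamic modeling is designed to break.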
Complete and Near-Optimal Robotic Crack Coverage and Filling in Civil Infrastructure
We present a simultaneous sensor-based inspection and footprint coverage (SIFC) planning and control design with applications to autonomous robotic crack mapping and filling. The main challenge of the SIFC problem lies in the coupling of complete sensing (for mapping) and robotic footprint (for filling) coverage tasks. Initially, we assume known target information (e.g., cracks) and employ classic cell decomposition methods to achieve complete sensing coverage of the workspace and complete robotic footprint coverage using the least-cost route. Subsequently, we generalize the algorithm to handle unknown target information, allowing the robot to scan and incrementally construct the target map online while conducting robotic footprint coverage. The online polynomial-time SIFC planning algorithm minimizes the total robot traveling distance, guarantees complete sensing coverage of the entire workspace, and achieves near-optimal robotic footprint coverage, as demonstrated through experiments. For the demonstrated application, we design coordinated nozzle motion control with the planned robot trajectory to efficiently fill all cracks within the robot's footprint. Experimental results illustrate the algorithm's design, performance, and comparisons. The SIFC algorithm offers a high-efficiency motion planning solution for various robotic applications requiring simultaneous sensing and actuation coverage.
Combining Switching Mechanism with Re-Initialization and Anomaly Detection for Resiliency of Cyber-Physical Systems
Cyber-physical systems (CPS) play a pivotal role in numerous critical real-world applications that have stringent requirements for safety. To enhance the CPS resiliency against attacks, redundancy can be integrated in real-time controller implementations by designing strategies that switch among multiple controllers. However, existing switching strategies typically overlook remediation measures for compromised controllers, opting instead to simply exclude them. Such a solution reduces the CPS redundancy since only a subset of controllers are used. To address this gap, this work proposes a multi-controller switching strategy with periodic re-initialization to remove attacks. Controllers that finish re-initialization can be reused by the switching strategy, preserving the CPS redundancy and resiliency. The proposed switching strategy is designed to ensure that at each switching moment, a controller that has just completed re-initialization is available, minimizing the likelihood of compromise. Additionally, the controller's working period decreases with the number of involved controllers, reducing the controller's exposure time to attacks. An anomaly detector is used to detect CPS attacks during the controller's working period. Upon alarm activation, the current control signal is set to a predefined value, and a switch to an alternative controller occurs at the earliest switching moment. Our switching strategy is shown to be still effective even if the anomaly detector fails to detect (stealthy) attacks.
Sufficient Conditions on Bipartite Consensus of Weakly Connected Matrix-weighted Networks
Recent advancements in bipartite consensus, a scenario where agents are divided into two disjoint sets with agents in the same set agreeing on a certain value and those in different sets agreeing on opposite or specifically related values, have highlighted its potential applications across various fields. Traditional research typically relies on the presence of a positive-negative spanning tree, which limits the practical applicability of bipartite consensus. This study relaxes that assumption by allowing for weak connectivity within the network, where paths can be weighted by semidefinite matrices. By exploring the algebraic constraints imposed by positive-negative trees and semidefinite paths, we derive sufficient conditions for achieving bipartite consensus. Our theoretical findings are validated through numerical results.
comment: There is a misstatement in Section 3.2 about the condition of the main Theorem, as in "Assumption 2 is a necessary condition". In addition, example in Fig. 2 needs to be adjusted
Robust Backstepping Control of a Quadrotor Unmanned Aerial Vehicle Under Colored Noises
Advances in software and hardware technologies have facilitated the production of quadrotor unmanned aerial vehicles (UAVs). Quadrotor UAVs are used in important missions such as search and rescue, counter-terrorism, firefighting, surveillance and cargo transportation. While performing these tasks, quadrotors must operate in noisy environments. Therefore, a robust controller design that can control the altitude and attitude of the quadrotor in noisy environments is of great importance. While many researchers focus only on white Gaussian noise in their studies, all colored noises should be considered during the quadrotor's operation. This study aims to design a robust controller that is resistant to all colored noises. First, a nonlinear model of the quadrotor was created in MATLAB. Then, a backstepping controller resistant to colored noises was designed. The designed backstepping controller was tested under Gaussian white noise, pink noise, brown noise, blue noise and purple noise. PID and Lyapunov-based controllers were also designed, and their time responses (rise time, overshoot, settling time) were compared with those of the backstepping controller. The results showed that the proposed backstepping controller had the least overshoot and the shortest settling time under all noise types.
comment: 18 pages, 9 figures
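The noise colors listed above are conventionally distinguished by how their power spectral density scales with frequency, roughly as 1/f^alpha with alpha = 0 (white), 1 (pink), 2 (brown), -1 (blue) and -2 (purple). The sketch below generates such signals by shaping the spectrum of white Gaussian noise. It is an illustrative generator only, not the one used in the paper; it relies on a naive discrete Fourier transform to stay dependency-free, and the function name and parameters are hypothetical.

```python
import cmath
import math
import random


def colored_noise(n, alpha, seed=0):
    """Return n samples whose power spectrum scales roughly as 1 / f**alpha."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Naive O(n^2) forward DFT of the white sequence.
    spectrum = [
        sum(white[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Scale amplitudes by f**(-alpha/2); the symmetric frequency index keeps
    # the spectrum conjugate-symmetric, and the DC term is dropped.
    shaped = []
    for k, coeff in enumerate(spectrum):
        f = min(k, n - k)
        shaped.append(0.0 if f == 0 else coeff * f ** (-alpha / 2.0))
    # Inverse DFT; the imaginary part is numerical round-off only.
    signal = [
        (sum(shaped[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
    # Normalize to unit root-mean-square amplitude.
    rms = math.sqrt(sum(s * s for s in signal) / n)
    return [s / rms for s in signal]


NOISE_EXPONENTS = {"white": 0, "pink": 1, "brown": 2, "blue": -1, "purple": -2}
samples = {name: colored_noise(128, a) for name, a in NOISE_EXPONENTS.items()}
```

Brown noise concentrates power at low frequencies and therefore drifts slowly, while purple noise concentrates power at high frequencies and changes sign rapidly, which is why a controller tuned only for white noise may behave differently under each color.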
Prescribed-time Cooperative Output Regulation of Linear Heterogeneous Multi-agent Systems
A finite-time protocol for multi-agent systems (MASs) can guarantee the convergence of every agent within a finite time interval, in contrast to asymptotic convergence, but the settling time depends on the initial condition and design parameters and is inconsistent across the agents. In this paper, we study the prescribed-time cooperative output regulation (PTCOR) problem for a class of linear heterogeneous MASs under a directed communication graph, where the settling time of every agent can be specified a priori and is thus consistent. As a special case of PTCOR, the necessary and sufficient condition for prescribed-time output regulation of an individual system is first discussed. Then, the PTCOR problem is converted into two cascaded subsystems, where the first is composed of distributed estimate errors and local estimate errors and the second is for local tracking errors. A criterion for prescribed-time stabilization of the cascaded system is proposed and is found to be different from that of traditional asymptotic stabilization of a cascaded system. Under the criterion and sufficient condition, the general PTCOR problem is studied in two scenarios: state feedback control and measurement output feedback control. In particular, a distributed prescribed-time observer for each subsystem is explicitly constructed to estimate the exosystem's state. Based on the observer, a distributed controller is proposed to achieve convergence of the regulated output to zero within a prescribed time.
A Parameterized Nonlinear Magnetic Equivalent Circuit for Design and Fast Analysis of Radial Flux Magnetic Gears
Magnetic gears offer advantages over mechanical gears, including contactless power transfer, but require robust analysis tools for optimization and commercialization. This study proposes a rapid and accurate 2D nonlinear magnetic equivalent circuit (MEC) model for radial flux magnetic gears (RFMG). The model, featuring a parameterized gear geometry and adjustable flux tube distribution, accommodates nonlinear effects like magnetic saturation while maintaining quick simulation times. Comparison with a nonlinear finite element analysis (FEA) model demonstrates the MEC's accuracy in torque and flux density predictions across diverse designs. Additionally, a parametric optimization study of 140,000 designs confirms the MEC's high accuracy, achieving close agreement with FEA torque predictions, with simulations running up to 100 times faster. Finally, the MEC shows good agreement with 2D FEA for a prototype RFMG.
Systems and Control (EESS)
Construction of the Sparsest Maximally $r$-Robust Graphs
In recent years, the notion of r-robustness for the communication graph of the network has been introduced to address the challenge of achieving consensus in the presence of misbehaving agents. Higher r-robustness typically implies higher tolerance to malicious information towards achieving resilient consensus, but it also implies more edges for the communication graph. This in turn conflicts with the need to minimize communication due to limited resources in real-world applications (e.g., multi-robot networks). In this paper, our contributions are twofold. (a) We provide the necessary subgraph structures and tight lower bounds on the number of edges required for graphs with a given number of nodes to achieve maximum robustness. (b) We then use the results of (a) to introduce two classes of graphs that maintain maximum robustness with the least number of edges. Our work is validated through a series of simulations.
comment: Accepted and will appear at IEEE CDC 2024
Co-investment with Payoff Sharing Benefit Operators and Users in Network Design
Network-based complex systems are inherently interconnected, with the design and performance of subnetworks being interdependent. However, the decisions of self-interested operators may lead to suboptimal outcomes for users. In this paper, we consider the question of what cooperative mechanisms can benefit both operators and users simultaneously. We address this question in a game theoretical setting, integrating both non-cooperative and cooperative game theory. During the non-cooperative stage, subnetwork decision-makers strategically design their local networks. In the cooperative stage, the co-investment mechanism and the payoff-sharing mechanism are developed to enlarge collective benefits and fairly distribute them. A case study of the Sioux Falls network is conducted to demonstrate the efficiency of the proposed framework. The impact of this interactive network design on environmental sustainability, social welfare and economic efficiency is evaluated, along with an examination of scenarios involving regions with heterogeneous characteristics.
comment: 8 pages, 6 figures
Canonical Correlation Guided Deep Neural Network
Learning representations of two views of data such that the resulting representations are highly linearly correlated is appealing in machine learning. In this paper, we present a canonical correlation guided learning framework, which can be realized by deep neural networks (CCDNN), to learn such a correlated representation. It is also a novel merging of multivariate analysis (MVA) and machine learning, which can be viewed as transforming MVA into end-to-end architectures with the aid of neural networks. Unlike linear canonical correlation analysis (CCA), kernel CCA and deep CCA, the optimization formulation in the proposed method is not restricted to maximizing correlation; instead, we impose canonical correlation as a constraint, which preserves the correlated representation learning ability and focuses more on the engineering tasks endowed by the optimization formulation, such as reconstruction, classification and prediction. Furthermore, to reduce the redundancy induced by correlation, a redundancy filter is designed. We illustrate the performance of CCDNN on various tasks. In experiments on the MNIST dataset, the results show that CCDNN has better reconstruction performance in terms of mean squared error and mean absolute error than DCCA and DCCAE. Also, we present the application of the proposed network to industrial fault diagnosis and remaining useful life cases for the classification and prediction tasks, respectively. The proposed method demonstrates superior performance in both tasks when compared to existing methods. An extension of CCDNN to much deeper architectures with the aid of residual connections is also presented in the appendix.
comment: 11 pages, 13 figures
Analytical Construction of CBF-Based Safety Filters for Simultaneous State and Input Constraints
We revisit the problem explored in [1] of guaranteeing satisfaction of multiple simultaneous state constraints applied to a single-input, single-output plant consisting of a chain of n integrators subject to input limitations. For this problem setting, we derive an analytic, easy-to-implement safety filter which respects input limitations and ensures forward-invariance of all state constraints simultaneously. Additionally, we provide a straightforward extension to the multi-input, multi-output chained integrator setting, and provide an analytic safety filter guaranteeing satisfaction of arbitrarily many simultaneous hyperplane constraints on the output vector. Whereas the approach in [1] obtains maximal invariant sets, our approach trades off some degree of conservatism in exchange for a recursive safety filter which is analytic for any arbitrary n >= 1.
comment: To be submitted to the 2025 American Control Conference
Distributed Optimization via Energy Conservation Laws in Dilated Coordinates
Optimizing problems in a distributed manner is critical for systems involving multiple agents with private data. Despite substantial interest, a unified method for analyzing the convergence rates of distributed optimization algorithms is lacking. This paper introduces an energy conservation approach for analyzing continuous-time dynamical systems in dilated coordinates. Instead of directly analyzing dynamics in the original coordinate system, we establish a conserved quantity, akin to physical energy, in the dilated coordinate system. Consequently, convergence rates can be explicitly expressed in terms of the inverse time-dilation factor. Leveraging this generalized approach, we formulate a novel second-order distributed accelerated gradient flow with a convergence rate of $O\left(1/t^{2-\epsilon}\right)$ in time $t$ for $\epsilon>0$. We then employ a semi second-order symplectic Euler discretization to derive a rate-matching algorithm with a convergence rate of $O\left(1/k^{2-\epsilon}\right)$ in $k$ iterations. To the best of our knowledge, this represents the most favorable convergence rate for any distributed optimization algorithm designed for smooth convex optimization. Its accelerated convergence behavior is benchmarked against various state-of-the-art distributed optimization algorithms on practical, large-scale problems.
comment: 10 pages; (Near) optimal convergence rate
Implicit Euler Discrete-Time Set-Valued Admittance Control for Impact-Contact Force Control
Admittance control is a commonly used strategy for regulating robotic systems, such as quadruped and humanoid robots, allowing them to respond compliantly to contact forces during interactions with their environments. However, it can lead to instability and unsafe behaviors like snapping back and overshooting due to torque saturation from impacts with unknown stiffness environments. This paper introduces a novel admittance controller that ensures stable force control after impacting unknown stiffness environments by leveraging the differentiability of impact-contact forces. The controller is mathematically represented by a differential algebraic inclusion (DAI) comprising two interdependent set-valued loops. The first loop employs set-valued first-order sliding mode control (SMC) to limit input torque post-impact. The second loop utilizes the multivariable super-twisting algorithm (MSTA) to mitigate unstable motion caused by impact forces when interacting with unknown stiffness environments. Implementing this proposed admittance control in digital settings presents challenges due to the interconnected structure of the two set-valued loops, unlike implicit Euler discretization methods for set-valued SMCs. To facilitate implementation, this paper offers a new algorithm for implicit Euler discretization of the DAI. Simulation and experimental results demonstrate that the proposed admittance controller outperforms state-of-the-art methods.
comment: 12 pages, 8 figures
Safe Delay-Adaptive Control of Strict-Feedback Nonlinear Systems with Application in Vehicle Platooning
This paper presents a safe delay-adaptive control for a strict-feedback nonlinear ODE with a delayed actuator, whose dynamics are also a strict-feedback nonlinear ODE and whose delay length is unknown. By formulating the delay as a transport PDE, the plant becomes a sandwich configuration consisting of nonlinear ODE-transport PDE-nonlinear ODE, where the transport speed in the PDE is unknown. We propose a predictor-based nonovershooting backstepping transformation to build the nominal safe delay-compensated control, guaranteeing that the output of the distal ODE safely tracks the target trajectory from one side without undershooting. To address the uncertainty in the delay, we incorporate recent delay-adaptive and safe adaptive technologies to build a safe adaptive-delay controller. The adaptive closed-loop system ensures that 1) the unknown delay is identified exactly in finite time; 2) the output state stays in the safe region all the time, and in particular in the original safe region, rather than a subset of it, after a finite time; 3) all states are bounded, and moreover, they converge to zero if the target trajectory is identically zero. In the simulation, the proposed control design is verified in the application of safe vehicle platooning. It regulates the spacing between adjacent vehicles to converge to a small distance and avoids collisions by ensuring they do not breach the safe distance at any time, even in the presence of large unknown delays and at a relatively high speed.
State estimation for parallel-connected batteries via inverse dynamic modeling
This paper examines the problem of estimating the states, including state of charge, of battery cells connected in parallel. Previous research highlights the importance of this problem, and presents multiple approaches for solving it. Algorithm scalability and observability analysis can both be challenging, particularly because the underlying pack dynamics are governed by differential algebraic equations. Our work addresses these challenges from a novel perspective that begins by inverting the causality of parallel pack dynamics, which breaks the pack model's underlying algebraic loop. This simplifies observability analysis and observer design significantly, leading to three novel contributions. First, the paper derives mathematical conditions for state observability that apply regardless of the number of battery cells and the order of their individual dynamics. Second, the paper presents an approach for grouping battery cells such that their lumped dynamics are observable. Finally, the paper presents a novel pack state estimator that achieves computational tractability by employing inverse dynamic modeling. We conclude by presenting a Monte Carlo simulation study of this estimator using experimentally-parameterized models of two battery chemistries. The simulation results highlight the computational benefits of both the clustering strategy and inverse dynamics approach for state estimation.
comment: 27 pages, 7 figures
Complete and Near-Optimal Robotic Crack Coverage and Filling in Civil Infrastructure
We present a simultaneous sensor-based inspection and footprint coverage (SIFC) planning and control design with applications to autonomous robotic crack mapping and filling. The main challenge of the SIFC problem lies in the coupling of complete sensing (for mapping) and robotic footprint (for filling) coverage tasks. Initially, we assume known target information (e.g., cracks) and employ classic cell decomposition methods to achieve complete sensing coverage of the workspace and complete robotic footprint coverage using the least-cost route. Subsequently, we generalize the algorithm to handle unknown target information, allowing the robot to scan and incrementally construct the target map online while conducting robotic footprint coverage. The online polynomial-time SIFC planning algorithm minimizes the total robot traveling distance, guarantees complete sensing coverage of the entire workspace, and achieves near-optimal robotic footprint coverage, as demonstrated through experiments. For the demonstrated application, we design coordinated nozzle motion control with the planned robot trajectory to efficiently fill all cracks within the robot's footprint. Experimental results illustrate the algorithm's design, performance, and comparisons. The SIFC algorithm offers a high-efficiency motion planning solution for various robotic applications requiring simultaneous sensing and actuation coverage.
Combining Switching Mechanism with Re-Initialization and Anomaly Detection for Resiliency of Cyber-Physical Systems
Cyber-physical systems (CPS) play a pivotal role in numerous critical real-world applications that have stringent requirements for safety. To enhance the CPS resiliency against attacks, redundancy can be integrated in real-time controller implementations by designing strategies that switch among multiple controllers. However, existing switching strategies typically overlook remediation measures for compromised controllers, opting instead to simply exclude them. Such a solution reduces the CPS redundancy since only a subset of controllers are used. To address this gap, this work proposes a multi-controller switching strategy with periodic re-initialization to remove attacks. Controllers that finish re-initialization can be reused by the switching strategy, preserving the CPS redundancy and resiliency. The proposed switching strategy is designed to ensure that at each switching moment, a controller that has just completed re-initialization is available, minimizing the likelihood of compromise. Additionally, the controller's working period decreases with the number of involved controllers, reducing the controller's exposure time to attacks. An anomaly detector is used to detect CPS attacks during the controller's working period. Upon alarm activation, the current control signal is set to a predefined value, and a switch to an alternative controller occurs at the earliest switching moment. Our switching strategy is shown to be still effective even if the anomaly detector fails to detect (stealthy) attacks.
Sufficient Conditions on Bipartite Consensus of Weakly Connected Matrix-weighted Networks
Recent advancements in bipartite consensus, a scenario where agents are divided into two disjoint sets with agents in the same set agreeing on a certain value and those in different sets agreeing on opposite or specifically related values, have highlighted its potential applications across various fields. Traditional research typically relies on the presence of a positive-negative spanning tree, which limits the practical applicability of bipartite consensus. This study relaxes that assumption by allowing for weak connectivity within the network, where paths can be weighted by semidefinite matrices. By exploring the algebraic constraints imposed by positive-negative trees and semidefinite paths, we derive sufficient conditions for achieving bipartite consensus. Our theoretical findings are validated through numerical results.
comment: There is a misstatement in Section 3.2 about the condition of the main Theorem, as in "Assumption 2 is a necessary condition". In addition, example in Fig. 2 needs to be adjusted
Robust Backstepping Control of a Quadrotor Unmanned Aerial Vehicle Under Colored Noises
Advances in software and hardware technologies have facilitated the production of quadrotor unmanned aerial vehicles (UAVs). Quadrotor UAVs are used in important missions such as search and rescue, counter terrorism, firefighting, surveillance and cargo transportation. While performing these tasks, quadrotors must operate in noisy environments. Therefore, a robust controller design that can control the altitude and attitude of the quadrotor in noisy environments is of great importance. While many researchers focus only on white Gaussian noise in their studies, all colored noises should be considered during the quadrotor's operation. This study aims to design a robust controller that is resistant to all colored noises. First, a nonlinear model of the quadrotor was created with MATLAB. Then, a backstepping control design that is resistant to colored noises was realized. The designed backstepping controller was tested under Gaussian white noise, pink noise, brown noise, blue noise and purple noise. PID and Lyapunov-based controller designs were also carried out and their time responses (rise time, overshoot, settling time) were compared with those of the backstepping controller. When the obtained values were examined, the proposed backstepping controller was shown to have the least overshoot and shortest settling time under all noise types.
comment: 18 pages, 9 figures
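As a rough illustration of the noise types named in the abstract above, the sketch below generates two of them from white Gaussian noise using only the Python standard library. It is not the authors' MATLAB setup: the function names, the seed, and the choice to approximate brown noise by a running sum and purple (violet) noise by a first difference are assumptions made for this example. Pink and blue noise need spectral shaping filters and are left out.

```python
import random

def white_noise(n, sigma=1.0, seed=0):
    """Return n samples of zero-mean Gaussian white noise (flat spectrum)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def brown_noise(white):
    """Running sum of white noise; power falls off roughly as 1/f^2."""
    out, total = [], 0.0
    for w in white:
        total += w
        out.append(total)
    return out

def violet_noise(white):
    """First difference of white noise; power grows roughly as f^2."""
    return [b - a for a, b in zip(white, white[1:])]

# Hypothetical usage: build disturbance signals to add to a simulated sensor reading.
samples = white_noise(1000)
brown = brown_noise(samples)
violet = violet_noise(samples)
```

Integrating and differentiating white noise is the simplest way to obtain the two extreme spectra; a controller comparison like the one described would inject such signals into the altitude and attitude measurements.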
Prescribed-time Cooperative Output Regulation of Linear Heterogeneous Multi-agent Systems
A finite-time protocol for multi-agent systems (MASs) can guarantee the convergence of every agent in a finite time interval, in contrast to asymptotic convergence, but the settling time depends on the initial condition and design parameters and is inconsistent across the agents. In this paper, we study the prescribed-time cooperative output regulation (PTCOR) problem for a class of linear heterogeneous MASs under a directed communication graph, where the settling time of every agent can be specified a priori and is thus consistent. As a special case of PTCOR, the necessary and sufficient condition for prescribed-time output regulation of an individual system is first discussed. Then, the PTCOR problem is converted into two cascaded subsystems, where the first one is composed of distributed estimate errors and local estimate errors and the second one is for local tracking errors. The criterion for prescribed-time stabilization of the cascaded system is proposed and is found to be different from that of traditional asymptotic stabilization of a cascaded system. Under the criterion and sufficient condition, the general PTCOR problem is studied in two scenarios: state feedback control and measurement output feedback control. In particular, a distributed prescribed-time observer for each subsystem is explicitly constructed to estimate the exosystem's state. Based on the observer, a distributed controller is proposed to achieve convergence of the regulated output to zero within a prescribed time.
A Parameterized Nonlinear Magnetic Equivalent Circuit for Design and Fast Analysis of Radial Flux Magnetic Gears
Magnetic gears offer advantages over mechanical gears, including contactless power transfer, but require robust analysis tools for optimization and commercialization. This study proposes a rapid and accurate 2D nonlinear magnetic equivalent circuit (MEC) model for radial flux magnetic gears (RFMG). The model, featuring a parameterized gear geometry and adjustable flux tube distribution, accommodates nonlinear effects like magnetic saturation while maintaining quick simulation times. Comparison with a nonlinear finite element analysis (FEA) model demonstrates the MEC's accuracy in torque and flux density predictions across diverse designs. Additionally, a parametric optimization study of 140,000 designs confirms the MEC's high accuracy, achieving close agreement with FEA torque predictions, with simulations running up to 100 times faster. Finally, the MEC shows good agreement with 2D FEA for a prototype RFMG.
Robotics
An Interactive Hands-Free Controller for a Riding Ballbot to Enable Simple Shared Control Tasks
Our team developed a riding ballbot (called PURE) that is dynamically stable, omnidirectional, and driven by lean-to-steer control. A hands-free admittance control scheme (HACS) was previously integrated to allow riders with different torso functions to control the robot's movements via torso leaning and twisting. Such an interface requires motor coordination skills and could result in collisions with obstacles due to low proficiency. Hence, a shared controller (SC) that limits the speed of PURE could be helpful to ensure the safety of riders. However, the self-balancing dynamics of PURE could result in a weak control authority of its motion, in which the torso motion of the rider could easily result in poor tracking of the command speed dictated by the shared controller. Thus, we proposed an interactive hands-free admittance control scheme (iHACS), which added two modules to HACS to improve the speed-tracking performance of PURE: control gain personalization module and interaction compensation module. Human riding tests of simple tasks, idle-keeping and speed-limiting, were conducted to compare the performance of HACS and iHACS. Two manual wheelchair users and two able-bodied individuals participated in this study. They were instructed to use "adversarial" torso motions that would tax the SC's ability to keep the ballbot idling or below a set speed. In the idle-keeping tasks, iHACS demonstrated minimal translational motion and low command speed tracking RMSE, even with significant torso lean angles. During the speed-limiting task with command speed saturated at 0.5 m/s, the system achieved an average maximum speed of 1.1 m/s with iHACS, compared with that of over 1.9 m/s with HACS. These results suggest that iHACS can enhance PURE's control authority over the rider, which enables PURE to provide physical interactions back to the rider and results in a collaborative rider-robot synergy.
Optimization-based Task and Motion Planning under Signal Temporal Logic Specifications using Logic Network Flow
This paper proposes an optimization-based task and motion planning framework, named "Logic Network Flow", to integrate signal temporal logic (STL) specifications into efficient mixed-binary linear programs. In this framework, temporal predicates are encoded as polyhedron constraints on each edge of the network flow, instead of as constraints between the nodes as in the traditional Logic Tree formulation. Synthesized with Dynamic Network Flows, Logic Network Flows render a tighter convex relaxation compared to Logic Trees derived from these STL specifications. Our formulation is evaluated on several multi-robot motion planning case studies. Empirical results demonstrate that our formulation outperforms the Logic Tree formulation in terms of computation time for several planning problems. As the problem size scales up, our method still discovers better lower and upper bounds by exploring fewer nodes during the branch-and-bound process, although this comes at the cost of increased computational load for each node when exploring branches.
Signal Temporal Logic Planning with Time-Varying Robustness
This letter aims to generate a continuous-time trajectory consisting of piecewise Bézier curves that satisfy signal temporal logic (STL) specifications with piecewise time-varying robustness. Our time-varying robustness is less conservative than the real-valued robustness, which enables more effective tracking in practical applications. Specifically, our continuous-time trajectories account for dynamic feasibility, leading to smaller tracking errors and ensuring that the STL specifications can be met by the tracking trajectory. Comparative experiments demonstrate the efficiency and effectiveness of the proposed approach.
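For readers unfamiliar with the trajectory primitive mentioned above, the following is a minimal sketch of evaluating one Bézier segment with de Casteljau's algorithm. It illustrates only the curve representation, not the letter's planner or its robustness measure; the function name and the example control points are assumptions for illustration.

```python
def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1].

    Repeatedly interpolates between neighbouring control points until a
    single point remains. Works for any dimension and any degree.
    """
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        pts = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]

# Hypothetical quadratic segment in the plane.
segment = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
midpoint = de_casteljau(segment, 0.5)
```

Because every point on the segment is a convex combination of its control points, the curve stays inside their convex hull, which is a general property that makes Bézier curves convenient when a whole segment must remain inside a region.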
S-RRT*-based Obstacle Avoidance Autonomous Motion Planner for Continuum-rigid Manipulator
Continuum robots are compact and flexible, making them suitable for use in industry and in medical surgery. Rapidly-exploring random trees (RRT) are a highly efficient path planning method, and their variant, S-RRT, can generate smooth feasible paths for the end-effector. By combining RRT with inverse instantaneous kinematics (IIK), complete motion planning for the continuum arm can be achieved. Due to the high degrees of freedom of continuum arms, the null space in IIK can be utilized for obstacle avoidance. In this work, we propose a novel approach that uses the S-RRT* algorithm to create paths for the continuum-rigid manipulator. By employing IIK and null space techniques, continuous joint configurations are generated that not only track the path but also enable obstacle avoidance. Simulation results demonstrate that our method effectively handles motion planning and obstacle avoidance while generating high-quality end-effector paths in complex environments. Furthermore, compared to similar IIK methods, our approach requires less computation time.
Robust Proximity Operations using Probabilistic Markov Models ICRA 2025
A Markov decision process-based state switching is devised, implemented, and analyzed for proximity operations of various autonomous vehicles. The framework contains a pose estimator along with a multi-state guidance algorithm. The unified pose estimator leverages the extended Kalman filter for the fusion of measurements from rate gyroscopes, monocular vision, and ultra-wideband radar sensors. It is also equipped with Mahalanobis distance-based outlier rejection and under-weighting of measurements for robust performance. The use of probabilistic Markov models to transition between various guidance modes is proposed to enable robust and efficient proximity operations. Finally, the framework is validated through an experimental analysis of the docking of two small satellites and the precision landing of an aerial vehicle.
comment: This work has been submitted to the IEEE ICRA 2025 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. Accompanying video : https://youtu.be/8-fetyf_SrM. arXiv admin note: text overlap with arXiv:2409.09665
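The outlier rejection step mentioned in the abstract above can be sketched as a gating test on the filter innovation. This is a minimal two-dimensional sketch under stated assumptions, not the paper's estimator: the function names are hypothetical, the innovation covariance is taken as given, and the gate value 9.21 (approximately the 99% chi-square quantile for two degrees of freedom) is an assumed choice.

```python
def mahalanobis_sq_2d(innovation, cov):
    """Squared Mahalanobis distance of a 2D innovation under a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    y0, y1 = innovation
    return (y0 * (inv[0][0] * y0 + inv[0][1] * y1)
            + y1 * (inv[1][0] * y0 + inv[1][1] * y1))

def accept_measurement(innovation, cov, gate=9.21):
    """Return True when the measurement passes the gate (assumed threshold)."""
    return mahalanobis_sq_2d(innovation, cov) <= gate

# Hypothetical innovation covariance with different variance per axis.
S = ((1.0, 0.0), (0.0, 4.0))
keep = accept_measurement((1.0, 2.0), S)    # small innovation
reject = accept_measurement((4.0, 0.0), S)  # large innovation on the tight axis
```

In an extended Kalman filter the innovation is the difference between the measurement and its prediction, and a rejected measurement is simply skipped in the update step.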
UniCal: Unified Neural Sensor Calibration ECCV 2024
Self-driving vehicles (SDVs) require accurate calibration of LiDARs and cameras to fuse sensor data accurately for autonomy. Traditional calibration methods typically leverage fiducials captured in a controlled and structured scene and compute correspondences to optimize over. These approaches are costly and require substantial infrastructure and operations, making it challenging to scale for vehicle fleets. In this work, we propose UniCal, a unified framework for effortlessly calibrating SDVs equipped with multiple LiDARs and cameras. Our approach is built upon a differentiable scene representation capable of rendering multi-view geometrically and photometrically consistent sensor observations. We jointly learn the sensor calibration and the underlying scene representation through differentiable volume rendering, utilizing outdoor sensor data without the need for specific calibration fiducials. This "drive-and-calibrate" approach significantly reduces costs and operational overhead compared to existing calibration systems, enabling efficient calibration for large SDV fleets at scale. To ensure geometric consistency across observations from different sensors, we introduce a novel surface alignment loss that combines feature-based registration with neural rendering. Comprehensive evaluations on multiple datasets demonstrate that UniCal outperforms or matches the accuracy of existing calibration approaches while being more efficient, demonstrating the value of UniCal for scalable calibration.
comment: ECCV 2024. Project page: https://waabi.ai/unical/
Towards Super-Nominal Payload Handling: Inverse Dynamics Analysis for Multi-Skill Robotic Manipulation ICRA
Motion planning for articulated robots has traditionally been governed by algorithms that operate within manufacturer-defined payload limits. Our empirical analysis of the Franka Emika Panda robot demonstrates that this approach unnecessarily restricts the robot's dynamically-reachable task space. These results establish an expanded operational envelope for such robots, showing that they can handle payloads of more than twice their rated capacity. Additionally, our preliminary findings indicate that integrating non-prehensile motion primitives with grasping-based manipulation has the potential to further increase the success rates of manipulation tasks involving payloads exceeding nominal limits.
comment: Accepted as an extended abstract to ICRA@40
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions ICRA 2025
We address the challenge of safe control in decentralized multi-agent robotic settings, where agents use uncertain black-box models to predict other agents' trajectories. We use the recently proposed conformal decision theory to adapt the restrictiveness of control barrier functions-based safety constraints based on observed prediction errors. We use these constraints to synthesize controllers that balance between the objectives of safety and task accomplishment, despite the prediction errors. We provide an upper bound on the average over time of the value of a monotonic function of the difference between the safety constraint based on the predicted trajectories and the constraint based on the ground truth ones. We validate our theory through experimental results showing the performance of our controllers when navigating a robot in the multi-agent scenes in the Stanford Drone Dataset.
comment: 6 pages, 1 figure, submitted for ICRA 2025
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs
Vision-and-Language Navigation (VLN) tasks require an agent to follow textual instructions to navigate through 3D environments. Traditional approaches use supervised learning methods, relying heavily on domain-specific datasets to train VLN models. Recent methods try to utilize closed-source large language models (LLMs) like GPT-4 to solve VLN tasks in zero-shot manners, but face challenges related to expensive token costs and potential data breaches in real-world applications. In this work, we introduce Open-Nav, a novel study that explores open-source LLMs for zero-shot VLN in the continuous environment. Open-Nav employs a spatial-temporal chain-of-thought (CoT) reasoning approach to break down tasks into instruction comprehension, progress estimation, and decision-making. It enhances scene perceptions with fine-grained object and spatial knowledge to improve LLM's reasoning in navigation. Our extensive experiments in both simulated and real-world environments demonstrate that Open-Nav achieves competitive performance compared to using closed-source LLMs.
Excavating in the Wild: The GOOSE-Ex Dataset for Semantic Segmentation
The successful deployment of deep learning-based techniques for autonomous systems is highly dependent on the data availability for the respective system in its deployment environment. Especially for unstructured outdoor environments, very few datasets exist for even fewer robotic platforms and scenarios. In an earlier work, we presented the German Outdoor and Offroad Dataset (GOOSE) framework along with 10000 multimodal frames from an offroad vehicle to enhance the perception capabilities in unstructured environments. In this work, we address the generalizability of the GOOSE framework. To accomplish this, we open-source the GOOSE-Ex dataset, which contains an additional 5000 labeled multimodal frames from various completely different environments, recorded on a robotic excavator and a quadruped platform. We perform a comprehensive analysis of the semantic segmentation performance on different platforms and sensor modalities in unseen environments. In addition, we demonstrate how the combined datasets can be utilized for different downstream applications or competitions such as offroad navigation, object manipulation or scene completion. The dataset, its platform documentation and pre-trained state-of-the-art models for offroad perception will be made available on https://goose-dataset.de/.
comment: Submitted to IEEE for review
A POMDP-based hierarchical planning framework for manipulation under pose uncertainty
Robots often face challenges in domestic environments where visual feedback is ineffective, such as retrieving objects obstructed by occlusions or finding a light switch in the dark. In these cases, utilizing contacts to localize the target object can be effective. We propose an online planning framework using binary contact signals for manipulation tasks with pose uncertainty, formulated as a Partially Observable Markov Decision Process (POMDP). Naively representing the belief as a particle set makes planning infeasible due to the large uncertainties in domestic settings, as identifying the best sequence of actions requires rolling out thousands of actions across millions of particles, taking significant compute time. To address this, we propose a hierarchical belief representation. Initially, we represent the uncertainty coarsely in a 3D volumetric space. Policies that refine uncertainty in this space are computed and executed, and once uncertainty is sufficiently reduced, the problem is translated back into the particle space for further refinement before task completion. We utilize a closed-loop planning and execution framework with a heuristic-search-based anytime solver that computes partial policies within a limited time budget. The performance of the framework is demonstrated both in the real world and in simulation on the high-precision task of inserting a plug into a port using a UR10e manipulator, resolving positional uncertainties up to 50 centimeters and angular uncertainties close to $2\pi$. Experimental results highlight the framework's effectiveness, achieving a 93% success rate in the real world and over 50% improvement in solution quality compared to greedy baselines, significantly accelerating planning and enabling real-time solutions for complex problems.
comment: Under review (2025 IEEE International Conference on Robotics & Automation)
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Learning from Demonstration (LfD) is a useful paradigm for training policies that solve tasks involving complex motions. In practice, the successful application of LfD requires overcoming error accumulation during policy execution, i.e. the problem of drift due to errors compounding over time and the consequent out-of-distribution behaviours. Existing works seek to address this problem through scaling data collection, correcting policy errors with a human-in-the-loop, temporally ensembling policy predictions or through learning the parameters of a dynamical system model. In this work, we propose and validate an alternative approach to overcoming this issue. Inspired by reservoir computing, we develop a novel neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties. We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting Dataset. Through empirical experiments we demonstrate that incorporating our layer into existing neural network architectures addresses the issue of compounding errors in LfD. Furthermore, we perform a comparative evaluation against existing approaches including a temporal ensemble of policy predictions and an Echo State Networks (ESNs) implementation. We find that our approach yields greater policy precision and robustness on the handwriting task while also generalising to multiple dynamics regimes and maintaining competitive latency scores.
comment: 21 pages, 9 figures
Transparency evaluation for the Kinematic Design of the Harnesses through Human-Exoskeleton Interaction Modeling
Lower Limb Exoskeletons (LLEs) are wearable robots that provide mechanical power to the user. Human-exoskeleton (HE) connections must preserve the user's natural behavior during the interaction, avoiding undesired forces. Therefore, numerous works focus on their minimization. Given the inherent complications of repeatedly prototyping and experimentally testing a device, modeling the exoskeleton and its physical interaction with the user emerges as a valuable approach for assessing the design effects. This paper proposes a novel method to compare different exoskeleton configurations with a flexible simulation tool. This approach contemplates simulating the dynamics of the device, including its interaction with the wearer, to evaluate multiple connection mechanism designs along with the kinematics and actuation of the LLE. This evaluation is based on the minimization of the interaction wrenches through an optimization process that includes the impedance parameters at the interfaces as optimization variables and the similarity of the LLE's joint variables trajectories with the motion of the wearer's articulations. Exploratory tests are conducted using the Wearable Walker LLE in different configurations and measuring the interaction forces. Experimental data are then compared to the optimization outcomes, proving that the proposed method provides contact wrench estimations consistent with the collected measurements and previous outcomes from the literature. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Royal Reveals: LiDAR Mapping of Kronborg Castle, Echoes of Hamlet's Halls
This paper presents a large-scale dataset from a meticulous 360-degree LiDAR (Light Detection and Ranging) scan conducted on Kronborg Castle, a renowned Renaissance fortress located in Elsinore (Helsingør), Denmark, famously associated with Shakespeare's "Hamlet." Utilising a vertically mounted, gimbal-stabilised, 16-channel, 360-degree Velodyne VLP-16 LiDAR scanner paired with an Intel RealSense L515 depth camera, this research offers an unparalleled digital representation of the castle's intricate architectural details and structural nuances, enabling fellow researchers to conduct experiments utilising the data for SLAM (Simultaneous Localisation and Mapping) as well as floorplan generation.
comment: 4 pages, 4 figures, 3 tables
OpenObject-NAV: Open-Vocabulary Object-Oriented Navigation Based on Dynamic Carrier-Relationship Scene Graph
In everyday life, frequently used objects like cups often have unfixed positions and multiple instances within the same category, and their carriers frequently change as well. As a result, it becomes challenging for a robot to efficiently navigate to a specific instance. To tackle this challenge, the robot must continuously capture scene changes and update its plans. However, current object navigation approaches primarily focus on the semantic level and lack the ability to dynamically update the scene representation. This paper captures the relationships between frequently used objects and their static carriers. It constructs an open-vocabulary Carrier-Relationship Scene Graph (CRSG) and updates the carrying status during robot navigation to reflect the dynamic changes of the scene. Based on the CRSG, we further propose an instance navigation strategy that models the navigation process as a Markov Decision Process. At each step, decisions are informed by a Large Language Model's commonsense knowledge and visual-language feature similarity. We designed a series of long-sequence navigation tasks for frequently used everyday items in the Habitat simulator. The results demonstrate that by updating the CRSG, the robot can efficiently navigate to moved targets. Additionally, we deployed our algorithm on a real robot and validated its practical effectiveness.
comment: Project website: https://openobject-nav.github.io/
Optimum Configuration for Hovering n-Quadrotors carrying a Slung Payload SC
This work proposes a strategy for organising quadrotors around a payload to enable hovering without external stimuli, together with MATLAB software for modelling the dynamics of a quadrotor-payload system. Based on geometric concepts, the proposed design keeps the payload and system centre of mass aligned. Successful hovering tests confirm the method's efficiency. Moreover, the algorithm is improved to take thrust capacities and propeller distances into account, calculating the minimum number of quadrotors needed for hovering. The algorithm's effectiveness is demonstrated by numerical examples, which reveal that larger quadrotors may require fewer units while smaller ones give greater flexibility. Our code can be found at: https://github.com/Hosnooo/Swarm-Slung-Payload
comment: accepted for publication at AIAA SCITECH 2025
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation
Learning visuomotor policy for multi-task robotic manipulation has been a long-standing challenge for the robotics community. The difficulty lies in the diversity of action space: typically, a goal can be accomplished in multiple ways, resulting in a multimodal action distribution for a single task. The complexity of action distribution escalates as the number of tasks increases. In this work, we propose Discrete Policy, a robot learning method for training universal agents capable of multi-task manipulation skills. Discrete Policy employs vector quantization to map action sequences into a discrete latent space, facilitating the learning of task-specific codes. These codes are then reconstructed into the action space conditioned on observations and language instruction. We evaluate our method on both simulation and multiple real-world embodiments, including both single-arm and bimanual robot settings. We demonstrate that our proposed Discrete Policy outperforms a well-established Diffusion Policy baseline and many state-of-the-art approaches, including ACT, Octo, and OpenVLA. For example, in a real-world multi-task training setting with five tasks, Discrete Policy achieves an average success rate that is 26% higher than Diffusion Policy and 15% higher than OpenVLA. As the number of tasks increases to 12, the performance gap between Discrete Policy and Diffusion Policy widens to 32.5%, further showcasing the advantages of our approach. Our work empirically demonstrates that learning multi-task policies within the latent space is a vital step toward achieving general-purpose agents.
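The vector quantization step mentioned in the abstract above can be illustrated with a minimal sketch. It shows only the generic nearest-neighbour codebook lookup that vector quantization relies on, not the authors' implementation; the function name `quantize`, the toy `codebook`, and the example `latent` vector are all hypothetical.

```python
def quantize(vector, codebook):
    """Return (index, code) of the codebook entry nearest to `vector`.

    Distance is squared Euclidean; `codebook` is a hypothetical list of
    equally sized vectors standing in for learned discrete codes.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))
    return index, codebook[index]


# Toy example: a 2-D latent is snapped to the closest of three codes.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]]
latent = [0.9, 1.2]
index, code = quantize(latent, codebook)
```

In a learned system the codebook entries would be trained jointly with an encoder and decoder; here they are fixed by hand purely for illustration.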
Automatic Gain Tuning for Humanoid Robots Walking Architectures Using Gradient-Free Optimization Techniques
Developing sophisticated control architectures has endowed robots, particularly humanoid robots, with numerous capabilities. However, tuning these architectures remains a challenging and time-consuming task that requires expert intervention. In this work, we propose a methodology to automatically tune the gains of all layers of a hierarchical control architecture for walking humanoids. We tested our methodology by employing different gradient-free optimization methods: Genetic Algorithm (GA), Covariance Matrix Adaptation Evolution Strategy (CMA-ES), Evolution Strategy (ES), and Differential Evolution (DE). We validated the parameters found both in simulation and on the real ergoCub humanoid robot. Our results show that GA achieves the fastest convergence (10 x 10^3 function evaluations vs 25 x 10^3 needed by the other algorithms) and a 100% success rate in completing the task both in simulation and when transferred to the real robotic platform. These findings highlight the potential of our proposed method to automate the tuning process, reducing the need for manual intervention.
Pseudo-kinematic trajectory control of tracked vehicles
Tracked vehicles are used in complex scenarios, where motion planning and navigation can be very complex. They have complex dynamics, with many parameters that are difficult to identify and that change significantly based on the operating conditions. We propose a simple pseudo-kinematic model, where the intricate dynamic effects underlying the vehicle's motion are captured in a small set of velocity-dependent parameters. This choice enables the development of a Lyapunov-based trajectory controller with guaranteed performance and small computation time. We demonstrate the correctness of our approach with both simulation and experimental data.
Explaining Explaining
Explanation is key to people having confidence in high-stakes AI systems. However, machine-learning-based systems -- which account for almost all current AI -- can't explain because they are usually black boxes. The explainable AI (XAI) movement hedges this problem by redefining "explanation". The human-centered explainable AI (HCXAI) movement identifies the explanation-oriented needs of users but can't fulfill them because of its commitment to machine learning. In order to achieve the kinds of explanations needed by real people operating in critical domains, we must rethink how to approach AI. We describe a hybrid approach to developing cognitive agents that uses a knowledge-based infrastructure supplemented by data obtained through machine learning when applicable. These agents will serve as assistants to humans who will bear ultimate responsibility for the decisions and actions of the human-robot team. We illustrate the explanatory potential of such agents using the under-the-hood panels of a demonstration system in which a team of simulated robots collaborate on a search task assigned by a human.
Learning Occlusion-aware Decision-making from Agent Interaction via Active Perception
Occlusion-aware decision-making is essential in autonomous driving due to the high uncertainty of various occlusions. Recent occlusion-aware decision-making methods encounter issues such as high computational complexity, scenario scalability challenges, or reliance on limited expert data. Benefiting from automatically generating data by exploration randomization, we uncover that reinforcement learning (RL) may show promise in occlusion-aware decision-making. However, previous occlusion-aware RL faces challenges in expanding to various dynamic and static occlusion scenarios, low learning efficiency, and lack of predictive ability. To address these issues, we introduce Pad-AI, a self-reinforcing framework to learn occlusion-aware decision-making through active perception. Pad-AI utilizes vectorized representation to represent occluded environments efficiently and learns over the semantic motion primitives to focus on high-level active perception exploration. Furthermore, Pad-AI integrates prediction and RL within a unified framework to provide risk-aware learning and security guarantees. Our framework was tested in challenging scenarios under both dynamic and static occlusions and demonstrated efficient and general perception-aware exploration performance compared to other strong baselines in closed-loop evaluations.
Efficient Navigation of a Robotic Fish Swimming Across the Vortical Flow Field
Navigating efficiently across vortical flow fields presents a significant challenge in various robotic applications. The dynamic and unsteady nature of vortical flows often disturbs the control of underwater robots, complicating their operation in hydrodynamic environments. Conventional control methods, which depend on accurate modeling, fail in these settings due to the complexity of fluid-structure interactions (FSI) caused by unsteady hydrodynamics. This study proposes a deep reinforcement learning (DRL) algorithm, trained in a data-driven manner, to enable efficient navigation of a robotic fish swimming across vortical flows. Our proposed algorithm incorporates the LSTM architecture and uses several recent consecutive observations as the state to address the issue of partial observation, often due to sensor limitations. We present a numerical study of navigation within a Karman vortex street, created by placing a stationary cylinder in a uniform flow, utilizing the immersed boundary-lattice Boltzmann method (IB-LBM). The aim is to train the robotic fish to discover efficient navigation policies, enabling it to reach a designated target point across the Karman vortex street from various initial positions. After training, the fish demonstrates the ability to rapidly reach the target from different initial positions, showcasing the effectiveness and robustness of our proposed algorithm. Analysis of the results reveals that the robotic fish can leverage velocity gains and pressure differences induced by the vortices to reach the target, underscoring the potential of our proposed algorithm in enhancing navigation in complex hydrodynamic environments.
comment: We would like to request the withdrawal of our submission due to some misunderstandings among the co-authors concerning the submission process. It appears that the current version was submitted before we reached a consensus among all authors. We are actively working to address these matters and plan to resubmit a revised version once we achieve agreement
Teaching Robots Where To Go And How To Act With Human Sketches via Spatial Diagrammatic Instructions
This paper introduces Spatial Diagrammatic Instructions (SDIs), an approach for human operators to specify objectives and constraints that are related to spatial regions in the working environment. Human operators are enabled to sketch out regions directly on camera images that correspond to the objectives and constraints. These sketches are projected to 3D spatial coordinates, and continuous Spatial Instruction Maps (SIMs) are learned upon them. These maps can then be integrated into optimization problems for tasks of robots. In particular, we demonstrate how Spatial Diagrammatic Instructions can be applied to solve the Base Placement Problem of mobile manipulators, which concerns the best place to put the manipulator to facilitate a certain task. Human operators can specify, via sketch, spatial regions of interest for a manipulation task and permissible regions for the mobile manipulator to be at. Then, an optimization problem that maximizes the manipulator's reachability, or coverage, over the designated regions of interest while remaining in the permissible regions is solved. We provide extensive empirical evaluations, and show that our formulation of Spatial Instruction Maps provides accurate representations of user-specified diagrammatic instructions. Furthermore, we demonstrate that our diagrammatic approach to the Mobile Base Placement Problem enables higher quality solutions and faster runtime.
Multi-Robot Coordination Induced in an Adversarial Graph-Traversal Game
This paper presents a game theoretic formulation of a graph traversal problem, with applications to robots moving through hazardous environments in the presence of an adversary, as in military and security scenarios. The blue team of robots moves in an environment modeled by a time-varying graph, attempting to reach some goal with minimum cost, while the red team controls how the graph changes to maximize the cost. The problem is formulated as a stochastic game, so that Nash equilibrium strategies can be computed numerically. Bounds are provided for the game value, with a guarantee that it solves the original problem. Numerical simulations demonstrate the results and the effectiveness of this method, particularly showing the benefit of mixing actions for both players, as well as beneficial coordinated behavior, where blue robots split up and/or synchronize to traverse risky edges.
comment: 8 pages, 8 figures
Detecting and Mitigating System-Level Anomalies of Vision-Based Controllers
Autonomous systems, such as self-driving cars and drones, have made significant strides in recent years by leveraging visual inputs and machine learning for decision-making and control. Despite their impressive performance, these vision-based controllers can make erroneous predictions when faced with novel or out-of-distribution inputs. Such errors can cascade to catastrophic system failures and compromise system safety. In this work, we introduce a run-time anomaly monitor to detect and mitigate such closed-loop, system-level failures. Specifically, we leverage a reachability-based framework to stress-test the vision-based controller offline and mine its system-level failures. This data is then used to train a classifier that is leveraged online to flag inputs that might cause system breakdowns. The anomaly detector highlights issues that transcend individual modules and pertain to the safety of the overall system. We also design a fallback controller that robustly handles these detected anomalies to preserve system safety. We validate the proposed approach on an autonomous aircraft taxiing system that uses a vision-based controller for taxiing. Our results show the efficacy of the proposed approach in identifying and handling system-level anomalies, outperforming methods such as prediction error-based detection, and ensembling, thereby enhancing the overall safety and robustness of autonomous systems.
Vision Transformers for End-to-End Vision-Based Quadrotor Obstacle Avoidance
We demonstrate the capabilities of an attention-based end-to-end approach for high-speed vision-based quadrotor obstacle avoidance in dense, cluttered environments, with comparison to various state-of-the-art learning architectures. Quadrotor unmanned aerial vehicles (UAVs) have tremendous maneuverability when flown fast; however, as flight speed increases, traditional model-based approaches to navigation via independent perception, mapping, planning, and control modules break down due to increased sensor noise, compounding errors, and increased processing latency. Thus, learning-based, end-to-end vision-to-control networks have been shown to have great potential for online control of these fast robots through cluttered environments. We train and compare convolutional, U-Net, and recurrent architectures against vision transformer (ViT) models for depth image-to-control in high-fidelity simulation, observing that ViT models are more effective than others as quadrotor speeds increase and in generalization to unseen environments, while the addition of recurrence further improves performance while reducing quadrotor energy cost across all tested flight speeds. We assess performance at speeds of up to 7m/s in simulation and hardware. To the best of our knowledge, this is the first work to utilize vision transformers for end-to-end vision-based quadrotor control.
comment: 11 pages, 18 figures, 3 tables (with supplementary)
In-Context Imitation Learning via Next-Token Prediction
We explore how to enhance next-token prediction models to perform in-context imitation learning on a real robot, where the robot executes new tasks by interpreting contextual information provided during the input phase, without updating its underlying policy parameters. We propose In-Context Robot Transformer (ICRT), a causal transformer that performs autoregressive prediction on sensorimotor trajectories without relying on any linguistic data or reward function. This formulation enables flexible and training-free execution of new tasks at test time, achieved by prompting the model with sensorimotor trajectories of the new task composed of image observation, action, and state tuples, collected through human teleoperation. Experiments with a Franka Emika robot demonstrate that the ICRT can adapt to new tasks specified by prompts, even in environment configurations that differ from both the prompt and the training data. In a multitask environment setup, ICRT significantly outperforms current state-of-the-art next-token prediction models in robotics on generalizing to unseen tasks. Code, checkpoints and data are available on https://icrt.dev/
Proprioception Is All You Need: Terrain Classification for Boreal Forests IROS 2024
Recent works in field robotics highlighted the importance of resiliency against different types of terrains. Boreal forests, in particular, are home to many mobility-impeding terrains that should be considered for off-road autonomous navigation. Also, being one of the largest land biomes on Earth, boreal forests are an area where autonomous vehicles are expected to become increasingly common. In this paper, we address this issue by introducing BorealTC, a publicly available dataset for proprioceptive-based terrain classification (TC). Recorded with a Husky A200, our dataset contains 116 min of Inertial Measurement Unit (IMU), motor current, and wheel odometry data, focusing on typical boreal forest terrains, notably snow, ice, and silty loam. Combining our dataset with another dataset from the state-of-the-art, we evaluate both a Convolutional Neural Network (CNN) and the novel state space model (SSM)-based Mamba architecture on a TC task. Interestingly, we show that while CNN outperforms Mamba on each separate dataset, Mamba achieves greater accuracy when trained on a combination of both. In addition, we demonstrate that Mamba's learning capacity is greater than a CNN for increasing amounts of data. We show that the combination of two TC datasets yields a latent space that can be interpreted with the properties of the terrains. We also discuss the implications of merging datasets on classification. Our source code and dataset are publicly available online: https://github.com/norlab-ulaval/BorealTC.
comment: Accepted to the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Universal Trajectory Optimization Framework for Differential Drive Robot Class
Differential drive robots are widely used in various scenarios thanks to their straightforward principle, from household service robots to disaster response field robots. There are several types of driving mechanisms for real-world applications, including two-wheeled, four-wheeled skid-steering, tracked robots, and so on. The differences in the driving mechanisms usually require specific kinematic modeling when precise control is desired. Furthermore, the nonholonomic dynamics and possible lateral slip lead to different degrees of difficulty in getting feasible and high-quality trajectories. Therefore, a comprehensive trajectory optimization framework to compute trajectories efficiently for various kinds of differential drive robots is highly desirable. In this paper, we propose a universal trajectory optimization framework that can be applied to differential drive robots, enabling the generation of high-quality trajectories within a restricted computational timeframe. We introduce a novel trajectory representation based on polynomial parameterization of motion states or their integrals, such as angular and linear velocities, which inherently matches the robots' motion to the control principle. The trajectory optimization problem is formulated to minimize complexity while prioritizing safety and operational efficiency. We then build a full-stack autonomous planning and control system to demonstrate its feasibility and robustness. We conduct extensive simulations and real-world testing in crowded environments with three kinds of differential drive robots to validate the effectiveness of our approach.
comment: 15 pages, 15 figures
AnySkin: Plug-and-play Skin Sensing for Robotic Touch
While tactile sensing is widely accepted as an important and useful sensing modality, its use pales in comparison to other sensory modalities like vision and proprioception. AnySkin addresses the critical challenges that impede the use of tactile sensing -- versatility, replaceability, and data reusability. Building on the simplistic design of ReSkin, and decoupling the sensing electronics from the sensing interface, AnySkin simplifies integration making it as straightforward as putting on a phone case and connecting a charger. Furthermore, AnySkin is the first uncalibrated tactile-sensor with cross-instance generalizability of learned manipulation policies. To summarize, this work makes three key contributions: first, we introduce a streamlined fabrication process and a design tool for creating an adhesive-free, durable and easily replaceable magnetic tactile sensor; second, we characterize slip detection and policy learning with the AnySkin sensor; and third, we demonstrate zero-shot generalization of models trained on one instance of AnySkin to new instances, and compare it with popular existing tactile solutions like DIGIT and ReSkin. Videos of experiments, fabrication details and design files can be found on https://any-skin.github.io/
Soft Acoustic Curvature Sensor: Design and Development
This paper introduces a novel Soft Acoustic Curvature (SAC) sensor. SAC incorporates integrated audio components and features an acoustic channel within a flexible structure. A reference acoustic wave, generated by a speaker at one end of the channel, propagates and is received by a microphone at the other end of the channel. Our previous study revealed that acoustic wave energy dissipation varies with acoustic channel deformation, leading us to design a novel channel capable of large deformation due to bending. We then use Machine Learning (ML) models to establish a complex mapping between channel deformations and sound modulation. Various sound frequencies and ML models were evaluated to enhance curvature detection accuracy. The sensor, constructed using soft material and 3D printing, was validated experimentally, with curvature measurement errors remaining within 3.5 m^-1 for a range of 0 to 60 m^-1 curvatures. These results demonstrate the effectiveness of the proposed method for estimating curvatures. With its flexible structure, the SAC sensor holds potential for applications in soft robotics, including shape measurement for continuum manipulators, soft grippers, and wearable devices.
comment: To appear in Robotics and Automation Letters
Deep Bayesian Future Fusion for Self-Supervised, High-Resolution, Off-Road Mapping
High-speed off-road navigation requires long-range, high-resolution maps to enable robots to safely navigate over different surfaces while avoiding dangerous obstacles. However, due to limited computational power and sensing noise, most approaches to off-road mapping focus on producing coarse (20-40cm) maps of the environment. In this paper, we propose Future Fusion, a framework capable of generating dense, high-resolution maps from sparse sensing data (30m forward at 2cm). This is accomplished by (1) the efficient realization of the well-known Bayes filtering within the standard deep learning models that explicitly accounts for the sparsity pattern in stereo and LiDAR depth data, and (2) leveraging perceptual losses common in generative image completion. The proposed methodology outperforms the conventional baselines. Moreover, the learned features and the completed dense maps lead to improvements in the downstream navigation task.
Learning Adaptive Multi-Objective Robot Navigation Incorporating Demonstrations
Preference-aligned robot navigation in human environments is typically achieved through learning-based approaches, utilizing user feedback or demonstrations for personalization. However, personal preferences are subject to change and might even be context-dependent. Yet traditional reinforcement learning (RL) approaches with static reward functions often fall short in adapting to these varying user preferences, inevitably reflecting demonstrations once training is completed. This paper introduces a framework that combines multi-objective reinforcement learning (MORL) with demonstration-based learning. Our approach allows for dynamic adaptation to changing user preferences without retraining. It fluently modulates between reward-defined preference objectives and the amount of demonstration data reflection. Through rigorous evaluations, including a sim-to-real transfer on two robots, we demonstrate our framework's capability to reflect user preferences accurately while achieving high navigational performance in terms of collision avoidance and goal pursuance.
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Vision-Language-Action (VLA) models have shown remarkable potential in visuomotor control and instruction comprehension through end-to-end learning processes. However, current VLA models face significant challenges: they are slow during inference and require extensive pre-training on large amounts of robotic data, making real-world deployment difficult. In this paper, we introduce a new family of compact vision-language-action models, called TinyVLA, which offers two key advantages over existing VLA models: (1) faster inference speeds, and (2) improved data efficiency, eliminating the need for a pre-training stage. Our framework incorporates two essential components to build TinyVLA: (1) initializing the policy backbone with robust, high-speed multimodal models, and (2) integrating a diffusion policy decoder during fine-tuning to enable precise robot actions. We conducted extensive evaluations of TinyVLA both in simulation and on real robots, demonstrating that our approach significantly outperforms the state-of-the-art VLA model, OpenVLA, in terms of speed and data efficiency, while delivering comparable or superior performance. Additionally, TinyVLA exhibits strong generalization capabilities across various dimensions, including language instructions, novel objects, unseen positions, changes in object appearance, background variations, and environmental shifts, often matching or exceeding the performance of OpenVLA. We believe that TinyVLA offers an interesting perspective on utilizing pre-trained multimodal models for policy learning. Our project is at https://tiny-vla.github.io.
comment: add more citations
FracGM: A Fast Fractional Programming Technique for Geman-McClure Robust Estimator
Robust estimation is essential in computer vision, robotics, and navigation, aiming to minimize the impact of outlier measurements for improved accuracy. We present a fast algorithm for Geman-McClure robust estimation, FracGM, leveraging fractional programming techniques. This solver reformulates the original non-convex fractional problem to a convex dual problem and a linear equation system, iteratively solving them in an alternating optimization pattern. Compared to graduated non-convexity approaches, this strategy exhibits a faster convergence rate and better outlier rejection capability. In addition, the global optimality of the proposed solver can be guaranteed under given conditions. We demonstrate the proposed FracGM solver with Wahba's rotation problem and 3-D point-cloud registration along with relaxation pre-processing and projection post-processing. Compared to state-of-the-art algorithms, when the outlier rates increase from 20% to 80%, FracGM shows 53% and 88% lower rotation and translation increases. In real-world scenarios, FracGM achieves better results in 13 out of 18 outcomes, while having a 19.43% improvement in the computation time.
comment: 8 pages, 6 figures
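To make the Geman-McClure robust estimation mentioned in the FracGM abstract concrete, here is a minimal sketch of outlier down-weighting on a 1-D location problem. It uses plain iteratively reweighted least squares (IRLS), not the fractional programming solver described in the abstract, and it assumes one common parameterization of the loss, rho(r) = c^2 r^2 / (c^2 + r^2). The names `gm_weight` and `robust_mean` and the toy data are hypothetical.

```python
import statistics


def gm_weight(residual, c):
    """IRLS weight for rho(r) = c^2 r^2 / (c^2 + r^2), up to a constant factor."""
    return (c * c / (c * c + residual * residual)) ** 2


def robust_mean(samples, c=1.0, iterations=20):
    """Estimate a 1-D location parameter by iteratively reweighted least squares."""
    estimate = statistics.median(samples)  # outlier-tolerant starting point
    for _ in range(iterations):
        weights = [gm_weight(x - estimate, c) for x in samples]
        estimate = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return estimate


data = [0.9, 0.95, 1.0, 1.05, 1.1, 50.0]  # the last value is an outlier
plain = sum(data) / len(data)  # ordinary mean, pulled toward the outlier
robust = robust_mean(data)     # stays close to the inlier cluster
```

Large residuals receive weights close to zero, so the outlier barely influences the estimate, whereas the ordinary mean is dragged far from the inliers.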
Multiagent Systems
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions ICRA 2025
We address the challenge of safe control in decentralized multi-agent robotic settings, where agents use uncertain black-box models to predict other agents' trajectories. We use the recently proposed conformal decision theory to adapt the restrictiveness of control barrier functions-based safety constraints based on observed prediction errors. We use these constraints to synthesize controllers that balance between the objectives of safety and task accomplishment, despite the prediction errors. We provide an upper bound on the average over time of the value of a monotonic function of the difference between the safety constraint based on the predicted trajectories and the constraint based on the ground truth ones. We validate our theory through experimental results showing the performance of our controllers when navigating a robot in the multi-agent scenes in the Stanford Drone Dataset.
comment: 6 pages, 1 figure, submitted for ICRA 2025
Facility Location Problem with Aleatory Agents
In this paper, we introduce and study the Facility Location Problem with Aleatory Agents (FLPAA), where the facility accommodates n agents, a number larger than the number n_r of agents reporting their preferences. The spare capacity is used by n_u=n-n_r aleatory agents sampled from a probability distribution \mu. The goal of FLPAA is to find a location that minimizes the ex-ante social cost, which is the expected cost of the n_u agents sampled from \mu plus the cost incurred by the agents reporting their position. We investigate the mechanism design aspects of the FLPAA under the assumption that the Mechanism Designer (MD) lacks knowledge of the distribution \mu but can query k quantiles of \mu. We explore the trade-off between acquiring more insights into the probability distribution and designing a better-performing mechanism, which we describe through the strong approximation ratio (SAR). The SAR of a mechanism measures the highest ratio between the cost of the mechanism and the cost of the optimal solution on the worst-case input x and worst-case distribution \mu, offering a metric for efficiency that does not depend on \mu. We divide our study into four different information settings: the zero information case, in which the MD has access to no quantiles; the median information case, in which the MD has access to the median of \mu; the n_u-quantile information case, in which the MD has access to n_u quantiles of its choice, and the k-quantile information case, in which the MD has access to k
comment: 27 pages, 2 figures
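The ex-ante social cost defined in the FLPAA abstract can be sketched as follows. The sketch makes two assumptions that the abstract does not state: an agent's cost is its absolute distance to the facility on a line, and the n_u aleatory agents are drawn independently from \mu, so their expected total cost is n_u times the expected cost of one draw. The function name and the use of a finite list `mu_samples` as a stand-in for \mu are hypothetical.

```python
def ex_ante_social_cost(y, reported, mu_samples, n_u):
    """Cost of facility location y: reporting agents plus expected aleatory agents.

    reported   -- positions of the n_r agents who report their preferences
    mu_samples -- finite sample standing in for the distribution mu (assumption)
    n_u        -- number of aleatory agents drawn from mu
    """
    reported_cost = sum(abs(y - x) for x in reported)
    expected_single = sum(abs(y - s) for s in mu_samples) / len(mu_samples)
    return reported_cost + n_u * expected_single


# Two reporting agents at 0 and 1; mu is concentrated at 0.5; two aleatory agents.
cost_mid = ex_ante_social_cost(0.5, [0.0, 1.0], [0.5], 2)
cost_left = ex_ante_social_cost(0.0, [0.0, 1.0], [0.5], 2)
```

In this toy instance, placing the facility at 0.5 costs 1.0 while placing it at 0.0 costs 2.0, so the aleatory agents shift the balance toward the centre of \mu.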
Toward Universal and Interpretable World Models for Open-ended Learning Agents
We introduce a generic, compositional and interpretable class of generative world models that supports open-ended learning agents. This is a sparse class of Bayesian networks capable of approximating a broad range of stochastic processes, which provide agents with the ability to learn world models in a manner that may be both interpretable and computationally scalable. This approach integrating Bayesian structure learning and intrinsically motivated (model-based) planning enables agents to actively develop and refine their world models, which may lead to open-ended learning and more robust, adaptive behavior.
comment: 4 pages including appendix, 6 including appendix and references; 2 figures
Intention-aware policy graphs: answering what, how, and why in opaque agents
Agents are a special kind of AI-based software in that they interact in complex environments and have increased potential for emergent behaviour. Explaining such emergent behaviour is key to deploying trustworthy AI, but the increasing complexity and opaque nature of many agent implementations makes this hard. In this work, we propose a Probabilistic Graphical Model along with a pipeline for designing such a model -- by which the behaviour of an agent can be deliberated about -- and for computing a robust numerical value for the intentions the agent has at any moment. We contribute measurements that evaluate the interpretability and reliability of explanations provided, and enable explainability questions such as `what do you want to do now?' (e.g. deliver soup), `how do you plan to do it?' (e.g. returning a plan that considers its skills and the world), and `why would you take this action at this state?' (e.g. explaining how that furthers or hinders its own goals). This model can be constructed by taking partial observations of the agent's actions and world states, and we provide an iterative workflow for increasing the proposed measurements through better design and/or pointing out irrational agent behaviour.
comment: 57 pages, 8 figures, 5 tables
Multi-agent Reinforcement Learning for Dynamic Dispatching in Material Handling Systems
This paper proposes a multi-agent reinforcement learning (MARL) approach to learn dynamic dispatching strategies, which are crucial for optimizing throughput in material handling systems across diverse industries. To benchmark our method, we developed a material handling environment that reflects the complexities of an actual system, such as various activities at different locations, physical constraints, and inherent uncertainties. To enhance exploration during learning, we propose a method to integrate domain knowledge in the form of existing dynamic dispatching heuristics. Our experimental results show that our method can outperform heuristics by up to 7.4 percent in terms of median throughput. Additionally, we analyze the effect of different architectures on MARL performance when training multiple agents with different functions. We also demonstrate that the MARL agents' performance can be further improved by using the first iteration of MARL agents as heuristics to train a second iteration of MARL agents. This work demonstrates the potential of applying MARL to learn effective dynamic dispatching strategies that may be deployed in real-world systems to improve business outcomes.
Explaining Explaining
Explanation is key to people having confidence in high-stakes AI systems. However, machine-learning-based systems -- which account for almost all current AI -- can't explain because they are usually black boxes. The explainable AI (XAI) movement hedges this problem by redefining "explanation". The human-centered explainable AI (HCXAI) movement identifies the explanation-oriented needs of users but can't fulfill them because of its commitment to machine learning. In order to achieve the kinds of explanations needed by real people operating in critical domains, we must rethink how to approach AI. We describe a hybrid approach to developing cognitive agents that uses a knowledge-based infrastructure supplemented by data obtained through machine learning when applicable. These agents will serve as assistants to humans who will bear ultimate responsibility for the decisions and actions of the human-robot team. We illustrate the explanatory potential of such agents using the under-the-hood panels of a demonstration system in which a team of simulated robots collaborate on a search task assigned by a human.
Plurals: A System for Guiding LLMs Via Simulated Social Ensembles
Recent debates raised concerns that language models may favor certain viewpoints. But what if the solution is not to aim for a 'view from nowhere' but rather to leverage different viewpoints? We introduce Plurals, a system and Python library for pluralistic AI deliberation. Plurals consists of Agents (LLMs, optionally with personas) which deliberate within customizable Structures, with Moderators overseeing deliberation. Plurals is a generator of simulated social ensembles. Plurals integrates with government datasets to create nationally representative personas, includes deliberation templates inspired by democratic deliberation theory, and allows users to customize both information-sharing structures and deliberation behavior within Structures. Six case studies demonstrate fidelity to theoretical constructs and efficacy. Three randomized experiments show simulated focus groups produced output resonant with an online sample of the relevant audiences (chosen over zero-shot generation in 75% of trials). Plurals is both a paradigm and a concrete system for pluralistic AI. The Plurals library is available at https://github.com/josh-ashkinaze/plurals and will be continually updated.
STROOBnet Optimization via GPU-Accelerated Proximal Recurrence Strategies
Spatiotemporal networks' observational capabilities are crucial for accurate data gathering and informed decisions across multiple sectors. This study focuses on the Spatiotemporal Ranged Observer-Observable Bipartite Network (STROOBnet), linking observational nodes (e.g., surveillance cameras) to events within defined geographical regions, enabling efficient monitoring. Using data from Real-Time Crime Camera (RTCC) systems and Calls for Service (CFS) in New Orleans, where RTCC combats rising crime amidst reduced police presence, we address the network's initial observational imbalances. Aiming for uniform observational efficacy, we propose the Proximal Recurrence approach. It outperformed traditional clustering methods like k-means and DBSCAN by offering holistic event frequency and spatial consideration, enhancing observational coverage.
comment: 10 pages, 17 figures, 2023 IEEE International Conference on Big Data (BigData)
A Stochastic Geo-spatiotemporal Bipartite Network to Optimize GCOOS Sensor Placement Strategies
This paper proposes two new measures applicable in a spatial bipartite network model: coverage and coverage robustness. The bipartite network must consist of observer nodes, observable nodes, and edges that connect observer nodes to observable nodes. The coverage and coverage robustness scores evaluate the effectiveness of the observer node placements. These measures are beneficial for stochastic data, as they may be coupled with Monte Carlo simulations to identify optimal placements for new observer nodes. In this paper, we construct a Geo-SpatioTemporal Bipartite Network (GSTBN) within the stochastic and dynamical environment of the Gulf of Mexico. This GSTBN consists of GCOOS sensor nodes and HYCOM Region of Interest (RoI) event nodes. The goal is to identify optimal placements to expand GCOOS to improve the forecasting outcomes by the HYCOM ocean prediction model.
comment: 7 pages, 6 figures, 2022 IEEE International Conference on Big Data (Big Data)
Systems and Control (CS)
SensoPatch: A Reconfigurable Haptic Feedback with High-Density Tactile Sensing Glove
Haptic feedback is integral to the improved experience of prosthetic users and the reduction in prosthesis rejection. Prior studies have explored various methods to encode tactile information and deliver vibration feedback. However, a comprehensive study comparing performance across different stimulation locations and feedback modalities for wearable devices is absent, and there is no test platform. This paper proposes an open-source reconfigurable haptic feedback system which incorporates 25 sensors and wireless communication to allow a customized number of vibration motors, adjustable motor placement, and programmable encoding of tactile data to change feedback modalities. To demonstrate potential studies that can be investigated using SensoPatch, we conducted two experiments: 1) to assess the vibration discrimination accuracy on 3 body parts, and 2) to assess the effect of 6 methods of mapping tactile data to a varying number of motors on object manipulation. SensoPatch utilizes low-cost off-the-shelf components, enabling large-scale comparative studies of feedback modalities and stimulation sites to optimize vibrotactile feedback and facilitate its deployment in upper limb prostheses.
comment: 5 pages, 5 figures, 1 table, to be published in 2024 IEEE Biomedical Circuits and Systems Conference (BioCAS)
Towards Energy- and Cost-Efficient 6G Networks
As the world enters the journey toward the 6th generation (6G) of wireless technology, the promises of ultra-high data rates, unprecedented low latency, and a massive surge in connected devices require crucial exploration of network energy saving (NES) solutions to minimize the carbon footprint and overall energy usage of future cellular networks. On the other hand, network-controlled repeaters (NCRs) have been introduced by the 3rd Generation Partnership Project (3GPP) as a cost-effective solution to improve network coverage. However, their impact on network power consumption and energy efficiency has not been thoroughly investigated. This paper studies NES schemes for next-generation 6G networks aided by NCRs and proposes optimal NES strategies aiming at maximizing the overall energy efficiency of the network. Repeaters are shown to allow for power savings at the next-generation nodeB (gNB) and to offer higher overall energy efficiency (EE) and spectral efficiency (SE), thus providing an energy-efficient and cost-efficient alternative to increase the performance of future 6G networks.
comment: 7 pages, conference
Calibrating microscopic traffic models with macroscopic data
Traffic microsimulation is a crucial tool that uses microscopic traffic models, such as car-following and lane-change models, to simulate the trajectories of individual agents. This digital platform allows for the assessment of the impact of emerging technologies on transportation system performance. While these microscopic models are based on mathematical structures, their parameters must be fitted to real-world data through a process called model calibration. Despite extensive studies on calibration, the focus has predominantly been on fitting microscopic data, such as trajectories, rather than evaluating how well the models reproduce macroscopic traffic patterns, such as congestion, bottlenecks, and traffic waves. In this work, we address this gap by calibrating microscopic traffic flow models using macroscopic (aggregated) data, which is more readily accessible. We designed a SUMO-in-the-loop calibration framework with the goal of replicating observed macroscopic traffic features. To assess calibration accuracy, we developed a set of performance measures that evaluate the models' ability to replicate traffic states across the entire spatiotemporal domain and other qualitative characteristics of traffic flow. The calibration method was applied to both a synthetic scenario and a real-world scenario on a segment of Interstate 24, to demonstrate its effectiveness in reproducing observed traffic patterns.
Improved formulation for long-duration storage in capacity expansion models using representative periods
With the increasing complexity and size of capacity expansion models, temporal aggregation has emerged as a common method to improve computational tractability. However, this approach inherently complicates the inclusion of long-duration storage (LDS) systems, whose operation involves the entire time horizon connecting all time steps. This work presents a detailed investigation of LDS modelling with temporal aggregation. A novel compact formulation is proposed to reduce the number of constraints while effectively tracking the storage content and enforcing limits on the state of charge throughout the entire time horizon. The developed method is compared with two leading state-of-the-art formulations. All three methods are implemented in the Dolphyn capacity expansion model and tested on a case study for the continental United States, considering different configurations in terms of spatial resolutions and representative periods. The performance is assessed with both the commercial solver Gurobi and the open-source solver HiGHS. Results show that the developed compact formulation consistently outperforms the other methods in terms of both runtime (30%-70% faster than other methods) and memory usage (1%-9% lower than other methods).
Joint Optimization of Pattern, Headway, and Fleet Size of Multiple Urban Transit Lines with Perceived Headway Consideration and Passenger Flow Allocation
This study addresses the urban transit pattern design problem, optimizing stop sequences, headways, and fleet sizes across multiple routes simultaneously to minimize user costs (composed of riding, waiting, and transfer times) under operational constraints (e.g., vehicle capacity and fleet size). A destination-labeled multi-commodity network flow (MCNF) formulation is developed to solve the problem at a large scale more efficiently compared to the previous literature. The model allows for flexible pattern options without relying on pre-defined candidate sets and simultaneously considers multiple operational strategies such as express/local services, short-turning, and deadheading. It evaluates perceived headways of joint patterns for passengers, assigns passenger flows to each pattern accordingly, and allows transfers across patterns in different directions. The mixed-integer linear programming (MILP) model is demonstrated with a city-sized network of metro lines in Chicago, USA, achieving near-optimal solutions in hours. The total weighted journey times are reduced by 0.61% and 4.13% under single-route and multi-route scenarios respectively. The model provides transit agencies with an efficient tool for comprehensive service design and resource allocation, improving service quality and resource utilization without additional operational costs.
comment: 23 pages, 3 figures, a previous version accepted for presentation in the 104th Transportation Research Board Annual Meeting in Washington, D.C. in January 2025
Robust Proximity Operations using Probabilistic Markov Models ICRA 2025
A Markov decision process-based state switching is devised, implemented, and analyzed for proximity operations of various autonomous vehicles. The framework contains a pose estimator along with a multi-state guidance algorithm. The unified pose estimator leverages the extended Kalman filter for the fusion of measurements from rate gyroscopes, monocular vision, and ultra-wideband radar sensors. It is also equipped with Mahalanobis distance-based outlier rejection and under-weighting of measurements for robust performance. The use of probabilistic Markov models to transition between various guidance modes is proposed to enable robust and efficient proximity operations. Finally, the framework is validated through an experimental analysis of the docking of two small satellites and the precision landing of an aerial vehicle.
comment: This work has been submitted to the IEEE ICRA 2025 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. Accompanying video : https://youtu.be/8-fetyf_SrM. arXiv admin note: text overlap with arXiv:2409.09665
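The Mahalanobis distance-based outlier rejection mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a 2-D measurement, an innovation covariance `S` supplied by the filter, and a 99% chi-square gate; the function names and the gate value are illustrative choices, not details taken from the paper.

```python
CHI2_GATE_2DOF_99 = 9.21  # 99% quantile of chi-square with 2 degrees of freedom


def mahalanobis_sq_2d(innovation, S):
    """Squared Mahalanobis distance nu^T S^-1 nu for a 2-D innovation."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    nx, ny = innovation
    # Apply the closed-form inverse of the 2x2 covariance to the innovation.
    ix = (d * nx - b * ny) / det
    iy = (-c * nx + a * ny) / det
    return nx * ix + ny * iy


def accept_measurement(z, z_pred, S, gate=CHI2_GATE_2DOF_99):
    """Return True if the measurement passes the gate (hypothetical helper)."""
    innovation = (z[0] - z_pred[0], z[1] - z_pred[1])
    return mahalanobis_sq_2d(innovation, S) <= gate
```

A measurement that fails the gate would simply be skipped in the Kalman update; how the paper combines this with under-weighting is not stated in the abstract.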
Robust Deep Reinforcement Learning for Volt-VAR Optimization in Active Distribution System under Uncertainty
The deep reinforcement learning (DRL) based Volt-VAR optimization (VVO) methods have been widely studied for active distribution networks (ADNs). However, most of them lack safety guarantees in terms of power injection uncertainties due to the increase in distributed energy resources (DERs) and load demand, such as electric vehicles. This article proposes a robust deep reinforcement learning (RDRL) framework for VVO via a robust deep deterministic policy gradient (DDPG) algorithm. This algorithm can effectively manage hybrid action spaces, considering control devices like capacitors, voltage regulators, and smart inverters. Additionally, it is designed to handle uncertainties by quantifying uncertainty sets with conformal prediction and modeling uncertainties as adversarial attacks to guarantee safe exploration across action spaces. Numerical results on three IEEE test cases demonstrate the sample efficiency and safety of the proposed robust DDPG against uncertainties compared to the benchmark algorithms.
Robust and efficient data-driven predictive control
We propose a robust and efficient data-driven predictive control (eDDPC) scheme which is more sample efficient (requires less offline data) compared to existing schemes, and is also computationally efficient. This is done by leveraging an alternative data-based representation of the trajectories of linear time-invariant (LTI) systems. The proposed scheme relies only on using (short and potentially irregularly measured) noisy input-output data, the amount of which is independent of the prediction horizon. To account for measurement noise, we provide a novel result that quantifies the uncertainty between the true (unknown) restricted behavior of the system and the estimated one from noisy data. Furthermore, we show that the robust eDDPC scheme is recursively feasible and that the resulting closed-loop system is practically stable. Finally, we compare the performance of this scheme to existing ones on a case study of a four tank system.
comment: 17 pages, 2 figures, submitted for Automatica
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions ICRA 2025
We address the challenge of safe control in decentralized multi-agent robotic settings, where agents use uncertain black-box models to predict other agents' trajectories. We use the recently proposed conformal decision theory to adapt the restrictiveness of control barrier functions-based safety constraints based on observed prediction errors. We use these constraints to synthesize controllers that balance between the objectives of safety and task accomplishment, despite the prediction errors. We provide an upper bound on the average over time of the value of a monotonic function of the difference between the safety constraint based on the predicted trajectories and the constraint based on the ground truth ones. We validate our theory through experimental results showing the performance of our controllers when navigating a robot in the multi-agent scenes in the Stanford Drone Dataset.
comment: 6 pages, 1 figure, submitted for ICRA 2025
Path Following Model Predictive Control of a Coupled Autonomous Underwater Vehicle
The operation of an autonomous underwater vehicle (AUV) faces challenges in following predetermined waypoints due to coupled motions under environmental disturbances. To address this, a 3D path following guidance and control system is developed in this work based on the line-of-sight (LOS) guidance method. Conventionally, the 3D path following problem is transformed into heading and depth control problems, assuming that the motion of the vehicle is decoupled in horizontal and depth coordinates. The proposed control system design avoids this simplifying assumption by transforming the problem into a 3D position and orientation tracking problem. This design is achieved by computing a 2D horizontal coordinate based on the desired heading and then computing a corresponding LOS depth coordinate. A model predictive controller (MPC) is then implemented using the 3D LOS coordinate and the computed orientation vector. The MPC obtains a robust control by solving a minimax optimisation problem considering the effects of unknown ocean disturbances. The effectiveness of the proposed guidance and control system is demonstrated through the simulation of a prototype AUV system. Numerical results show that the AUV can follow predetermined waypoints in the presence of time-varying disturbances, and the system is steered at a constant surge speed that is proportional to the radius of the circle of acceptance used to implement the guidance system.
comment: 6 pages, 4 figures, Presented at the IFAC CAMS 2024, Virginia, USA
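As a rough illustration of the line-of-sight (LOS) guidance idea referenced above, the sketch below computes a desired heading for the horizontal plane only, using the common lookahead-based LOS law. The function name, the 2-D restriction, and the fixed lookahead distance are assumptions made for the example; the paper's 3D formulation and its MPC are not reproduced here.

```python
import math


def los_heading(p_prev, p_next, pos, lookahead):
    """Lookahead-based LOS: return (desired heading, cross-track error)."""
    # Angle of the path segment between the two waypoints.
    path_angle = math.atan2(p_next[1] - p_prev[1], p_next[0] - p_prev[0])
    dx, dy = pos[0] - p_prev[0], pos[1] - p_prev[1]
    # Cross-track error: signed lateral offset of the vehicle from the segment.
    cross_track = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
    # Steer toward a point `lookahead` units ahead along the path.
    heading = path_angle + math.atan2(-cross_track, lookahead)
    return heading, cross_track
```

A smaller lookahead gives more aggressive convergence to the path, while a larger one gives smoother but slower convergence.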
Hierarchical Federated ADMM
In this paper, we depart from the widely-used gradient descent-based hierarchical federated learning (FL) algorithms to develop a novel hierarchical FL framework based on the alternating direction method of multipliers (ADMM). Within this framework, we propose two novel FL algorithms, which both use ADMM in the top layer: one that employs ADMM in the lower layer and another that uses the conventional gradient descent-based approach. The proposed framework enhances privacy, and experiments demonstrate the superiority of the proposed algorithms compared to the conventional algorithms in terms of learning convergence and accuracy. Additionally, gradient descent on the lower layer performs well even if the number of local steps is very limited, while ADMM on both layers leads to better performance otherwise.
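For readers unfamiliar with ADMM in a federated setting, the following is a minimal single-layer consensus ADMM sketch, not the hierarchical algorithms proposed in the paper. It assumes each client `i` holds a scalar quadratic loss `0.5 * (x - a_i)^2`, so the consensus optimum is the mean of the `a_i`; the function name and parameters are illustrative.

```python
def consensus_admm(local_targets, rho=1.0, iterations=60):
    """Consensus ADMM for min sum_i 0.5 * (x_i - a_i)^2 subject to x_i = z."""
    n = len(local_targets)
    z = 0.0
    u = [0.0] * n  # scaled dual variables, one per client
    for _ in range(iterations):
        # Local step: each client minimizes its loss plus the proximal term.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(local_targets, u)]
        # Aggregation step: the server averages the dual-shifted local models.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each client's disagreement with the consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z
```

In a hierarchical variant, the aggregation step would itself be split across edge servers and a top-level server; that structure is omitted here.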
Asymptotic tracking control of dynamic reference over homomorphically encrypted data with finite modulus
This paper considers a tracking control problem, in which the dynamic controller is encrypted with an additively homomorphic encryption scheme and the output of a process tracks a dynamic reference asymptotically. Our paper is motivated by the following problem: When dealing with both asymptotic tracking and dynamic reference, we find that the control input is generally subject to overflow issues under a finite modulus, though the dynamic controller consists of only integer coefficients. First, we provide a new controller design method such that the coefficients of the tracking controller can be transformed into integers leveraging the zooming-in factor of dynamic quantization. By the Cayley-Hamilton theorem, we represent the control input as a linear combination of the previous control inputs. Leveraging the property above, we design an algorithm on the actuator side such that it can restore the control input from the lower bits under a finite modulus. A lower bound of the modulus is also provided. As an extension of the first result, we further solve the problem of unbounded internal state taking place in the actuator. In particular, the actuator can restore the correct control input under the same modulus. A simulation example is provided to verify the control schemes proposed in our paper.
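The idea of restoring a value from its lower bits under a finite modulus can be illustrated generically: if the receiver has a prediction that is closer than half the modulus to the true integer, the residue pins down the value uniquely. The sketch below shows only this arithmetic fact; the function name is hypothetical, and how the paper obtains its prediction (via the Cayley-Hamilton-based recursion) is not reproduced.

```python
def restore_from_residue(residue, prediction, modulus):
    """Recover an integer from its residue, given a nearby integer prediction."""
    half = modulus // 2
    # Map the difference to the centered range [-(modulus // 2), modulus - modulus // 2).
    diff = (residue - prediction + half) % modulus - half
    return prediction + diff
```

The recovery is correct only while the prediction error stays inside that centered range, which is why a lower bound on the modulus matters.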
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Learning from Demonstration (LfD) is a useful paradigm for training policies that solve tasks involving complex motions. In practice, the successful application of LfD requires overcoming error accumulation during policy execution, i.e. the problem of drift due to errors compounding over time and the consequent out-of-distribution behaviours. Existing works seek to address this problem through scaling data collection, correcting policy errors with a human-in-the-loop, temporally ensembling policy predictions or through learning the parameters of a dynamical system model. In this work, we propose and validate an alternative approach to overcoming this issue. Inspired by reservoir computing, we develop a novel neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties. We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting Dataset. Through empirical experiments we demonstrate that incorporating our layer into existing neural network architectures addresses the issue of compounding errors in LfD. Furthermore, we perform a comparative evaluation against existing approaches including a temporal ensemble of policy predictions and an Echo State Network (ESN) implementation. We find that our approach yields greater policy precision and robustness on the handwriting task while also generalising to multiple dynamics regimes and maintaining competitive latency scores.
comment: 21 pages, 9 figures
Dual Pricing to Prioritize Renewable Energy and Consumer Preferences in Electricity Markets
Electricity markets currently fail to incorporate preferences of buyers, treating polluting and renewable energy sources as having equal social benefit under a system of uniform clearing prices. Meanwhile, renewable energy is prone to curtailment due to transmission constraints, forcing grid operators to reduce or shut down renewable energy production despite its availability and need. This paper proposes a "dual pricing mechanism" which allows buyers to bid both their willingness to pay for electricity, and additionally, their preference for green energy. Designed for use in deregulated electricity markets, this mechanism prioritizes the dispatch of more renewable energy sources according to consumer preferences. Traditional uniform clearing prices, which treat all energy sources equally, do not reflect the growing share of green energy in the power grid and the environmental values of consumers. By allowing load-serving entities to bid their willingness to pay for renewable energy directly into the clearing market, our proposed framework generates distinct pricing signals for green and "black" electricity.
Transparency evaluation for the Kinematic Design of the Harnesses through Human-Exoskeleton Interaction Modeling
Lower Limb Exoskeletons (LLEs) are wearable robots that provide mechanical power to the user. Human-exoskeleton (HE) connections must preserve the user's natural behavior during the interaction, avoiding undesired forces. Therefore, numerous works focus on their minimization. Given the inherent complications of repeatedly prototyping and experimentally testing a device, modeling the exoskeleton and its physical interaction with the user emerges as a valuable approach for assessing the design effects. This paper proposes a novel method to compare different exoskeleton configurations with a flexible simulation tool. This approach contemplates simulating the dynamics of the device, including its interaction with the wearer, to evaluate multiple connection mechanism designs along with the kinematics and actuation of the LLE. This evaluation is based on the minimization of the interaction wrenches through an optimization process that includes the impedance parameters at the interfaces as optimization variables and the similarity of the LLE's joint variables trajectories with the motion of the wearer's articulations. Exploratory tests are conducted using the Wearable Walker LLE in different configurations and measuring the interaction forces. Experimental data are then compared to the optimization outcomes, proving that the proposed method provides contact wrench estimations consistent with the collected measurements and previous outcomes from the literature.
A History-Guided Regional Partitioning Evolutionary Optimization for Solving the Flexible Job Shop Problem with Limited Multi-load Automated Guided Vehicles
In a flexible job shop environment, using Automated Guided Vehicles (AGVs) to transport jobs and process materials is an important way to promote the intelligence of the workshop. Compared with single-load AGVs, multi-load AGVs can improve AGV utilization, reduce path conflicts, etc. Therefore, this study proposes a history-guided regional partitioning algorithm (HRPEO) for the flexible job shop scheduling problem with limited multi-load AGVs (FJSPMA). First, the encoding and decoding rules are designed according to the characteristics of multi-load AGVs, and then the initialization rule based on the branch and bound method is used to generate the initial population. Second, to prevent the algorithm from falling into a local optimum, the algorithm adopts a regional partitioning strategy. This strategy divides the solution space into multiple regions and measures the potential of the regions. After that, it groups the regions into multiple clusters in each iteration and selects individuals for evolutionary search based on the set of clusters. Third, a local search strategy is designed to improve the exploitation ability of the algorithm, which uses a greedy approach to optimize machine selection and transportation sequence according to the characteristics of FJSPMA. Finally, a large number of experiments are carried out on the benchmarks to test the performance of the algorithm. Compared with multiple advanced algorithms, the results show that HRPEO has an advantage in solving FJSPMA.
comment: 14 pages
On Adaptive Frequency Sampling for Data-driven MOR Applied to Antenna Responses
Frequency domain sweeps of array antennas are well-known to be time-intensive, and different surrogate models have been used to improve the performance. Data-driven model order reduction algorithms, such as the Loewner framework and vector fitting, can be integrated with adaptive error estimates in an iterative algorithm to reduce the number of full-wave simulations required to accurately capture the requested frequency behavior of multiport array antennas. In this work, we propose two novel adaptive methods exploiting a block matrix function which is a key part of the Loewner framework generating system approach. The first algorithm leverages an inherent matrix parameter freedom in the block matrix function to identify frequency points with large errors, whereas the second utilizes the condition number of the block matrix function. Both methods effectively provide frequency domain error estimates, essential for improved performance. Numerical experiments on multiport array antenna S-parameters demonstrate the effectiveness of our proposed algorithms within the Loewner framework.
comment: 10 pages, 12 figures
Pseudometrics for scalable data-driven comparisons of nonlinear dynamical systems
Novel solutions for pseudometrics quantifying deviation from topological conjugacy between dynamical systems are presented. Deviation from conjugacy is quantified in a Pareto optimal sense that accounts for spectral properties of Koopman operators as well as trajectory geometry. Theoretical justification is provided for computing such pseudometrics in Koopman eigenfunction space rather than observable space. Furthermore, it is shown that deriving the pseudometrics from unitary transformations is sufficient to recover a value of zero if two systems are topologically conjugate. Therefore, the pseudometrics for quantifying deviation from conjugacy are based on analytical solutions for unitary transformations in Koopman eigenfunction space. Finally, geometric considerations for the Pareto optimality problem associated with deviation from conjugacy are used to develop pseudometrics that account for all possible solutions given just two Pareto points based on analytical solutions.
Impact of number of elements on the directivity of planar array of monopole antenna
This research investigates how the number of elements affects the directivity of a planar array of monopole antennas. This study also takes into account the antenna's effect on the whole field it radiates. The monopole antennas are arranged in a planar configuration with all the components in their proper locations using the Hadamard matrix approach. Each matrix's directivities and array factors were calculated, and a MATLAB tool was used to simulate the radiation pattern. Planar layouts ranging from 4 × 4 to 50 × 50 elements were taken into consideration during the investigation. Increasing the number of elements in the planar array resulted in a great improvement in directivity, as seen in the computed and simulated results. Consequently, by increasing the antenna's directivity, a greater number of elements influences the overall field emitted.
comment: 8 pages, 19 Figures, article
Pseudo-kinematic trajectory control of tracked vehicles
Tracked vehicles are used in challenging scenarios, where motion planning and navigation can be very difficult. They have complex dynamics, with many parameters that are difficult to identify and that change significantly based on the operating conditions. We propose a simple pseudo-kinematic model, where the intricate dynamic effects underlying the vehicle's motion are captured in a small set of velocity-dependent parameters. This choice enables the development of a Lyapunov-based trajectory controller with guaranteed performance and small computation time. We demonstrate the correctness of our approach with both simulation and experimental data.
Towards Event-Triggered NMPC for Efficient 6G Communications: Experimental Results and Open Problems
Networked control systems enable real-time control and coordination of distributed systems, leveraging the low latency, high reliability, and massive connectivity offered by 5G and future 6G networks. Applications include autonomous vehicles, robotics, industrial automation, and smart grids. Despite networked control algorithms admitting nominal stability guarantees even in the presence of delays and packet dropouts, their practical performance still heavily depends on the specific characteristics and conditions of the underlying network. To achieve the desired performance while efficiently using communication resources, co-design of control and communication is pivotal. Although periodic schemes, where communication instances are fixed, can provide reliable control performance, unnecessary transmissions, when updates are not needed, result in inefficient usage of network resources. In this paper, we investigate the potential for co-design of model predictive control and network communication. To this end, we design and implement an event-triggered nonlinear model predictive controller for stabilizing a Furuta pendulum communicating over a tailored open radio access network 6G research platform. We analyze the control performance as well as network utilization under varying channel conditions and event-triggering criteria. Our results show that the event-triggered control scheme achieves similar performance to periodic control with reduced communication demand.
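The contrast between periodic and event-triggered communication described above can be sketched with a simple send-on-delta rule: a state update is transmitted only when it differs from the last transmitted state by more than a threshold. The class name, the Euclidean-norm criterion, and the threshold are assumptions made for illustration; the abstract does not specify the triggering criteria used in the experiments.

```python
import math


class EventTrigger:
    """Send a state update only when it deviates enough from the last one sent."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.last_sent = None  # nothing has been transmitted yet

    def should_send(self, state):
        # Always send the first sample; afterwards send only on large deviation.
        if self.last_sent is None or math.dist(state, self.last_sent) > self.threshold:
            self.last_sent = tuple(state)
            return True
        return False
```

A periodic scheme corresponds to always returning `True`; raising the threshold trades control accuracy for fewer transmissions.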
Analysis of Truncated Singular Value Decomposition for Koopman Operator-Based Lane Change Model
Understanding and modeling complex dynamic systems is crucial for enhancing vehicle performance and safety, especially in the context of autonomous driving. Recently, popular methods such as Koopman operators and their approximators, known as Extended Dynamic Mode Decomposition (EDMD), have emerged for their effectiveness in transforming strongly nonlinear system behavior into linear representations. This allows them to be integrated with conventional linear controllers. To achieve this, Singular Value Decomposition (SVD), specifically truncated SVD, is employed to approximate Koopman operators from extensive datasets efficiently. This study evaluates different basis functions used in EDMD and ranks for truncated SVD for representing lane change behavior models, aiming to balance computational efficiency with information loss. The findings, however, suggest that the technique of truncated SVD does not necessarily achieve substantial reductions in computational training time and results in significant information loss.
comment: Submitted to the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024)
Unscented Transform-based Pure Pursuit Path-Tracking Algorithm under Uncertainty
Automated driving has become more and more popular due to its potential to eliminate road accidents by taking over driving tasks from humans. One of the remaining challenges is to follow a planned path autonomously, especially when uncertainties in self-localizing or understanding the surroundings can influence the decisions made by autonomous vehicles, such as calculating how much they need to steer to minimize tracking errors. In this paper, a modified geometric pure pursuit path-tracking algorithm is proposed, taking into consideration such uncertainties using the unscented transform. The algorithm is tested through simulations for typical road geometries, such as straight and circular lines.
comment: Submitted to the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024)
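As a rough illustration of combining pure pursuit with the unscented transform, the sketch below propagates a heading uncertainty through the standard geometric pure pursuit steering law. It is not the paper's algorithm: only the heading is treated as uncertain, and the function names, the `kappa` value, and the example numbers are assumptions.

```python
import math


def pure_pursuit_steering(x, y, yaw, target, wheelbase):
    """Standard geometric pure pursuit steering angle toward a look-ahead point."""
    dx, dy = target[0] - x, target[1] - y
    lookahead = math.hypot(dx, dy)
    alpha = math.atan2(dy, dx) - yaw          # bearing of the target in the vehicle frame
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


def unscented_steering(x, y, yaw_mean, yaw_std, target, wheelbase, kappa=2.0):
    """Mean and variance of the steering angle under a Gaussian heading uncertainty."""
    n = 1                                      # one uncertain input (heading) in this sketch
    spread = math.sqrt(n + kappa) * yaw_std
    sigma_points = [yaw_mean, yaw_mean + spread, yaw_mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    deltas = [pure_pursuit_steering(x, y, s, target, wheelbase) for s in sigma_points]
    mean = sum(w * d for w, d in zip(weights, deltas))
    var = sum(w * (d - mean) ** 2 for w, d in zip(weights, deltas))
    return mean, var


# Hypothetical example: target 5 m straight ahead, heading known to about 0.1 rad.
steer_mean, steer_var = unscented_steering(0.0, 0.0, 0.0, 0.1, (5.0, 0.0), 2.5)
```

The returned variance gives a simple measure of how strongly localization uncertainty affects the steering command at the current look-ahead point.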
CaΣoS: A nonlinear sum-of-squares optimization suite
We present CaΣoS, the first MATLAB software specifically designed for nonlinear sum-of-squares optimization. A symbolic polynomial algebra system allows users to formulate parametrized sum-of-squares optimization problems and facilitates their fast, repeated evaluation. To that end, we make use of CasADi's symbolic framework and realize concepts of monomial sparsity, linear operators (including duals), and functions between polynomials. CaΣoS currently provides interfaces to the conic solvers SeDuMi, Mosek, and SCS as well as methods to solve quasiconvex optimization problems (via bisection) and nonconvex optimization problems (via sequential convexification). Numerical examples for benchmark problems including region-of-attraction and reachable set estimation for nonlinear dynamic systems demonstrate significant improvements in computation time compared to existing toolboxes. CaΣoS is available open-source at https://github.com/ifr-acso/casos.
comment: Submitted to 2025 American Control Conference
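The bisection idea for quasiconvex problems mentioned in the abstract can be sketched generically. This is not the toolbox's MATLAB interface: the function names and the toy problem are assumptions, and the feasibility oracle here is a closed-form check, whereas in a sum-of-squares setting it would be a convex program solved at each level.

```python
def bisect_quasiconvex(is_feasible, lo, hi, tol=1e-6):
    """Approximate the smallest level t in [lo, hi] whose sublevel set is feasible.

    Assumes is_feasible is monotone: feasible at t implies feasible at any larger t.
    """
    if not is_feasible(hi):
        raise ValueError("upper bound is infeasible")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            hi = mid      # optimum is at or below mid
        else:
            lo = mid      # optimum is above mid
    return hi


def sublevel_feasible(t):
    """Toy oracle: is there x in [3, 5] with sqrt(|x - 2|) <= t ?"""
    # The sublevel set is |x - 2| <= t^2, which meets [3, 5] iff 2 + t^2 >= 3.
    return 2.0 + t * t >= 3.0


# Minimize the quasiconvex function sqrt(|x - 2|) over x in [3, 5]; the optimum is 1.
t_star = bisect_quasiconvex(sublevel_feasible, 0.0, 4.0)
```

Each bisection step halves the interval, so the number of oracle calls grows only logarithmically with the requested tolerance.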
Adaptive Knowledge-based Multi-Objective Evolutionary Algorithm for Hybrid Flow Shop Scheduling Problems with Multiple Parallel Batch Processing Stages
Parallel batch processing machines have extensive applications in the semiconductor manufacturing process. However, the problem models in previous studies regard parallel batch processing as a fixed processing stage in the machining process. This study generalizes the problem model, in which users can arbitrarily set certain stages as parallel batch processing stages according to their needs. A Hybrid Flow Shop Scheduling Problem with Parallel Batch Processing Machines (PBHFSP) is solved in this paper. Furthermore, an Adaptive Knowledge-based Multi-Objective Evolutionary Algorithm (AMOEA/D) is designed to simultaneously optimize both makespan and Total Energy Consumption (TEC). Firstly, a hybrid initialization strategy with heuristic rules based on knowledge of PBHFSP is proposed to generate promising solutions. Secondly, the disjunctive graph model has been established based on the knowledge to find the critical-path of PBHFS. Then, a critical-path based neighborhood search is proposed to enhance the exploitation ability of AMOEA/D. Moreover, the search time is adaptively adjusted based on learning experience from Q-learning and Decay Law. Afterward, to enhance the exploration capability of the algorithm, AMOEA/D designs an improved population updating strategy with a weight vector updating strategy. These strategies rematch individuals with weight vectors, thereby maintaining the diversity of the population. Finally, the proposed algorithm is compared with state-of-the-art algorithms. The experimental results show that the AMOEA/D is superior to the comparison algorithms in solving the PBHFSP.
comment: 12 pages
CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models ICRA 2025
Curriculum learning is a training mechanism in reinforcement learning (RL) that facilitates the achievement of complex policies by progressively increasing the task difficulty during training. However, designing effective curricula for a specific task often requires extensive domain knowledge and human intervention, which limits its applicability across various domains. Our core idea is that large language models (LLMs), with their extensive training on diverse language data and ability to encapsulate world knowledge, present significant potential for efficiently breaking down tasks and decomposing skills across various robotics environments. Additionally, the demonstrated success of LLMs in translating natural language into executable code for RL agents strengthens their role in generating task curricula. In this work, we propose CurricuLLM, which leverages the high-level planning and programming capabilities of LLMs for curriculum design, thereby enhancing the efficient learning of complex target tasks. CurricuLLM consists of: (Step 1) Generating a sequence of subtasks that aid target task learning in natural language form, (Step 2) Translating the natural language descriptions of subtasks into executable task code, including the reward code and goal distribution code, and (Step 3) Evaluating trained policies based on trajectory rollouts and subtask descriptions. We evaluate CurricuLLM in various robotics simulation environments, spanning manipulation, navigation, and locomotion, to show that CurricuLLM can aid learning complex robot control tasks. In addition, we validate a humanoid locomotion policy learned through CurricuLLM in the real world. The code is provided at https://github.com/labicon/CurricuLLM
comment: Submitted to ICRA 2025
Diffusion Models for Intelligent Transportation Systems: A Survey
Intelligent Transportation Systems (ITS) are vital in modern traffic management and optimization, significantly enhancing traffic efficiency and safety. Recently, diffusion models have emerged as transformative tools for addressing complex challenges within ITS. In this paper, we present a comprehensive survey of diffusion models for ITS, covering both theoretical and practical aspects. First, we introduce the theoretical foundations of diffusion models and their key variants, including conditional diffusion models and latent diffusion models, highlighting their suitability for modeling complex, multi-modal traffic data and enabling controllable generation. Second, we outline the primary challenges in ITS and the corresponding advantages of diffusion models, providing readers with a deeper understanding of the intersection between ITS and diffusion models. Third, we offer a multi-perspective investigation of current applications of diffusion models in ITS domains, including autonomous driving, traffic simulation, trajectory prediction, and traffic safety. Finally, we discuss state-of-the-art diffusion model techniques and highlight key ITS research directions that warrant further investigation. Through this structured overview, we aim to provide researchers with a comprehensive understanding of diffusion models for ITS, thereby advancing their future applications in the transportation domain.
comment: 7 figures
Enabling On-Chip High-Frequency Adaptive Linear Optimal Control via Linearized Gaussian Process
Unpredictable and complex aerodynamic effects, such as the downwash effect from upper vehicles to lower ones, pose significant challenges to achieving precise flight control. Conventional methods often struggle to accurately model these interactions, leading to controllers that require large safety margins between vehicles. Moreover, controllers on real drones usually must run at high frequency with limited on-chip computation, making adaptive control designs more difficult to implement. To address these challenges, we incorporate a Gaussian process (GP) to model the adaptive external aerodynamics with linear model predictive control. The GP is linearized to enable real-time high-frequency solutions. Moreover, to handle the error caused by linearization, we integrate end-to-end Bayesian optimization during sample collection stages to improve the control performance. Experimental results on both simulations and real quadrotors show that we can achieve real-time solvable computation speed with acceptable tracking errors.
Efficient Navigation of a Robotic Fish Swimming Across the Vortical Flow Field
Navigating efficiently across vortical flow fields presents a significant challenge in various robotic applications. The dynamic and unsteady nature of vortical flows often disturbs the control of underwater robots, complicating their operation in hydrodynamic environments. Conventional control methods, which depend on accurate modeling, fail in these settings due to the complexity of fluid-structure interactions (FSI) caused by unsteady hydrodynamics. This study proposes a deep reinforcement learning (DRL) algorithm, trained in a data-driven manner, to enable efficient navigation of a robotic fish swimming across vortical flows. Our proposed algorithm incorporates the LSTM architecture and uses several recent consecutive observations as the state to address the issue of partial observation, often due to sensor limitations. We present a numerical study of navigation within a Karman vortex street, created by placing a stationary cylinder in a uniform flow, utilizing the immersed boundary-lattice Boltzmann method (IB-LBM). The aim is to train the robotic fish to discover efficient navigation policies, enabling it to reach a designated target point across the Karman vortex street from various initial positions. After training, the fish demonstrates the ability to rapidly reach the target from different initial positions, showcasing the effectiveness and robustness of our proposed algorithm. Analysis of the results reveals that the robotic fish can leverage velocity gains and pressure differences induced by the vortices to reach the target, underscoring the potential of our proposed algorithm in enhancing navigation in complex hydrodynamic environments.
comment: We would like to request the withdrawal of our submission due to some misunderstandings among the co-authors concerning the submission process. It appears that the current version was submitted before we reached a consensus among all authors. We are actively working to address these matters and plan to resubmit a revised version once we achieve agreement
CARTOS: A Charging-Aware Real-Time Operating System for Intermittent Batteryless Devices
This paper presents CARTOS, a charging-aware real-time operating system designed to enhance the functionality of intermittently-powered batteryless devices (IPDs) for various Internet of Things (IoT) applications. While IPDs offer significant advantages such as extended lifespan and operability in extreme environments, they pose unique challenges, including the need to ensure forward progress of program execution amidst variable energy availability and to maintain reliable real-time behavior during power disruptions. To address these challenges, CARTOS introduces a mixed-preemption scheduling model that classifies tasks into computational and peripheral tasks, and ensures their efficient and timely execution by adopting just-in-time checkpointing for divisible computation tasks and uninterrupted execution for indivisible peripheral tasks. CARTOS also supports processing chains of tasks with precedence constraints and adapts its scheduling in response to environmental changes to offer continuous execution under diverse conditions. CARTOS is implemented with new APIs and components added to FreeRTOS but is designed for portability to other embedded RTOSs. Through real hardware experiments and simulations, CARTOS exhibits superior performance over state-of-the-art methods, demonstrating that it can serve as a practical platform for developing resilient, real-time sensing applications on IPDs.
Detecting and Mitigating System-Level Anomalies of Vision-Based Controllers
Autonomous systems, such as self-driving cars and drones, have made significant strides in recent years by leveraging visual inputs and machine learning for decision-making and control. Despite their impressive performance, these vision-based controllers can make erroneous predictions when faced with novel or out-of-distribution inputs. Such errors can cascade to catastrophic system failures and compromise system safety. In this work, we introduce a run-time anomaly monitor to detect and mitigate such closed-loop, system-level failures. Specifically, we leverage a reachability-based framework to stress-test the vision-based controller offline and mine its system-level failures. This data is then used to train a classifier that is leveraged online to flag inputs that might cause system breakdowns. The anomaly detector highlights issues that transcend individual modules and pertain to the safety of the overall system. We also design a fallback controller that robustly handles these detected anomalies to preserve system safety. We validate the proposed approach on an autonomous aircraft taxiing system that uses a vision-based controller for taxiing. Our results show the efficacy of the proposed approach in identifying and handling system-level anomalies, outperforming methods such as prediction error-based detection, and ensembling, thereby enhancing the overall safety and robustness of autonomous systems.
Recent progress in the physical principles of dynamic ground self-righting
Animals and robots must self-right on the ground after overturning. Biology research has described various strategies and motor patterns in many species. Robotics research has devised many strategies. However, we do not well understand the physical principles of how the need to generate mechanical energy to overcome the potential energy barrier governs behavioral strategies and 3-D body rotations given the morphology. Here I review progress on this question from work that I led, studying cockroaches self-righting on level, flat, solid, low-friction ground by integrating biology experiments, robotic modeling, and physics modeling.
comment: 20 pages, 13 figures
Distributed Model Predictive Control for Piecewise Affine Systems Based on Switching ADMM
This paper presents a novel approach for distributed model predictive control (MPC) for piecewise affine (PWA) systems. Existing approaches rely on solving mixed-integer optimization problems, requiring significant computation power or time. We propose a distributed MPC scheme that requires solving only convex optimization problems. The key contribution is a novel method, based on the alternating direction method of multipliers, for solving the non-convex optimal control problem that arises due to the PWA dynamics. We present a distributed MPC scheme, leveraging this method, that explicitly accounts for the coupling between subsystems by reaching agreement on the values of coupled states. Stability and recursive feasibility are shown under additional assumptions on the underlying system. Two numerical examples are provided, in which the proposed controller is shown to significantly improve the CPU time and closed-loop performance over existing state-of-the-art approaches.
comment: 15 pages, 9 figures, submitted to IEEE Transactions on Automatic Control, code available at https://github.com/SamuelMallick/stable-dmpc-pwa/tree/paper_2024 and https://github.com/SamuelMallick/hybrid-vehicle-platoon/tree/paper-2024
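The notion of subsystems "reaching agreement" through the alternating direction method of multipliers can be illustrated with textbook consensus ADMM on a toy problem. This is not the paper's switching ADMM for piecewise affine dynamics: the quadratic local costs, the function name `consensus_admm`, and the parameter values are assumptions chosen so that every update has a closed form.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Consensus ADMM for: minimize sum_i 0.5*(x_i - a_i)^2 subject to x_i = z."""
    n = len(targets)
    x = [0.0] * n      # local copies held by each agent
    u = [0.0] * n      # scaled dual variables
    z = 0.0            # shared (agreed) value
    for _ in range(iterations):
        # Local step: each agent minimizes its own cost plus the augmented penalty.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Agreement step: average of local copies plus scaled duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with the shared value.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return x, z


# Three hypothetical agents with different preferred values; the optimum is their mean.
local_values, agreed_value = consensus_admm([1.0, 2.0, 6.0])
```

Only the local step touches an agent's private cost, which is what makes the scheme distributable; the agreement step needs just the exchanged copies.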
Learning to Boost the Performance of Stable Nonlinear Systems
The growing scale and complexity of safety-critical control systems underscore the need to evolve current control architectures aiming for the unparalleled performances achievable through state-of-the-art optimization and machine learning algorithms. However, maintaining closed-loop stability while boosting the performance of nonlinear control systems using data-driven and deep-learning approaches stands as an important unsolved challenge. In this paper, we tackle the performance-boosting problem with closed-loop stability guarantees. Specifically, we establish a synergy between the Internal Model Control (IMC) principle for nonlinear systems and state-of-the-art unconstrained optimization approaches for learning stable dynamics. Our methods enable learning over arbitrarily deep neural network classes of performance-boosting controllers for stable nonlinear systems; crucially, we guarantee L_p closed-loop stability even if optimization is halted prematurely, and even when the ground-truth dynamics are unknown, with vanishing conservatism in the class of stabilizing policies as the model uncertainty is reduced to zero. We discuss the implementation details of the proposed control schemes, including distributed ones, along with the corresponding optimization procedures, demonstrating the potential of freely shaping the cost functions through several numerical experiments.
TOP-Nav: Legged Navigation Integrating Terrain, Obstacle and Proprioception Estimation
Legged navigation is typically examined within open-world, off-road, and challenging environments. In these scenarios, estimating external disturbances requires a complex synthesis of multi-modal information. This underlines a major limitation in existing works that primarily focus on avoiding obstacles. In this work, we propose TOP-Nav, a novel legged navigation framework that integrates a comprehensive path planner with Terrain awareness, Obstacle avoidance and closed-loop Proprioception. TOP-Nav underscores the synergies between vision and proprioception in both path and motion planning. Within the path planner, we present and integrate a terrain estimator that enables the robot to select waypoints on terrains with higher traversability while effectively avoiding obstacles. At the motion planning level, we not only implement a locomotion controller to track the navigation commands, but also construct a proprioception advisor to provide motion evaluations for the path planner. Based on the closed-loop motion feedback, we make online corrections for the vision-based terrain and obstacle estimations. Consequently, TOP-Nav achieves open-world navigation in which the robot can handle terrains or disturbances beyond the distribution of prior knowledge and overcome constraints imposed by visual conditions. Building upon extensive experiments conducted in both simulation and real-world environments, TOP-Nav demonstrates superior performance in open-world navigation compared to existing methods.
comment: Published at CoRL 2024
SustainDC -- Benchmarking for Sustainable Data Center Control NeurIPS 2024
Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for the effects of each other. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements. Our results highlight significant opportunities for improvement of data center operations using MARL algorithms. Given the increasing use of DC due to AI, SustainDC provides a crucial platform for the development and benchmarking of advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
comment: Under review at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
Constraint-Guided Online Data Selection for Scalable Data-Driven Safety Filters in Uncertain Robotic Systems
As the use of autonomous robots expands in tasks that are complex and challenging to model, the demand for robust data-driven control methods that can certify safety and stability in uncertain conditions is increasing. However, the practical implementation of these methods often faces scalability issues due to the growing amount of data points with system complexity, and a significant reliance on high-quality training data. In response to these challenges, this study presents a scalable data-driven controller that efficiently identifies and infers from the most informative data points for implementing data-driven safety filters. Our approach is grounded in the integration of a model-based certificate function-based method and Gaussian Process (GP) regression, reinforced by a novel online data selection algorithm that reduces time complexity from quadratic to linear relative to dataset size. Empirical evidence, gathered from successful real-world cart-pole swing-up experiments and simulated locomotion of a five-link bipedal robot, demonstrates the efficacy of our approach. Our findings reveal that our efficient online data selection algorithm, which strategically selects key data points, enhances the practicality and efficiency of data-driven certifying filters in complex robotic systems, significantly mitigating scalability concerns inherent in nonparametric learning-based control methods.
comment: The first three authors contributed equally to the work. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
MARec: Metadata Alignment for cold-start Recommendation
For many recommender systems, the primary data source is a historical record of user clicks. The associated click matrix is often very sparse, as the number of users x products can be far larger than the number of clicks. Such sparsity is accentuated in cold-start settings, which makes the efficient use of metadata information of paramount importance. In this work, we propose a simple approach to address cold-start recommendations by leveraging content metadata, Metadata Alignment for cold-start Recommendation. We show that this approach can readily augment existing matrix factorization and autoencoder approaches, enabling a smooth transition to top performing algorithms in warmer set-ups. Our experimental results indicate three separate contributions: first, we show that our proposed framework largely beats SOTA results on 4 cold-start datasets with different sparsity and scale characteristics, with gains ranging from +8.4% to +53.8% on reported ranking metrics; second, we provide an ablation study on the utility of semantic features, and show that the additional gain obtained by leveraging such features ranges between +46.8% and +105.5%; and third, our approach is by construction highly competitive in warm set-ups, and we propose a closed-form solution outperformed by SOTA results by only 0.8% on average.
On Game Based Distributed Decision Approach for Multi-agent Optimal Coverage Problem with Application to Constellations Reconfiguration
This paper focuses on the optimal coverage problem (OCP) for multi-agent systems with decentralized optimization. A game based distributed decision approach for the multi-agent OCP is proposed. The equivalence between the equilibrium of the game and the extreme value of the global performance objective is strictly proved. Then, a distributed algorithm that uses only local information to obtain the global near-optimal coverage is developed, and its convergence is proved. Finally, the proposed method is applied to maximize the covering time of a satellite constellation for a target. The simulation results under different scenarios show that our method costs much less computation time under some level index than traditional centralized optimization.
comment: 11 pages, 11 figures
Personalised Outfit Recommendation via History-aware Transformers
We present the history-aware transformer (HAT), a transformer-based model that uses shoppers' purchase history to personalise outfit predictions. The aim of this work is to recommend outfits that are internally coherent while matching an individual shopper's style and taste. To achieve this, we stack two transformer models, one that produces outfit representations and another one that processes the history of purchased outfits for a given shopper. We use these models to score an outfit's compatibility in the context of a shopper's preferences as inferred from their previous purchases. During training, the model learns to discriminate between purchased and random outfits using 3 losses: the focal loss for outfit compatibility typically used in the literature, a contrastive loss to bring closer learned outfit embeddings from a shopper's history, and an adaptive margin loss to facilitate learning from weak negatives. Together, these losses enable the model to make personalised recommendations based on a shopper's purchase history. Our experiments on the IQON3000 and Polyvore datasets show that HAT outperforms strong baselines on the outfit Compatibility Prediction (CP) and the Fill In The Blank (FITB) tasks. The model improves AUC for the CP hard task by 15.7% (IQON3000) and 19.4% (Polyvore) compared to previous SOTA results. It further improves accuracy on the FITB hard task by 6.5% and 9.7%, respectively. We provide ablation studies on the personalisation, contrastive loss, and adaptive margin loss that highlight the importance of these modelling choices.
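Of the three losses named above, the focal loss has a standard closed form that is easy to sketch for a single binary prediction. The abstract does not give the model's hyperparameters or implementation, so the function name and the `gamma` and `alpha` values below are assumptions (they are commonly used defaults, not values reported for HAT).

```python
import math


def focal_loss(p, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction: -alpha_t * (1 - p_t)^gamma * log(p_t).

    p is the predicted probability of the positive class, label is 0 or 1.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)                      # guard against log(0)
    p_t = p if label == 1 else 1.0 - p                   # probability of the true class
    alpha_t = alpha if label == 1 else 1.0 - alpha       # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction is down-weighted far more than a hard one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With `gamma=0` and `alpha=1` the expression reduces to ordinary cross-entropy for a positive example, which makes the role of the modulating factor easy to check.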
Systems and Control (EESS)
SensoPatch: A Reconfigurable Haptic Feedback with High-Density Tactile Sensing Glove
Haptic feedback is integral to the improved experience of prosthetic users and the reduction in prosthesis rejection. Prior studies have explored various methods to encode tactile information and deliver vibration feedback. However, a comprehensive study comparing performance across different stimulation locations and feedback modalities for wearable devices is absent and there is no test platform. This paper proposes an open-source reconfigurable haptic feedback system which incorporates 25 sensors and wireless communication to allow customized number of vibration motors, adjustable motor placement, and programmable encoding of tactile data to change feedback modalities. To demonstrate potential studies that can be investigated using SensoPatch, we conducted two experiments: 1) to assess the vibration discrimination accuracy on 3 body parts 2) to assess the effect of 6 methods of mapping tactile data to varying number of motors on object manipulation. SensoPatch utilizes low-cost off-the-shelf components, enabling large-scale comparative studies of feedback modalities and stimulation sites to optimize vibrotactile feedback and facilitate its deployment in upper limb prostheses.
comment: 5 pages, 5 figures, 1 table, to be published in 2024 IEEE Biomedical Circuits and Systems Conference (BioCAS)
Towards Energy- and Cost-Efficient 6G Networks
As the world enters the journey toward the 6th generation (6G) of wireless technology, the promises of ultra-high data rates, unprecedented low latency, and a massive surge in connected devices require crucial exploration of network energy saving (NES) solutions to minimize the carbon footprint and overall energy usage of future cellular networks. On the other hand, network-controlled repeaters (NCRs) have been introduced by the 3rd Generation Partnership Project (3GPP) as a cost-effective solution to improve network coverage. However, their impact on network power consumption and energy efficiency has not been thoroughly investigated. This paper studies NES schemes for next-generation 6G networks aided by NCRs and proposes optimal NES strategies aiming at maximizing the overall energy efficiency of the network. Repeaters are shown to allow for power savings at the next-generation nodeB (gNB) and to offer higher overall energy efficiency (EE) and spectral efficiency (SE), thus providing an energy-efficient and cost-efficient alternative to increase the performance of future 6G networks.
comment: 7 pages, conference
Calibrating microscopic traffic models with macroscopic data
Traffic microsimulation is a crucial tool that uses microscopic traffic models, such as car-following and lane-change models, to simulate the trajectories of individual agents. This digital platform allows for the assessment of the impact of emerging technologies on transportation system performance. While these microscopic models are based on mathematical structures, their parameters must be fitted to real-world data through a process called model calibration. Despite extensive studies on calibration, the focus has predominantly been on fitting microscopic data, such as trajectories, rather than evaluating how well the models reproduce macroscopic traffic patterns, such as congestion, bottlenecks, and traffic waves. In this work, we address this gap by calibrating microscopic traffic flow models using macroscopic (aggregated) data, which is more readily accessible. We designed a SUMO-in-the-loop calibration framework with the goal of replicating observed macroscopic traffic features. To assess calibration accuracy, we developed a set of performance measures that evaluate the models' ability to replicate traffic states across the entire spatiotemporal domain and other qualitative characteristics of traffic flow. The calibration method was applied to both a synthetic scenario and a real-world scenario on a segment of Interstate 24, to demonstrate its effectiveness in reproducing observed traffic patterns.
Improved formulation for long-duration storage in capacity expansion models using representative periods
With the increasing complexity and size of capacity expansion models, temporal aggregation has emerged as a common method to improve computational tractability. However, this approach inherently complicates the inclusion of long-duration storage (LDS) systems, whose operation involves the entire time horizon connecting all time steps. This work presents a detailed investigation of LDS modelling with temporal aggregation. A novel compact formulation is proposed to reduce the number of constraints while effectively tracking the storage content and enforcing limits on the state of charge throughout the entire time horizon. The developed method is compared with two leading state-of-the-art formulations. All three methods are implemented in the Dolphyn capacity expansion model and tested on a case study for the continental United States, considering different configurations in terms of spatial resolutions and representative periods. The performance is assessed with both the commercial solver Gurobi and the open-source solver HiGHS. Results show that the developed compact formulation consistently outperforms the other methods in terms of both runtime (30%-70% faster than other methods) and memory usage (1%-9% lower than other methods).
Joint Optimization of Pattern, Headway, and Fleet Size of Multiple Urban Transit Lines with Perceived Headway Consideration and Passenger Flow Allocation
This study addresses the urban transit pattern design problem, optimizing stop sequences, headways, and fleet sizes across multiple routes simultaneously to minimize user costs (composed of riding, waiting, and transfer times) under operational constraints (e.g., vehicle capacity and fleet size). A destination-labeled multi-commodity network flow (MCNF) formulation is developed to solve the problem at a large scale more efficiently compared to the previous literature. The model allows for flexible pattern options without relying on pre-defined candidate sets and simultaneously considers multiple operational strategies such as express/local services, short-turning, and deadheading. It evaluates perceived headways of joint patterns for passengers, assigns passenger flows to each pattern accordingly, and allows transfers across patterns in different directions. The mixed-integer linear programming (MILP) model is demonstrated with a city-sized network of metro lines in Chicago, USA, achieving near-optimal solutions in hours. The total weighted journey times are reduced by 0.61% and 4.13% under single-route and multi-route scenarios respectively. The model provides transit agencies with an efficient tool for comprehensive service design and resource allocation, improving service quality and resource utilization without additional operational costs.
comment: 23 pages, 3 figures, a previous version accepted for presentation in the 104th Transportation Research Board Annual Meeting in Washington, D.C. in January 2025
Robust Proximity Operations using Probabilistic Markov Models ICRA 2025
A Markov decision process-based state-switching framework is devised, implemented, and analyzed for proximity operations of various autonomous vehicles. The framework contains a pose estimator along with a multi-state guidance algorithm. The unified pose estimator leverages the extended Kalman filter for the fusion of measurements from rate gyroscopes, monocular vision, and ultra-wideband radar sensors. It is also equipped with Mahalanobis distance-based outlier rejection and under-weighting of measurements for robust performance. The use of probabilistic Markov models to transition between various guidance modes is proposed to enable robust and efficient proximity operations. Finally, the framework is validated through an experimental analysis of the docking of two small satellites and the precision landing of an aerial vehicle.
comment: This work has been submitted to the IEEE ICRA 2025 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. Accompanying video : https://youtu.be/8-fetyf_SrM. arXiv admin note: text overlap with arXiv:2409.09665
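The Mahalanobis distance-based outlier rejection mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D innovation, the function names, and the 99% chi-square gate are assumptions chosen for illustration only.

```python
import math


def mahalanobis_sq_2d(innovation, S):
    """Squared Mahalanobis distance d^2 = v^T S^{-1} v for a 2-D innovation v.

    `S` is the 2x2 innovation covariance given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    vx, vy = innovation
    # Apply the closed-form inverse of the 2x2 matrix S to v.
    ix = (d * vx - b * vy) / det
    iy = (-c * vx + a * vy) / det
    return vx * ix + vy * iy


# For 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
# so the 99% quantile is -2 * ln(0.01), roughly 9.21 (assumed gate level).
GATE_99_2DOF = -2.0 * math.log(0.01)


def accept_measurement(innovation, S, gate=GATE_99_2DOF):
    """Hypothetical gate: keep a measurement only if d^2 is within the threshold."""
    return mahalanobis_sq_2d(innovation, S) <= gate
```

In a Kalman filter, a check of this kind would typically run before the measurement update, so that a rejected measurement is simply skipped for that step.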
Robust Deep Reinforcement Learning for Volt-VAR Optimization in Active Distribution System under Uncertainty
The deep reinforcement learning (DRL) based Volt-VAR optimization (VVO) methods have been widely studied for active distribution networks (ADNs). However, most of them lack safety guarantees in terms of power injection uncertainties due to the increase in distributed energy resources (DERs) and load demand, such as electric vehicles. This article proposes a robust deep reinforcement learning (RDRL) framework for VVO via a robust deep deterministic policy gradient (DDPG) algorithm. This algorithm can effectively manage hybrid action spaces, considering control devices like capacitors, voltage regulators, and smart inverters. Additionally, it is designed to handle uncertainties by quantifying uncertainty sets with conformal prediction and modeling uncertainties as adversarial attacks to guarantee safe exploration across action spaces. Numerical results on three IEEE test cases demonstrate the sample efficiency and safety of the proposed robust DDPG against uncertainties compared to the benchmark algorithms.
Robust and efficient data-driven predictive control
We propose a robust and efficient data-driven predictive control (eDDPC) scheme which is more sample efficient (requires less offline data) compared to existing schemes, and is also computationally efficient. This is done by leveraging an alternative data-based representation of the trajectories of linear time-invariant (LTI) systems. The proposed scheme relies only on using (short and potentially irregularly measured) noisy input-output data, the amount of which is independent of the prediction horizon. To account for measurement noise, we provide a novel result that quantifies the uncertainty between the true (unknown) restricted behavior of the system and the estimated one from noisy data. Furthermore, we show that the robust eDDPC scheme is recursively feasible and that the resulting closed-loop system is practically stable. Finally, we compare the performance of this scheme to existing ones on a case study of a four tank system.
comment: 17 pages, 2 figures, submitted for Automatica
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions ICRA 2025
We address the challenge of safe control in decentralized multi-agent robotic settings, where agents use uncertain black-box models to predict other agents' trajectories. We use the recently proposed conformal decision theory to adapt the restrictiveness of control barrier functions-based safety constraints based on observed prediction errors. We use these constraints to synthesize controllers that balance between the objectives of safety and task accomplishment, despite the prediction errors. We provide an upper bound on the average over time of the value of a monotonic function of the difference between the safety constraint based on the predicted trajectories and the constraint based on the ground truth ones. We validate our theory through experimental results showing the performance of our controllers when navigating a robot in the multi-agent scenes in the Stanford Drone Dataset.
comment: 6 pages, 1 figure, submitted for ICRA 2025
Path Following Model Predictive Control of a Coupled Autonomous Underwater Vehicle
The operation of an autonomous underwater vehicle (AUV) faces challenges in following predetermined waypoints due to coupled motions under environmental disturbances. To address this, a 3D path following guidance and control system is developed in this work based on the line-of-sight (LOS) guidance method. Conventionally, the 3D path following problem is transformed into heading and depth control problems, assuming that the motion of the vehicle is decoupled in horizontal and depth coordinates. The proposed control system design avoids this simplifying assumption by transforming the problem into a 3D position and orientation tracking problem. This design is achieved by computing a 2D horizontal coordinate based on the desired heading and then computing a corresponding LOS depth coordinate. A model predictive controller (MPC) is then implemented using the 3D LOS coordinate and the computed orientation vector. The MPC obtains a robust control by solving a minimax optimisation problem considering the effects of unknown ocean disturbances. The effectiveness of the proposed guidance and control system is demonstrated through the simulation of a prototype AUV system. Numerical results show that the AUV can follow predetermined waypoints in the presence of time-varying disturbances, and the system is steered at a constant surge speed that is proportional to the radius of the circle of acceptance used to implement the guidance system.
comment: 6 pages, 4 figures, Presented at the IFAC CAMS 2024, Virginia, USA
Hierarchical Federated ADMM
In this paper, we depart from the widely-used gradient descent-based hierarchical federated learning (FL) algorithms to develop a novel hierarchical FL framework based on the alternating direction method of multipliers (ADMM). Within this framework, we propose two novel FL algorithms, which both use ADMM in the top layer: one that employs ADMM in the lower layer and another that uses the conventional gradient descent-based approach. The proposed framework enhances privacy, and experiments demonstrate the superiority of the proposed algorithms compared to the conventional algorithms in terms of learning convergence and accuracy. Additionally, gradient descent on the lower layer performs well even if the number of local steps is very limited, while ADMM on both layers leads to better performance otherwise.
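As background for the ADMM building block named above, here is a minimal single-layer consensus ADMM sketch. It is not the hierarchical algorithm from the paper: the scalar quadratic local losses f_i(x) = 0.5 (x - a_i)^2, the function name, and the penalty parameter `rho` are assumptions made so that every update has a closed form.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Scaled-form consensus ADMM for local losses f_i(x) = 0.5 * (x - a_i)^2.

    Each "client" i holds a private target a_i; the shared variable z
    converges to the minimizer of the summed losses (the mean of the a_i).
    """
    n = len(targets)
    x = [0.0] * n  # local primal variables
    u = [0.0] * n  # scaled dual variables
    z = 0.0        # global consensus variable
    for _ in range(iterations):
        # Local step: argmin_x 0.5 * (x - a)^2 + (rho / 2) * (x - z + u)^2
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Aggregation step: average of the local variables plus duals
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each client's disagreement with z
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z


if __name__ == "__main__":
    print(consensus_admm([1.0, 2.0, 6.0]))
```

A hierarchical variant would nest a second aggregation level between clients and the top-level server; that structure is deliberately left out of this sketch.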
Asymptotic tracking control of dynamic reference over homomorphically encrypted data with finite modulus
This paper considers a tracking control problem, in which the dynamic controller is encrypted with an additively homomorphic encryption scheme and the output of a process tracks a dynamic reference asymptotically. Our paper is motivated by the following problem: When dealing with both asymptotic tracking and dynamic reference, we find that the control input is generally subject to overflow issues under a finite modulus, though the dynamic controller consists of only integer coefficients. First, we provide a new controller design method such that the coefficients of the tracking controller can be transformed into integers leveraging the zooming-in factor of dynamic quantization. By the Cayley-Hamilton theorem, we represent the control input as a linear combination of the previous control inputs. Leveraging the property above, we design an algorithm on the actuator side such that it can restore the control input from the lower bits under a finite modulus. A lower bound of the modulus is also provided. As an extension of the first result, we further solve the problem of unbounded internal state taking place in the actuator. In particular, the actuator can restore the correct control input under the same modulus. A simulation example is provided to verify the control schemes proposed in our paper.
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Learning from Demonstration (LfD) is a useful paradigm for training policies that solve tasks involving complex motions. In practice, the successful application of LfD requires overcoming error accumulation during policy execution, i.e. the problem of drift due to errors compounding over time and the consequent out-of-distribution behaviours. Existing works seek to address this problem through scaling data collection, correcting policy errors with a human-in-the-loop, temporally ensembling policy predictions or through learning the parameters of a dynamical system model. In this work, we propose and validate an alternative approach to overcoming this issue. Inspired by reservoir computing, we develop a novel neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties. We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting Dataset. Through empirical experiments we demonstrate that incorporating our layer into existing neural network architectures addresses the issue of compounding errors in LfD. Furthermore, we perform a comparative evaluation against existing approaches including a temporal ensemble of policy predictions and an Echo State Networks (ESNs) implementation. We find that our approach yields greater policy precision and robustness on the handwriting task while also generalising to multiple dynamics regimes and maintaining competitive latency scores.
comment: 21 pages, 9 figures
Dual Pricing to Prioritize Renewable Energy and Consumer Preferences in Electricity Markets
Electricity markets currently fail to incorporate preferences of buyers, treating polluting and renewable energy sources as having equal social benefit under a system of uniform clearing prices. Meanwhile, renewable energy is prone to curtailment due to transmission constraints, forcing grid operators to reduce or shut down renewable energy production despite its availability and need. This paper proposes a "dual pricing mechanism" which allows buyers to bid both their willingness to pay for electricity, and additionally, their preference for green energy. Designed for use in deregulated electricity markets, this mechanism prioritizes the dispatch of more renewable energy sources according to consumer preferences. Traditional uniform clearing prices, which treat all energy sources equally, do not reflect the growing share of green energy in the power grid and the environmental values of consumers. By allowing load-serving entities to bid their willingness to pay for renewable energy directly into the clearing market, our proposed framework generates distinct pricing signals for green and "black" electricity.
Transparency evaluation for the Kinematic Design of the Harnesses through Human-Exoskeleton Interaction Modeling
Lower Limb Exoskeletons (LLEs) are wearable robots that provide mechanical power to the user. Human-exoskeleton (HE) connections must preserve the user's natural behavior during the interaction, avoiding undesired forces. Therefore, numerous works focus on their minimization. Given the inherent complications of repeatedly prototyping and experimentally testing a device, modeling the exoskeleton and its physical interaction with the user emerges as a valuable approach for assessing the design effects. This paper proposes a novel method to compare different exoskeleton configurations with a flexible simulation tool. This approach contemplates simulating the dynamics of the device, including its interaction with the wearer, to evaluate multiple connection mechanism designs along with the kinematics and actuation of the LLE. This evaluation is based on the minimization of the interaction wrenches through an optimization process that includes the impedance parameters at the interfaces as optimization variables and the similarity of the LLE's joint variables trajectories with the motion of the wearer's articulations. Exploratory tests are conducted using the Wearable Walker LLE in different configurations and measuring the interaction forces. Experimental data are then compared to the optimization outcomes, proving that the proposed method provides contact wrench estimations consistent with the collected measurements and previous outcomes from the literature. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
A History-Guided Regional Partitioning Evolutionary Optimization for Solving the Flexible Job Shop Problem with Limited Multi-load Automated Guided Vehicles
In a flexible job shop environment, using Automated Guided Vehicles (AGVs) to transport jobs and process materials is an important way to promote the intelligence of the workshop. Compared with single-load AGVs, multi-load AGVs can improve AGV utilization, reduce path conflicts, etc. Therefore, this study proposes a history-guided regional partitioning algorithm (HRPEO) for the flexible job shop scheduling problem with limited multi-load AGVs (FJSPMA). First, the encoding and decoding rules are designed according to the characteristics of multi-load AGVs, and then the initialization rule based on the branch and bound method is used to generate the initial population. Second, to prevent the algorithm from falling into a local optimum, the algorithm adopts a regional partitioning strategy. This strategy divides the solution space into multiple regions and measures the potential of the regions. After that, the algorithm groups the regions into multiple clusters in each iteration and selects individuals for evolutionary search based on the set of clusters. Third, a local search strategy is designed to improve the exploitation ability of the algorithm, which uses a greedy approach to optimize machine selection and transportation sequence according to the characteristics of FJSPMA. Finally, a large number of experiments are carried out on the benchmarks to test the performance of the algorithm. Compared with multiple advanced algorithms, the results show that HRPEO has an advantage in solving FJSPMA.
comment: 14 pages
On Adaptive Frequency Sampling for Data-driven MOR Applied to Antenna Responses
Frequency domain sweeps of array antennas are well-known to be time-intensive, and different surrogate models have been used to improve the performance. Data-driven model order reduction algorithms, such as the Loewner framework and vector fitting, can be integrated with adaptive error estimates in an iterative algorithm to reduce the number of full-wave simulations required to accurately capture the requested frequency behavior of multiport array antennas. In this work, we propose two novel adaptive methods exploiting a block matrix function which is a key part of the Loewner framework generating system approach. The first algorithm leverages an inherent matrix parameter freedom in the block matrix function to identify frequency points with large errors, whereas the second utilizes the condition number of the block matrix function. Both methods effectively provide frequency domain error estimates, essential for improved performance. Numerical experiments on multiport array antenna S-parameters demonstrate the effectiveness of our proposed algorithms within the Loewner framework.
comment: 10 pages, 12 figures
Pseudometrics for scalable data-driven comparisons of nonlinear dynamical systems
Novel solutions for pseudometrics quantifying deviation from topological conjugacy between dynamical systems are presented. Deviation from conjugacy is quantified in a Pareto optimal sense that accounts for spectral properties of Koopman operators as well as trajectory geometry. Theoretical justification is provided for computing such pseudometrics in Koopman eigenfunction space rather than observable space. Furthermore, it is shown that deriving the pseudometrics from unitary transformations is sufficient to recover a value of zero if two systems are topologically conjugate. Therefore, the pseudometrics for quantifying deviation from conjugacy are based on analytical solutions for unitary transformations in Koopman eigenfunction space. Finally, geometric considerations for the Pareto optimality problem associated with deviation from conjugacy are used to develop pseudometrics that account for all possible solutions given just two Pareto points based on analytical solutions.
Impact of number of elements on the directivity of planar array of monopole antenna
This research investigates how the number of elements affects the directivity of a planar array of monopole antennas. This study also takes into account the antenna's effect on the whole field it radiates. The monopole antennas are arranged in a planar configuration with all the components in their proper locations using the Hadamard matrix approach. Each matrix's directivities and array factors were calculated, and a MATLAB tool was used to simulate the radiation pattern. Planar layouts ranging from 4 × 4 to 50 × 50 elements were considered. Both the computed and simulated results show that increasing the number of elements in the planar array greatly improves directivity. Consequently, by increasing the antenna's directivity, a greater number of elements influences the overall field emitted.
comment: 8 pages, 19 Figures, article
Pseudo-kinematic trajectory control of tracked vehicles
Tracked vehicles are used in complex scenarios, where motion planning and navigation can be very complex. They have complex dynamics, with many parameters that are difficult to identify and that change significantly based on the operating conditions. We propose a simple pseudo-kinematic model, where the intricate dynamic effects underlying the vehicle's motion are captured in a small set of velocity-dependent parameters. This choice enables the development of a Lyapunov-based trajectory controller with guaranteed performance and small computation time. We demonstrate the correctness of our approach with both simulation and experimental data.
Towards Event-Triggered NMPC for Efficient 6G Communications: Experimental Results and Open Problems
Networked control systems enable real-time control and coordination of distributed systems, leveraging the low latency, high reliability, and massive connectivity offered by 5G and future 6G networks. Applications include autonomous vehicles, robotics, industrial automation, and smart grids. Despite networked control algorithms admitting nominal stability guarantees even in the presence of delays and packet dropouts, their practical performance still heavily depends on the specific characteristics and conditions of the underlying network. To achieve the desired performance while efficiently using communication resources, co-design of control and communication is pivotal. Although periodic schemes, where communication instances are fixed, can provide reliable control performance, unnecessary transmissions, when updates are not needed, result in inefficient usage of network resources. In this paper, we investigate the potential for co-design of model predictive control and network communication. To this end, we design and implement an event-triggered nonlinear model predictive controller for stabilizing a Furuta pendulum communicating over a tailored open radio access network 6G research platform. We analyze the control performance as well as network utilization under varying channel conditions and event-triggering criteria. Our results show that the event-triggered control scheme achieves similar performance to periodic control with reduced communication demand.
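The contrast between periodic and event-triggered communication described above can be sketched in a few lines. The abstract does not state which event-triggering criteria were used, so the rule below (transmit when the state has moved more than a fixed Euclidean distance from the last transmitted state) and all names are hypothetical and purely illustrative.

```python
import math


def should_transmit(state, last_sent, threshold):
    """Assumed trigger rule: send when the state drifted more than `threshold`."""
    return math.dist(state, last_sent) > threshold


def count_transmissions(trajectory, threshold):
    """Count the packets an event-triggered sender would emit over a trajectory.

    The first sample is always transmitted; a periodic scheme would instead
    send len(trajectory) packets.
    """
    if not trajectory:
        return 0
    last_sent = trajectory[0]
    sent = 1
    for state in trajectory[1:]:
        if should_transmit(state, last_sent, threshold):
            last_sent = state
            sent += 1
    return sent
```

Raising the threshold trades control accuracy for fewer transmissions, which is the kind of trade-off the experiments in the abstract evaluate under varying channel conditions.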
Analysis of Truncated Singular Value Decomposition for Koopman Operator-Based Lane Change Model
Understanding and modeling complex dynamic systems is crucial for enhancing vehicle performance and safety, especially in the context of autonomous driving. Recently, popular methods such as Koopman operators and their approximators, known as Extended Dynamic Mode Decomposition (EDMD), have emerged for their effectiveness in transforming strongly nonlinear system behavior into linear representations. This allows them to be integrated with conventional linear controllers. To achieve this, Singular Value Decomposition (SVD), specifically truncated SVD, is employed to approximate Koopman operators from extensive datasets efficiently. This study evaluates different basis functions used in EDMD and ranks for truncated SVD for representing lane change behavior models, aiming to balance computational efficiency with information loss. The findings, however, suggest that the technique of truncated SVD does not necessarily achieve substantial reductions in computational training time and results in significant information loss.
comment: Submitted to the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024)
Unscented Transform-based Pure Pursuit Path-Tracking Algorithm under Uncertainty
Automated driving has become more and more popular due to its potential to eliminate road accidents by taking over driving tasks from humans. One of the remaining challenges is to follow a planned path autonomously, especially when uncertainties in self-localizing or understanding the surroundings can influence the decisions made by autonomous vehicles, such as calculating how much they need to steer to minimize tracking errors. In this paper, a modified geometric pure pursuit path-tracking algorithm is proposed, taking into consideration such uncertainties using the unscented transform. The algorithm is tested through simulations for typical road geometries, such as straight and circular lines.
comment: Submitted to the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024)
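To make the two ingredients named in the abstract above concrete, the following sketch combines the standard geometric pure pursuit steering law with a basic one-dimensional unscented transform. It is not the modified algorithm proposed in the paper: the choice of heading as the only uncertain quantity, the `kappa` weighting, and all function names are assumptions for illustration.

```python
import math


def pure_pursuit_steering(x, y, yaw, target, wheelbase):
    """Classic pure pursuit: delta = atan(2 * L * sin(alpha) / l_d)."""
    tx, ty = target
    dx, dy = tx - x, ty - y
    lookahead = math.hypot(dx, dy)        # distance l_d to the look-ahead point
    alpha = math.atan2(dy, dx) - yaw      # bearing of the target relative to heading
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


def unscented_transform_1d(f, mean, var, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using 3 sigma points."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    ys = [f(p) for p in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


if __name__ == "__main__":
    # Assumed scenario: heading estimate 0 rad with variance 0.01 rad^2.
    steer = lambda yaw: pure_pursuit_steering(0.0, 0.0, yaw, (5.0, 1.0), 2.7)
    print(unscented_transform_1d(steer, 0.0, 0.01))
```

The returned variance gives a rough measure of how localization uncertainty spreads into the steering command; a controller could, for example, use it to temper aggressive steering when the heading estimate is poor.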
CaΣoS: A nonlinear sum-of-squares optimization suite
We present CaΣoS, the first MATLAB software specifically designed for nonlinear sum-of-squares optimization. A symbolic polynomial algebra system allows users to formulate parametrized sum-of-squares optimization problems and facilitates their fast, repeated evaluations. To that end, we make use of CasADi's symbolic framework and realize concepts of monomial sparsity, linear operators (including duals), and functions between polynomials. CaΣoS currently provides interfaces to the conic solvers SeDuMi, Mosek, and SCS as well as methods to solve quasiconvex optimization problems (via bisection) and nonconvex optimization problems (via sequential convexification). Numerical examples for benchmark problems including region-of-attraction and reachable set estimation for nonlinear dynamic systems demonstrate significant improvements in computation time compared to existing toolboxes. CaΣoS is available open-source at https://github.com/ifr-acso/casos.
comment: Submitted to 2025 American Control Conference
Adaptive Knowledge-based Multi-Objective Evolutionary Algorithm for Hybrid Flow Shop Scheduling Problems with Multiple Parallel Batch Processing Stages
Parallel batch processing machines have extensive applications in the semiconductor manufacturing process. However, the problem models in previous studies regard parallel batch processing as a fixed processing stage in the machining process. This study generalizes the problem model, in which users can arbitrarily set certain stages as parallel batch processing stages according to their needs. A Hybrid Flow Shop Scheduling Problem with Parallel Batch Processing Machines (PBHFSP) is solved in this paper. Furthermore, an Adaptive Knowledge-based Multi-Objective Evolutionary Algorithm (AMOEA/D) is designed to simultaneously optimize both makespan and Total Energy Consumption (TEC). Firstly, a hybrid initialization strategy with heuristic rules based on knowledge of PBHFSP is proposed to generate promising solutions. Secondly, the disjunctive graph model has been established based on the knowledge to find the critical-path of PBHFS. Then, a critical-path based neighborhood search is proposed to enhance the exploitation ability of AMOEA/D. Moreover, the search time is adaptively adjusted based on learning experience from Q-learning and Decay Law. Afterward, to enhance the exploration capability of the algorithm, AMOEA/D designs an improved population updating strategy with a weight vector updating strategy. These strategies rematch individuals with weight vectors, thereby maintaining the diversity of the population. Finally, the proposed algorithm is compared with state-of-the-art algorithms. The experimental results show that the AMOEA/D is superior to the comparison algorithms in solving the PBHFSP.
comment: 12 pages
CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models ICRA 2025
Curriculum learning is a training mechanism in reinforcement learning (RL) that facilitates the achievement of complex policies by progressively increasing the task difficulty during training. However, designing effective curricula for a specific task often requires extensive domain knowledge and human intervention, which limits its applicability across various domains. Our core idea is that large language models (LLMs), with their extensive training on diverse language data and ability to encapsulate world knowledge, present significant potential for efficiently breaking down tasks and decomposing skills across various robotics environments. Additionally, the demonstrated success of LLMs in translating natural language into executable code for RL agents strengthens their role in generating task curricula. In this work, we propose CurricuLLM, which leverages the high-level planning and programming capabilities of LLMs for curriculum design, thereby enhancing the efficient learning of complex target tasks. CurricuLLM consists of: (Step 1) Generating a sequence of subtasks that aid target task learning in natural language form, (Step 2) Translating the natural language descriptions of subtasks into executable task code, including the reward code and goal distribution code, and (Step 3) Evaluating trained policies based on trajectory rollout and subtask description. We evaluate CurricuLLM in various robotics simulation environments, spanning manipulation, navigation, and locomotion, to show that CurricuLLM can aid learning complex robot control tasks. In addition, we validate a humanoid locomotion policy learned through CurricuLLM in the real world. The code is provided at https://github.com/labicon/CurricuLLM
comment: Submitted to ICRA 2025
Diffusion Models for Intelligent Transportation Systems: A Survey
Intelligent Transportation Systems (ITS) are vital in modern traffic management and optimization, significantly enhancing traffic efficiency and safety. Recently, diffusion models have emerged as transformative tools for addressing complex challenges within ITS. In this paper, we present a comprehensive survey of diffusion models for ITS, covering both theoretical and practical aspects. First, we introduce the theoretical foundations of diffusion models and their key variants, including conditional diffusion models and latent diffusion models, highlighting their suitability for modeling complex, multi-modal traffic data and enabling controllable generation. Second, we outline the primary challenges in ITS and the corresponding advantages of diffusion models, providing readers with a deeper understanding of the intersection between ITS and diffusion models. Third, we offer a multi-perspective investigation of current applications of diffusion models in ITS domains, including autonomous driving, traffic simulation, trajectory prediction, and traffic safety. Finally, we discuss state-of-the-art diffusion model techniques and highlight key ITS research directions that warrant further investigation. Through this structured overview, we aim to provide researchers with a comprehensive understanding of diffusion models for ITS, thereby advancing their future applications in the transportation domain.
comment: 7 figures
Enabling On-Chip High-Frequency Adaptive Linear Optimal Control via Linearized Gaussian Process
Unpredictable and complex aerodynamic effects pose significant challenges to achieving precise flight control, such as the downwash effect from upper vehicles to lower ones. Conventional methods often struggle to accurately model these interactions, leading to controllers that require large safety margins between vehicles. Moreover, the controller on real drones usually must run at high frequency with limited on-chip computation, making the adaptive control design more difficult to implement. To address these challenges, we incorporate a Gaussian process (GP) to model the adaptive external aerodynamics with linear model predictive control. The GP is linearized to enable real-time high-frequency solutions. Moreover, to handle the error caused by linearization, we integrate end-to-end Bayesian optimization during sample collection stages to improve the control performance. Experimental results on both simulations and real quadrotors show that we can achieve real-time solvable computation speed with acceptable tracking errors.
Efficient Navigation of a Robotic Fish Swimming Across the Vortical Flow Field
Navigating efficiently across vortical flow fields presents a significant challenge in various robotic applications. The dynamic and unsteady nature of vortical flows often disturbs the control of underwater robots, complicating their operation in hydrodynamic environments. Conventional control methods, which depend on accurate modeling, fail in these settings due to the complexity of fluid-structure interactions (FSI) caused by unsteady hydrodynamics. This study proposes a deep reinforcement learning (DRL) algorithm, trained in a data-driven manner, to enable efficient navigation of a robotic fish swimming across vortical flows. Our proposed algorithm incorporates the LSTM architecture and uses several recent consecutive observations as the state to address the issue of partial observation, often due to sensor limitations. We present a numerical study of navigation within a Karman vortex street, created by placing a stationary cylinder in a uniform flow, utilizing the immersed boundary-lattice Boltzmann method (IB-LBM). The aim is to train the robotic fish to discover efficient navigation policies, enabling it to reach a designated target point across the Karman vortex street from various initial positions. After training, the fish demonstrates the ability to rapidly reach the target from different initial positions, showcasing the effectiveness and robustness of our proposed algorithm. Analysis of the results reveals that the robotic fish can leverage velocity gains and pressure differences induced by the vortices to reach the target, underscoring the potential of our proposed algorithm in enhancing navigation in complex hydrodynamic environments.
comment: We would like to request the withdrawal of our submission due to some misunderstandings among the co-authors concerning the submission process. It appears that the current version was submitted before we reached a consensus among all authors. We are actively working to address these matters and plan to resubmit a revised version once we achieve agreement
CARTOS: A Charging-Aware Real-Time Operating System for Intermittent Batteryless Devices
This paper presents CARTOS, a charging-aware real-time operating system designed to enhance the functionality of intermittently-powered batteryless devices (IPDs) for various Internet of Things (IoT) applications. While IPDs offer significant advantages such as extended lifespan and operability in extreme environments, they pose unique challenges, including the need to ensure forward progress of program execution amidst variable energy availability and to maintain reliable real-time behavior during power disruptions. To address these challenges, CARTOS introduces a mixed-preemption scheduling model that classifies tasks into computational and peripheral tasks, and ensures their efficient and timely execution by adopting just-in-time checkpointing for divisible computation tasks and uninterrupted execution for indivisible peripheral tasks. CARTOS also supports processing chains of tasks with precedence constraints and adapts its scheduling in response to environmental changes to offer continuous execution under diverse conditions. CARTOS is implemented with new APIs and components added to FreeRTOS but is designed for portability to other embedded RTOSs. Through real hardware experiments and simulations, CARTOS exhibits superior performance over state-of-the-art methods, demonstrating that it can serve as a practical platform for developing resilient, real-time sensing applications on IPDs.
Detecting and Mitigating System-Level Anomalies of Vision-Based Controllers
Autonomous systems, such as self-driving cars and drones, have made significant strides in recent years by leveraging visual inputs and machine learning for decision-making and control. Despite their impressive performance, these vision-based controllers can make erroneous predictions when faced with novel or out-of-distribution inputs. Such errors can cascade to catastrophic system failures and compromise system safety. In this work, we introduce a run-time anomaly monitor to detect and mitigate such closed-loop, system-level failures. Specifically, we leverage a reachability-based framework to stress-test the vision-based controller offline and mine its system-level failures. This data is then used to train a classifier that is leveraged online to flag inputs that might cause system breakdowns. The anomaly detector highlights issues that transcend individual modules and pertain to the safety of the overall system. We also design a fallback controller that robustly handles these detected anomalies to preserve system safety. We validate the proposed approach on an autonomous aircraft taxiing system that uses a vision-based controller for taxiing. Our results show the efficacy of the proposed approach in identifying and handling system-level anomalies, outperforming methods such as prediction error-based detection, and ensembling, thereby enhancing the overall safety and robustness of autonomous systems.
Recent progress in the physical principles of dynamic ground self-righting
Animals and robots must self-right on the ground after overturning. Biology research has described various strategies and motor patterns in many species. Robotics research has devised many strategies. However, we do not understand well the physical principles of how the need to generate mechanical energy to overcome the potential energy barrier governs behavioral strategies and 3-D body rotations given the morphology. Here I review progress on this question from work that I led, studying cockroaches self-righting on level, flat, solid, low-friction ground by integrating biology experiments, robotic modeling, and physics modeling.
comment: 20 pages, 13 figures
Distributed Model Predictive Control for Piecewise Affine Systems Based on Switching ADMM
This paper presents a novel approach for distributed model predictive control (MPC) for piecewise affine (PWA) systems. Existing approaches rely on solving mixed-integer optimization problems, requiring significant computation power or time. We propose a distributed MPC scheme that requires solving only convex optimization problems. The key contribution is a novel method, based on the alternating direction method of multipliers, for solving the non-convex optimal control problem that arises due to the PWA dynamics. We present a distributed MPC scheme, leveraging this method, that explicitly accounts for the coupling between subsystems by reaching agreement on the values of coupled states. Stability and recursive feasibility are shown under additional assumptions on the underlying system. Two numerical examples are provided, in which the proposed controller is shown to significantly improve the CPU time and closed-loop performance over existing state-of-the-art approaches.
comment: 15 pages, 9 figures, submitted to IEEE Transactions on Automatic Control, code available at https://github.com/SamuelMallick/stable-dmpc-pwa/tree/paper_2024 and https://github.com/SamuelMallick/hybrid-vehicle-platoon/tree/paper-2024
Learning to Boost the Performance of Stable Nonlinear Systems
The growing scale and complexity of safety-critical control systems underscore the need to evolve current control architectures aiming for the unparalleled performances achievable through state-of-the-art optimization and machine learning algorithms. However, maintaining closed-loop stability while boosting the performance of nonlinear control systems using data-driven and deep-learning approaches stands as an important unsolved challenge. In this paper, we tackle the performance-boosting problem with closed-loop stability guarantees. Specifically, we establish a synergy between the Internal Model Control (IMC) principle for nonlinear systems and state-of-the-art unconstrained optimization approaches for learning stable dynamics. Our methods enable learning over arbitrarily deep neural network classes of performance-boosting controllers for stable nonlinear systems; crucially, we guarantee L_p closed-loop stability even if optimization is halted prematurely, and even when the ground-truth dynamics are unknown, with vanishing conservatism in the class of stabilizing policies as the model uncertainty is reduced to zero. We discuss the implementation details of the proposed control schemes, including distributed ones, along with the corresponding optimization procedures, demonstrating the potential of freely shaping the cost functions through several numerical experiments.
TOP-Nav: Legged Navigation Integrating Terrain, Obstacle and Proprioception Estimation
Legged navigation is typically examined within open-world, off-road, and challenging environments. In these scenarios, estimating external disturbances requires a complex synthesis of multi-modal information. This underlines a major limitation in existing works that primarily focus on avoiding obstacles. In this work, we propose TOP-Nav, a novel legged navigation framework that integrates a comprehensive path planner with Terrain awareness, Obstacle avoidance and closed-loop Proprioception. TOP-Nav underscores the synergies between vision and proprioception in both path and motion planning. Within the path planner, we present and integrate a terrain estimator that enables the robot to select waypoints on terrains with higher traversability while effectively avoiding obstacles. At the motion planning level, we not only implement a locomotion controller to track the navigation commands, but also construct a proprioception advisor to provide motion evaluations for the path planner. Based on the closed-loop motion feedback, we make online corrections for the vision-based terrain and obstacle estimations. Consequently, TOP-Nav achieves open-world navigation in which the robot can handle terrains or disturbances beyond the distribution of prior knowledge and overcome constraints imposed by visual conditions. Building upon extensive experiments conducted in both simulation and real-world environments, TOP-Nav demonstrates superior performance in open-world navigation compared to existing methods.
comment: Published at CoRL 2024
SustainDC -- Benchmarking for Sustainable Data Center Control NeurIPS 2024
Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for the effects of each other. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensity, and workload requirements. Our results highlight significant opportunities for improvement of data center operations using MARL algorithms. Given the increasing use of DC due to AI, SustainDC provides a crucial platform for the development and benchmarking of advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
comment: Under review at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)
Constraint-Guided Online Data Selection for Scalable Data-Driven Safety Filters in Uncertain Robotic Systems
As the use of autonomous robots expands in tasks that are complex and challenging to model, the demand for robust data-driven control methods that can certify safety and stability in uncertain conditions is increasing. However, the practical implementation of these methods often faces scalability issues, because the number of data points grows with system complexity, and relies heavily on high-quality training data. In response to these challenges, this study presents a scalable data-driven controller that efficiently identifies and infers from the most informative data points for implementing data-driven safety filters. Our approach is grounded in the integration of a model-based, certificate-function-based method and Gaussian Process (GP) regression, reinforced by a novel online data selection algorithm that reduces time complexity from quadratic to linear relative to dataset size. Empirical evidence, gathered from successful real-world cart-pole swing-up experiments and simulated locomotion of a five-link bipedal robot, demonstrates the efficacy of our approach. Our findings reveal that our efficient online data selection algorithm, which strategically selects key data points, enhances the practicality and efficiency of data-driven certifying filters in complex robotic systems, significantly mitigating scalability concerns inherent in nonparametric learning-based control methods.
comment: The first three authors contributed equally to the work. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
MARec: Metadata Alignment for cold-start Recommendation
For many recommender systems, the primary data source is a historical record of user clicks. The associated click matrix is often very sparse, as the number of users x products can be far larger than the number of clicks. Such sparsity is accentuated in cold-start settings, which makes the efficient use of metadata information of paramount importance. In this work, we propose a simple approach to address cold-start recommendations by leveraging content metadata: Metadata Alignment for cold-start Recommendation (MARec). We show that this approach can readily augment existing matrix factorization and autoencoder approaches, enabling a smooth transition to top performing algorithms in warmer set-ups. Our experimental results indicate three separate contributions: first, we show that our proposed framework largely beats SOTA results on 4 cold-start datasets with different sparsity and scale characteristics, with gains ranging from +8.4% to +53.8% on reported ranking metrics; second, we provide an ablation study on the utility of semantic features, and show that the additional gain obtained by leveraging such features ranges between +46.8% and +105.5%; and third, our approach is by construction highly competitive in warm set-ups, and we propose a closed-form solution outperformed by SOTA results by only 0.8% on average.
On Game Based Distributed Decision Approach for Multi-agent Optimal Coverage Problem with Application to Constellations Reconfiguration
This paper focuses on the optimal coverage problem (OCP) for multi-agent systems with decentralized optimization. A game-based distributed decision approach for the multi-agent OCP is proposed. The equivalence between the equilibrium of the game and the extreme value of the global performance objective is rigorously proved. Then, a distributed algorithm that uses only local information to obtain the global near-optimal coverage is developed, and its convergence is proved. Finally, the proposed method is applied to maximize the covering time of a satellite constellation for a target. The simulation results under different scenarios show that our method requires much less computation time than traditional centralized optimization for a given level of the performance index.
comment: 11 pages, 11 figures
Personalised Outfit Recommendation via History-aware Transformers
We present the history-aware transformer (HAT), a transformer-based model that uses shoppers' purchase history to personalise outfit predictions. The aim of this work is to recommend outfits that are internally coherent while matching an individual shopper's style and taste. To achieve this, we stack two transformer models, one that produces outfit representations and another one that processes the history of purchased outfits for a given shopper. We use these models to score an outfit's compatibility in the context of a shopper's preferences as inferred from their previous purchases. During training, the model learns to discriminate between purchased and random outfits using 3 losses: the focal loss for outfit compatibility typically used in the literature, a contrastive loss to bring closer learned outfit embeddings from a shopper's history, and an adaptive margin loss to facilitate learning from weak negatives. Together, these losses enable the model to make personalised recommendations based on a shopper's purchase history. Our experiments on the IQON3000 and Polyvore datasets show that HAT outperforms strong baselines on the outfit Compatibility Prediction (CP) and the Fill In The Blank (FITB) tasks. The model improves AUC for the CP hard task by 15.7% (IQON3000) and 19.4% (Polyvore) compared to previous SOTA results. It further improves accuracy on the FITB hard task by 6.5% and 9.7%, respectively. We provide ablation studies on the personalisation, contrastive loss, and adaptive margin loss that highlight the importance of these modelling choices.
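As an illustration of the first of the three losses named in the HAT abstract, here is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function name `focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration (they are commonly used values, not settings reported in the abstract), and the sketch does not reproduce the authors' model or their other two losses.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one prediction.

    p     : predicted probability that the outfit is compatible (0..1)
    y     : ground-truth label, 1 (purchased) or 0 (random outfit)
    gamma : focusing parameter; gamma = 0 gives weighted cross-entropy
    alpha : class weight for the positive class (hypothetical default)
    """
    if y not in (0, 1):
        raise ValueError("y must be 0 or 1")
    # Clamp to avoid log(0) for saturated predictions.
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor down-weights easy, well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A confident correct prediction contributes far less than a confident wrong one.
    print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is a convenient sanity check.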
Robotics
RT-GuIDE: Real-Time Gaussian splatting for Information-Driven Exploration ICRA2025
We propose a framework for active mapping and exploration that leverages Gaussian splatting for constructing information-rich maps. Further, we develop a parallelized motion planning algorithm that can exploit the Gaussian map for real-time navigation. The Gaussian map constructed onboard the robot is optimized for both photometric and geometric quality while enabling real-time situational awareness for autonomy. We show through simulation experiments that our method is competitive with approaches that use alternate information gain metrics, while being orders of magnitude faster to compute. In real-world experiments, our algorithm achieves better map quality (10% higher Peak Signal-to-Noise Ratio (PSNR) and 30% higher geometric reconstruction accuracy) than Gaussian maps constructed by traditional exploration baselines. Experiment videos and more details can be found on our project page: https://tyuezhan.github.io/RT_GuIDE/
comment: Submitted to ICRA2025
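For readers unfamiliar with the map-quality metric quoted in the RT-GuIDE abstract, the following is a minimal sketch of the standard PSNR definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr`, the flat-sequence input format, and the assumption that pixel values lie in [0, 1] are illustrative choices, not details from the paper.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB between two flat pixel sequences.

    reference, rendered : equal-length sequences of pixel intensities
    max_value           : largest possible intensity (1.0 for normalized images)
    """
    if len(reference) != len(rendered):
        raise ValueError("images must contain the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [0.0, 0.0, 0.0, 0.0]
    render = [0.1, 0.1, 0.1, 0.1]
    print(round(psnr(ground_truth, render), 3))
```

Because PSNR is logarithmic, a relative improvement such as the 10% quoted above is a statement about the value in dB, not about the underlying mean squared error.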
Robot See Robot Do: Imitating Articulated Object Manipulation with Monocular 4D Reconstruction
Humans can learn to manipulate new objects by simply watching others; providing robots with the ability to learn from such demonstrations would enable a natural interface for specifying new behaviors. This work develops Robot See Robot Do (RSRD), a method for imitating articulated object manipulation from a single monocular RGB human demonstration given a single static multi-view object scan. We first propose 4D Differentiable Part Models (4D-DPM), a method for recovering 3D part motion from a monocular video with differentiable rendering. This analysis-by-synthesis approach uses part-centric feature fields in an iterative optimization which enables the use of geometric regularizers to recover 3D motions from only a single video. Given this 4D reconstruction, the robot replicates object trajectories by planning bimanual arm motions that induce the demonstrated object part motion. By representing demonstrations as part-centric trajectories, RSRD focuses on replicating the demonstration's intended behavior while considering the robot's own morphological limits, rather than attempting to reproduce the hand's motion. We evaluate 4D-DPM's 3D tracking accuracy on ground truth annotated 3D part trajectories and RSRD's physical execution performance on 9 objects across 10 trials each on a bimanual YuMi robot. Each phase of RSRD achieves an average of 87% success rate, for a total end-to-end success rate of 60% across 90 trials. Notably, this is accomplished using only feature fields distilled from large pretrained vision models -- without any task-specific training, fine-tuning, dataset collection, or annotation. Project page: https://robot-see-robot-do.github.io
comment: CoRL 2024, Project page: https://robot-see-robot-do.github.io
EvMAPPER: High Altitude Orthomapping with Event Cameras
Traditionally, unmanned aerial vehicles (UAVs) rely on CMOS-based cameras to collect images of the world below. One of the most successful applications of UAVs is to generate orthomosaics or orthomaps, in which a series of images are integrated together to develop a larger map. However, the use of CMOS-based cameras with global or rolling shutters means that orthomaps are vulnerable to challenging light conditions, motion blur, and high-speed motion of independently moving objects under the camera. Event cameras are less sensitive to these issues, as their pixels are able to trigger asynchronously on brightness changes. This work introduces the first orthomosaic approach using event cameras. In contrast to existing methods relying only on CMOS cameras, our approach enables map generation even in challenging light conditions, including direct sunlight and after sunset.
comment: 7 pages, 7 figures
Language-Embedded Gaussian Splats (LEGS): Incrementally Building Room-Scale Representations with a Mobile Robot
Building semantic 3D maps is valuable for searching for objects of interest in offices, warehouses, stores, and homes. We present a mapping system that incrementally builds a Language-Embedded Gaussian Splat (LEGS): a detailed 3D scene representation that encodes both appearance and semantics in a unified representation. LEGS is trained online as a robot traverses its environment to enable localization of open-vocabulary object queries. We evaluate LEGS on 4 room-scale scenes where we query for objects in the scene to assess how LEGS can capture semantic meaning. We compare LEGS to LERF and find that while both systems have comparable object query success rates, LEGS trains over 3.5x faster than LERF. Results suggest that a multi-camera setup and incremental bundle adjustment can boost visual reconstruction quality in constrained robot trajectories, and suggest LEGS can localize open-vocabulary and long-tail object queries with up to 66% accuracy.
StackGen: Generating Stable Structures from Silhouettes via Diffusion
Humans naturally obtain intuition about the interactions between and the stability of rigid objects by observing and interacting with the world. It is this intuition that governs the way in which we regularly configure objects in our environment, allowing us to build complex structures from simple, everyday objects. Robotic agents, on the other hand, traditionally require an explicit model of the world that includes the detailed geometry of each object and an analytical model of the environment dynamics, which are difficult to scale and preclude generalization. Instead, robots would benefit from an awareness of intuitive physics that enables them to similarly reason over the stable interaction of objects in their environment. Towards that goal, we propose StackGen, a diffusion model that generates diverse stable configurations of building blocks matching a target silhouette. To demonstrate the capability of the method, we evaluate it in a simulated environment and deploy it in the real setting using a robotic arm to assemble structures generated by the model.
A Sim-to-Real Vision-based Lane Keeping System for a 1:10-scale Autonomous Vehicle
In recent years, several competitions have highlighted the need to investigate vision-based solutions to address scenarios with functional insufficiencies in perception, world modeling and localization. This article presents the Vision-based Lane Keeping System (VbLKS) developed by the DEI-Unipd Team within the context of the Bosch Future Mobility Challenge 2022. The main contribution lies in a Simulation-to-Reality (Sim2Real) GPS-denied VbLKS for a 1:10-scale autonomous vehicle. In this VbLKS, the input to a tailored Pure Pursuit (PP) based control strategy, namely the Lookahead Heading Error (LHE), is estimated at a constant lookahead distance employing a Convolutional Neural Network (CNN). A training strategy for a compact CNN is proposed, emphasizing data generation and augmentation on simulated camera images from a 3D Gazebo simulator, and enabling real-time operation on low-level hardware. A tailored PP-based lateral controller equipped with a derivative action and a PP-based velocity reference generation are implemented. Tuning ranges are established through a systematic time-delay stability analysis. Validation in a representative controlled laboratory setting is provided.
comment: 16 pages, 23 figures
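To make the role of the Lookahead Heading Error in the VbLKS abstract more concrete, here is a minimal sketch of the textbook Pure Pursuit steering law for a kinematic bicycle model, delta = atan(2 L sin(alpha) / l_d), where alpha is the heading error to the lookahead point, l_d the lookahead distance, and L the wheelbase. This is the generic geometric law, not the authors' tailored controller with derivative action; the function name and the numeric values in the usage example are hypothetical.

```python
import math


def pure_pursuit_steering(heading_error_rad, lookahead_m, wheelbase_m):
    """Steering angle (rad) from the textbook Pure Pursuit law.

    heading_error_rad : angle between the vehicle heading and the line
                        to the lookahead point (the LHE in the abstract)
    lookahead_m       : constant lookahead distance l_d, in metres
    wheelbase_m       : distance L between front and rear axles, in metres
    """
    if lookahead_m <= 0.0 or wheelbase_m <= 0.0:
        raise ValueError("lookahead and wheelbase must be positive")
    # atan2 with a positive second argument is equivalent to atan of the ratio.
    return math.atan2(2.0 * wheelbase_m * math.sin(heading_error_rad), lookahead_m)


if __name__ == "__main__":
    # Hypothetical 1:10-scale numbers: 0.26 m wheelbase, 0.5 m lookahead.
    lhe = math.radians(10.0)  # e.g. a CNN estimate of the heading error
    print(math.degrees(pure_pursuit_steering(lhe, 0.5, 0.26)))
```

In this formulation a shorter lookahead distance produces a more aggressive steering response for the same heading error, which is one reason tuning ranges matter for stability.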
DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models
Perception systems play a crucial role in autonomous driving, incorporating multiple sensors and corresponding computer vision algorithms. 3D LiDAR sensors are widely used to capture sparse point clouds of the vehicle's surroundings. However, such systems struggle to perceive occluded areas and gaps in the scene due to the sparsity of these point clouds and their lack of semantics. To address these challenges, Semantic Scene Completion (SSC) jointly predicts unobserved geometry and semantics in the scene given raw LiDAR measurements, aiming for a more complete scene representation. Building on promising results of diffusion models in image generation and super-resolution tasks, we propose their extension to SSC by implementing the noising and denoising diffusion processes in the point and semantic spaces individually. To control the generation, we employ semantic LiDAR point clouds as conditional input and design local and global regularization losses to stabilize the denoising process. We evaluate our approach on autonomous driving datasets, where it outperforms the state of the art for SSC.
comment: Under review
GSON: A Group-based Social Navigation Framework with Large Multimodal Model
As the number of service robots and autonomous vehicles in human-centered environments grows, their requirements go beyond simply navigating to a destination. They must also take into account dynamic social contexts and ensure respect and comfort for others in shared spaces, which poses significant challenges for perception and planning. In this paper, we present a group-based social navigation framework, GSON, that enables mobile robots to perceive and exploit the social groups in their surroundings by leveraging the visual reasoning capability of the Large Multimodal Model (LMM). For perception, we apply visual prompting techniques to extract the social relationships among pedestrians in a zero-shot manner and combine the result with a robust pedestrian detection and tracking pipeline to alleviate the problem of low inference speed of the LMM. Given the perception result, the planning system is designed to avoid disrupting the current social structure. We adopt a social structure-based mid-level planner as a bridge between global path planning and local motion planning to preserve the global context and reactive response. The proposed method is validated on real-world mobile robot navigation tasks involving complex social structure understanding and reasoning. Experimental results demonstrate the effectiveness of the system in these scenarios compared with several baselines.
SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation
Automating garment manipulation poses a significant challenge for assistive robotics due to the diverse and deformable nature of garments. Traditional approaches typically require separate models for each garment type, which limits scalability and adaptability. In contrast, this paper presents a unified approach using vision-language models (VLMs) to improve keypoint prediction across various garment categories. By interpreting both visual and semantic information, our model enables robots to manage different garment states with a single model. We created a large-scale synthetic dataset using advanced simulation techniques, allowing scalable training without extensive real-world data. Experimental results indicate that the VLM-based method significantly enhances keypoint detection accuracy and task success rates, providing a more flexible and general solution for robotic garment manipulation. This research also underscores the potential of VLMs to unify various garment manipulation tasks within a single framework, paving the way for broader applications in home automation and assistive robotics in the future.
DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving
We present a novel autonomous driving framework, DualAD, designed to imitate human reasoning during driving. DualAD comprises two layers: a rule-based motion planner at the bottom layer that handles routine driving tasks requiring minimal reasoning, and an upper layer featuring a rule-based text encoder that converts driving scenarios from absolute states into text descriptions. This text is then processed by a large language model (LLM) to make driving decisions. The upper layer intervenes in the bottom layer's decisions when potential danger is detected, mimicking human reasoning in critical situations. Closed-loop experiments demonstrate that DualAD, using a zero-shot pre-trained model, significantly outperforms rule-based motion planners that lack reasoning abilities. Our experiments also highlight the effectiveness of the text encoder, which considerably enhances the model's scenario understanding. Additionally, the integrated DualAD model improves with stronger LLMs, indicating the framework's potential for further enhancement. We make code and benchmarks publicly available.
comment: Autonomous Driving, Large Language Models (LLMs), Human Reasoning, Critical Scenario
Explaining Explaining
Explanation is key to people having confidence in high-stakes AI systems. However, machine-learning-based systems - which account for almost all current AI - can't explain because they are usually black boxes. The explainable AI (XAI) movement hedges this problem by redefining "explanation". The human-centered explainable AI (HCXAI) movement identifies the explanation-oriented needs of users but can't fulfill them because of its commitment to machine learning. In order to achieve the kinds of explanations needed by real people operating in critical domains, we must rethink how to approach AI. We describe a hybrid approach to developing cognitive agents that uses a knowledge-based infrastructure supplemented by data obtained through machine learning when applicable. These agents will serve as assistants to humans who will bear ultimate responsibility for the decisions and actions of the human-robot team. We illustrate the explanatory potential of such agents using the under-the-hood panels of a demonstration system in which a team of simulated robots collaborates on a search task assigned by a human.
Revisit Anything: Visual Place Recognition via Image Segment Retrieval ECCV 2024
Accurately recognizing a revisited place is crucial for embodied agents to localize and navigate. This requires visual representations to be distinct, despite strong variations in camera viewpoint and scene appearance. Existing visual place recognition pipelines encode the "whole" image and search for matches. This poses a fundamental challenge in matching two images of the same place captured from different camera viewpoints: "the similarity of what overlaps can be dominated by the dissimilarity of what does not overlap". We address this by encoding and searching for "image segments" instead of the whole images. We propose to use open-set image segmentation to decompose an image into "meaningful" entities (i.e., things and stuff). This enables us to create a novel image representation as a collection of multiple overlapping subgraphs connecting a segment with its neighboring segments, dubbed SuperSegment. Furthermore, to efficiently encode these SuperSegments into compact vector representations, we propose a novel factorized representation of feature aggregation. We show that retrieving these partial representations leads to significantly higher recognition recall than the typical whole image based retrieval. Our segments-based approach, dubbed SegVLAD, sets a new state-of-the-art in place recognition on a diverse selection of benchmark datasets, while being applicable to both generic and task-specialized image encoders. Finally, we demonstrate the potential of our method to "revisit anything" by evaluating our method on an object instance retrieval task, which bridges the two disparate areas of research: visual place recognition and object-goal navigation, through their common aim of recognizing goal objects specific to a place. Source code: https://github.com/AnyLoc/Revisit-Anything.
comment: Presented at ECCV 2024; Includes supplementary; 29 pages; 8 figures
HARMONIC: Cognitive and Control Collaboration in Human-Robotic Teams ICRA 2025
This paper presents a novel approach to multi-robot planning and collaboration. We demonstrate a cognitive strategy for robots in human-robot teams that incorporates metacognition, natural language communication, and explainability. The system is embodied using the HARMONIC architecture that flexibly integrates cognitive and control capabilities across the team. We evaluate our approach through simulation experiments involving a joint search task by a team of heterogeneous robots (a UGV and a drone) and a human. We detail the system's handling of complex, real-world scenarios, effective action coordination between robots with different capabilities, and natural human-robot communication. This work demonstrates that the robots' abilities to reason about plans, goals, and attitudes, and to provide explanations for actions and decisions, are essential prerequisites for realistic human-robot teaming.
comment: Submitted to ICRA 2025 Conference, Atlanta, GA, USA
MMDVS-LF: A Multi-Modal Dynamic-Vision-Sensor Line Following Dataset
Dynamic Vision Sensors (DVS) offer a unique advantage in control applications due to their high temporal resolution and asynchronous event-based data. Still, their adoption in machine learning algorithms remains limited. To address this gap and promote the development of models that leverage the specific characteristics of DVS data, we introduce the Multi-Modal Dynamic-Vision-Sensor Line Following dataset (MMDVS-LF). This comprehensive dataset is the first to integrate multiple sensor modalities, including DVS recordings, RGB video, odometry, and Inertial Measurement Unit (IMU) data, from a small-scale standardized vehicle. Additionally, the dataset includes eye-tracking and demographic data of drivers performing a Line Following task on a track. With its diverse range of data, MMDVS-LF opens new opportunities for developing deep learning algorithms and conducting data science projects across various domains, supporting innovation in autonomous systems and control applications.
HARMONIC: A Framework for Explanatory Cognitive Robots ICRA
We present HARMONIC, a framework for implementing cognitive robots that transforms general-purpose robots into trusted teammates capable of complex decision-making, natural communication and human-level explanation. The framework supports interoperability between a strategic (cognitive) layer for high-level decision-making and a tactical (robot) layer for low-level control and execution. We describe the core features of the framework and our initial implementation, in which HARMONIC was deployed on a simulated UGV and drone involved in a multi-robot search and retrieval task.
comment: Accepted for presentation at ICRA@40. 23-26 September 2024, Rotterdam, Netherlands
Reasoning Multi-Agent Behavioral Topology for Interactive Autonomous Driving
Autonomous driving systems aim for safe and socially consistent driving through the behavioral integration among interactive agents. However, challenges remain due to multi-agent scene uncertainty and heterogeneous interaction. Current dense and sparse behavioral representations struggle with inefficiency and inconsistency in multi-agent modeling, leading to instability of collective behavioral patterns when integrating prediction and planning (IPP). To address this, we initiate a topological formation that serves as a compliant behavioral foreground to guide downstream trajectory generations. Specifically, we introduce Behavioral Topology (BeTop), a pivotal topological formulation that explicitly represents the consensual behavioral pattern among multi-agent futures. BeTop is derived from braid theory to distill compliant interactive topology from multi-agent future trajectories. A synergistic learning framework (BeTopNet) supervised by BeTop facilitates the consistency of behavior prediction and planning within the predicted topology priors. Through imitative contingency learning, BeTop also effectively manages behavioral uncertainty for prediction and planning. Extensive verification on large-scale real-world datasets, including nuPlan and WOMD, demonstrates that BeTop achieves state-of-the-art performance in both prediction and planning tasks. Further validations on the proposed interactive scenario benchmark showcase planning compliance in interactive cases.
ReliOcc: Towards Reliable Semantic Occupancy Prediction via Uncertainty Learning
Vision-centric semantic occupancy prediction plays a crucial role in autonomous driving, which requires accurate and reliable predictions from low-cost sensors. Although camera-based methods have notably narrowed the accuracy gap with LiDAR, there has been little research effort to explore the reliability of predicting semantic occupancy from cameras. In this paper, we conduct a comprehensive evaluation of existing semantic occupancy prediction models from a reliability perspective for the first time. Despite the gradual alignment of camera-based models with LiDAR in terms of accuracy, a significant reliability gap persists. To address this concern, we propose ReliOcc, a method designed to enhance the reliability of camera-based occupancy networks. ReliOcc provides a plug-and-play scheme for existing models, which integrates hybrid uncertainty from individual voxels with sampling-based noise and relative voxels through mix-up learning. In addition, an uncertainty-aware calibration strategy is devised to further enhance model reliability in offline mode. Extensive experiments under various settings demonstrate that ReliOcc significantly enhances model reliability while maintaining the accuracy of both geometric and semantic predictions. Importantly, our proposed approach exhibits robustness to sensor failures and out-of-domain noise during inference.
comment: Technical report. Work in progress
Control Industrial Automation System with Large Language Models
Traditional industrial automation systems require specialized expertise to operate and complex reprogramming to adapt to new processes. Large language models offer the intelligence to make them more flexible and easier to use. However, LLMs' application in industrial settings is underexplored. This paper introduces a framework for integrating LLMs to achieve end-to-end control of industrial automation systems. At the core of the framework are an agent system designed for industrial tasks, a structured prompting method, and an event-driven information modeling mechanism that provides real-time data for LLM inference. The framework supplies LLMs with real-time events on different context semantic levels, allowing them to interpret the information, generate production plans, and control operations on the automation system. It also supports structured dataset creation for fine-tuning on this downstream application of LLMs. Our contribution includes a formal system design, proof-of-concept implementation, and a method for generating task-specific datasets for LLM fine-tuning and testing. This approach enables a more adaptive automation system that can respond to spontaneous events, while allowing easier operation and configuration through natural language for more intuitive human-machine interaction. We provide demo videos and detailed data on GitHub: https://github.com/YuchenXia/LLM4IAS
Joint Localization and Planning using Diffusion ICRA 2025
Diffusion models have been successfully applied to robotics problems such as manipulation and vehicle path planning. In this work, we explore their application to end-to-end navigation -- including both perception and planning -- by considering the problem of jointly performing global localization and path planning in known but arbitrary 2D environments. In particular, we introduce a diffusion model which produces collision-free paths in a global reference frame given an egocentric LIDAR scan, an arbitrary map, and a desired goal position. To this end, we implement diffusion in the space of paths in SE(2), and describe how to condition the denoising process on both obstacles and sensor observations. In our evaluation, we show that the proposed conditioning techniques enable generalization to realistic maps of considerably different appearance than the training environment, demonstrate our model's ability to accurately describe ambiguous solutions, and run extensive simulation experiments showcasing our model's use as a real-time, end-to-end localization and planning stack.
comment: 7 pages, 9 figures. Submitted to ICRA 2025, under review
LoopSR: Looping Sim-and-Real for Lifelong Policy Adaptation of Legged Robots
Reinforcement Learning (RL) has shown remarkable and generalizable capability in legged locomotion through sim-to-real transfer. However, while adaptive methods like domain randomization are expected to make policies more robust to diverse environments, such comprehensiveness potentially detracts from the policy's performance in any specific environment according to the No Free Lunch theorem, leading to a suboptimal solution once deployed in the real world. To address this issue, we propose a lifelong policy adaptation framework named LoopSR, which utilizes a transformer-based encoder to project real-world trajectories into a latent space and accordingly reconstructs the real-world environments in simulation for further improvement. An autoencoder architecture and contrastive learning methods are adopted to better extract the characteristics of real-world dynamics. The simulation parameters for continual training are derived by combining predicted parameters from the decoder with retrieved parameters from the simulation trajectory dataset. By leveraging the continual training, LoopSR achieves superior data efficiency compared with strong baselines, requiring only a limited amount of data to yield strong performance in both sim-to-sim and sim-to-real experiments.
comment: under review
Deblur e-NeRF: NeRF from Motion-Blurred Events under High-speed or Low-light Conditions ECCV 2024
The stark contrast in the design philosophy of an event camera makes it particularly ideal for operating under high-speed, high dynamic range and low-light conditions, where standard cameras underperform. Nonetheless, event cameras still suffer from some amount of motion blur, especially under these challenging conditions, contrary to what most think. This is attributed to the limited bandwidth of the event sensor pixel, which is mostly proportional to the light intensity. Thus, to ensure that event cameras can truly excel in such conditions where they have an edge over standard cameras, it is crucial to account for event motion blur in downstream applications, especially reconstruction. However, none of the recent works on reconstructing Neural Radiance Fields (NeRFs) from events, nor event simulators, have considered the full effects of event motion blur. To this end, we propose Deblur e-NeRF, a novel method to directly and effectively reconstruct blur-minimal NeRFs from motion-blurred events generated under high-speed motion or low-light conditions. The core component of this work is a physically-accurate pixel bandwidth model proposed to account for event motion blur under arbitrary speed and lighting conditions. We also introduce a novel threshold-normalized total variation loss to improve the regularization of large textureless patches. Experiments on real and novel realistically simulated sequences verify our effectiveness. Our code, event simulator and synthetic event dataset will be open-sourced.
comment: Accepted to ECCV 2024. Project website is accessible at https://wengflow.github.io/deblur-e-nerf. arXiv admin note: text overlap with arXiv:2006.07722 by other authors
Model-Free versus Model-Based Reinforcement Learning for Fixed-Wing UAV Attitude Control Under Varying Wind Conditions
This paper evaluates and compares the performance of model-free and model-based reinforcement learning for the attitude control of fixed-wing unmanned aerial vehicles using PID as a reference point. The comparison focuses on their ability to handle varying flight dynamics and wind disturbances in a simulated environment. Our results show that the Temporal Difference Model Predictive Control agent outperforms both the PID controller and the model-free reinforcement learning methods in terms of tracking accuracy and robustness over different reference difficulties, particularly in nonlinear flight regimes. Furthermore, we introduce actuation fluctuation as a key metric to assess energy efficiency and actuator wear, and we test two different approaches from the literature: action variation penalty and conditioning for action policy smoothness. We also evaluate all control methods when subject to stochastic turbulence and gusts separately, so as to measure their effects on tracking performance, observe their limitations and outline their implications on the Markov decision process formalism.
comment: Published at ICINCO 2024
Swarm-LIO2: Decentralized, Efficient LiDAR-inertial Odometry for UAV Swarms
Aerial swarm systems possess immense potential in various aspects, such as cooperative exploration, target tracking, search and rescue. Efficient, accurate self and mutual state estimation are the critical preconditions for completing these swarm tasks, which remain challenging research topics. This paper proposes Swarm-LIO2: a fully decentralized, plug-and-play, computationally efficient, and bandwidth-efficient LiDAR-inertial odometry for aerial swarm systems. Swarm-LIO2 uses a decentralized, plug-and-play network as the communication infrastructure. Only bandwidth-efficient and low-dimensional information is exchanged, including identity, ego-state, mutual observation measurements, and global extrinsic transformations. To support the plug-and-play of new teammate participants, Swarm-LIO2 detects potential teammate UAVs and initializes the temporal offset and global extrinsic transformation all automatically. To enhance the initialization efficiency, novel reflectivity-based UAV detection, trajectory matching, and factor graph optimization methods are proposed. For state estimation, Swarm-LIO2 fuses LiDAR, IMU, and mutual observation measurements within an efficient ESIKF framework, with careful compensation of temporal delay and modeling of measurements to enhance the accuracy and consistency.
comment: 23 Pages
SECURE: Semantics-aware Embodied Conversation under Unawareness for Lifelong Robot Learning
This paper addresses a challenging interactive task learning scenario we call rearrangement under unawareness: to manipulate a rigid-body environment in a context where the robot is unaware of a concept that's key to solving the instructed task. We propose SECURE, an interactive task learning framework designed to solve such problems by fixing a deficient domain model using embodied conversation. Through dialogue, the robot discovers and then learns to exploit unforeseen possibilities. Using SECURE, the robot not only learns from the user's corrective feedback when it makes a mistake, but it also learns to make strategic dialogue decisions for revealing useful evidence about novel concepts for solving the instructed task. Together, these abilities allow the robot to generalise to subsequent tasks using newly acquired knowledge. We demonstrate that a robot that is semantics-aware -- that is, it exploits the logical consequences of both sentence and discourse semantics in the learning and inference process -- learns to solve rearrangement under unawareness more effectively than a robot that lacks such capabilities.
comment: 10 pages, 4 figures, 2 tables
Robust Ladder Climbing with a Quadrupedal Robot
Quadruped robots are proliferating in industrial environments where they carry sensor suites and serve as autonomous inspection platforms. Despite the advantages of legged robots over their wheeled counterparts on rough and uneven terrain, they cannot yet reliably negotiate a ubiquitous feature of industrial infrastructure: ladders. Inability to traverse ladders prevents quadrupeds from inspecting dangerous locations, puts humans in harm's way, and reduces industrial site productivity. In this paper, we learn quadrupedal ladder climbing via a reinforcement learning-based control policy and a complementary hooked end-effector. We evaluate the robustness in simulation across different ladder inclinations, rung geometries, and inter-rung spacings. On hardware, we demonstrate zero-shot transfer with an overall 90% success rate at ladder angles ranging from 70° to 90°, consistent climbing performance during unmodeled perturbations, and climbing speeds 232x faster than the state of the art. This work expands the scope of industrial quadruped robot applications beyond inspection on nominal terrains to challenging infrastructural features in the environment, highlighting synergies between robot morphology and control policy when performing complex skills. More information can be found at the project website: https://sites.google.com/leggedrobotics.com/climbingladders.
comment: Project website: https://sites.google.com/leggedrobotics.com/climbingladders
Robotic-CLIP: Fine-tuning CLIP on Action Data for Robotic Applications
Vision language models have played a key role in extracting meaningful features for various robotic applications. Among these, Contrastive Language-Image Pretraining (CLIP) is widely used in robotic tasks that require both vision and natural language understanding. However, CLIP was trained solely on static images paired with text prompts and has not yet been fully adapted for robotic tasks involving dynamic actions. In this paper, we introduce Robotic-CLIP to enhance robotic perception capabilities. We first gather and label large-scale action data, and then build our Robotic-CLIP by fine-tuning CLIP on 309,433 videos (~7.4 million frames) of action data using contrastive learning. By leveraging action data, Robotic-CLIP inherits CLIP's strong image performance while gaining the ability to understand actions in robotic contexts. Intensive experiments show that our Robotic-CLIP outperforms other CLIP-based models across various language-driven robotic tasks. Additionally, we demonstrate the practical effectiveness of Robotic-CLIP in real-world grasping applications.
comment: 7 pages
Stable Object Placement Under Geometric Uncertainty via Differentiable Contact Dynamics
From serving a cup of coffee to carefully rearranging delicate items, stable object placement is a crucial skill for future robots. This skill is challenging due to the required accuracy, which is difficult to achieve under geometric uncertainty. We leverage differentiable contact dynamics to develop a principled method for stable object placement under geometric uncertainty. We estimate the geometric uncertainty by minimizing the discrepancy between the force-torque sensor readings and the model predictions through gradient descent. We further keep track of a belief over multiple possible geometric parameters to mitigate the gradient-based method's sensitivity to the initialization. We verify our approach in the real world on various geometric uncertainties, including the in-hand pose uncertainty of the grasped object, the object's shape uncertainty, and the environment's shape uncertainty.
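As a rough illustration of the estimation idea in the abstract above (fitting a geometric parameter by gradient descent on the mismatch between sensor readings and model predictions), here is a toy sketch. It is not the paper's differentiable contact dynamics: the linear model `torque = r * force`, the function name `estimate_lever_arm`, and all numeric values are assumptions chosen only to keep the example self-contained.

```python
def estimate_lever_arm(forces, torques, r_init=0.0, lr=0.01, steps=500):
    """Fit a scalar lever arm r so that r * force matches the measured torque.

    Hypothetical toy model: the predicted torque is r * force. The loss is the
    mean squared discrepancy between predictions and readings.
    """
    r = r_init
    n = len(forces)
    for _ in range(steps):
        # Gradient of mean((r * f - t)^2) with respect to r.
        grad = sum(2.0 * (r * f - t) * f for f, t in zip(forces, torques)) / n
        r -= lr * grad
    return r


# Synthetic, noise-free "sensor" data generated from an assumed true value.
forces = [1.0, 2.0, 3.0, 4.0]
true_r = 0.15
torques = [true_r * f for f in forces]

r_hat = estimate_lever_arm(forces, torques)
print(f"estimated lever arm: {r_hat:.4f}")
```

A single gradient-descent run like this depends on its starting point when the loss is non-convex, which is the sensitivity the abstract says is mitigated by tracking a belief over several candidate parameters.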
Scene Understanding in Pick-and-Place Tasks: Analyzing Transformations Between Initial and Final Scenes
With robots increasingly collaborating with humans in everyday tasks, it is important to take steps toward robotic systems capable of understanding the environment. This work focuses on scene understanding to detect pick and place tasks given initial and final images from the scene. To this end, a dataset is collected for object detection and pick and place task detection. A YOLOv5 network is subsequently trained to detect the objects in the initial and final scenes. Given the detected objects and their bounding boxes, two methods are proposed to detect the pick and place tasks which transform the initial scene into the final scene. A geometric method is proposed which tracks objects' movements in the two scenes and works based on the intersection of the bounding boxes which moved within scenes. In contrast, the CNN-based method utilizes a Convolutional Neural Network to classify objects with intersected bounding boxes into 5 classes, showing the spatial relationship between the involved objects. The performed pick and place tasks are then derived from analyzing the experiments with both scenes. Results show that the CNN-based method, using a VGG16 backbone, outperforms the geometric method by roughly 12 percentage points in certain scenarios, with an overall success rate of 84.3%.
comment: Conference Paper, ICEE 2024, 7 pages, 5 figures
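The geometric method in the abstract above relies on the intersection of bounding boxes. The abstract does not say which overlap measure is used, so the following is only a minimal sketch of one common choice, intersection over union (IoU), for axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box format and the function names are assumptions.

```python
def box_intersection_area(a, b):
    """Overlap area of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    # Negative width or height means the boxes do not overlap on that axis.
    return max(0.0, width) * max(0.0, height)


def iou(a, b):
    """Intersection over union of two boxes; 0.0 if both boxes are degenerate."""
    inter = box_intersection_area(a, b)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Example: a box in the initial scene and a box in the final scene.
initial_box = (0.0, 0.0, 2.0, 2.0)
final_box = (1.0, 1.0, 3.0, 3.0)
print(iou(initial_box, final_box))
```

A low IoU between an object's box in the initial and final images would suggest that the object moved, while a nonzero intersection between two different objects' boxes would flag a possible spatial relationship between them.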
Episodic Memory Verbalization using Hierarchical Representations of Life-Long Robot Experience
Verbalization of robot experience, i.e., summarization of and question answering about a robot's past, is a crucial ability for improving human-robot interaction. Previous works applied rule-based systems or fine-tuned deep models to verbalize short (several-minute-long) streams of episodic data, limiting generalization and transferability. In our work, we apply large pretrained models to tackle this task with zero or few examples, and specifically focus on verbalizing life-long experiences. For this, we derive a tree-like data structure from episodic memory (EM), with lower levels representing raw perception and proprioception data, and higher levels abstracting events to natural language concepts. Given such a hierarchical representation built from the experience stream, we apply a large language model as an agent to interactively search the EM given a user's query, dynamically expanding (initially collapsed) tree nodes to find the relevant information. The approach keeps computational costs low even when scaling to months of robot experience data. We evaluate our method on simulated household robot data, human egocentric videos, and real-world robot recordings, demonstrating its flexibility and scalability.
comment: Code, data and demo videos at https://hierarchical-emv.github.io
Event-based Stereo Depth Estimation: A Survey
Stereopsis has widespread appeal in robotics as it is the predominant way by which living beings perceive depth to navigate our 3D world. Event cameras are novel bio-inspired sensors that detect per-pixel brightness changes asynchronously, with very high temporal resolution and high dynamic range, enabling machine perception in high-speed motion and broad illumination conditions. The high temporal precision also benefits stereo matching, making disparity (depth) estimation a popular research area for event cameras ever since their inception. Over the last 30 years, the field has evolved rapidly, from low-latency, low-power circuit design to current deep learning (DL) approaches driven by the computer vision community. The bibliography is vast and difficult to navigate for non-experts due to its highly interdisciplinary nature. Past surveys have addressed distinct aspects of this topic, in the context of applications, or focusing only on a specific class of techniques, but have overlooked stereo datasets. This survey provides a comprehensive overview, covering both instantaneous stereo and long-term methods suitable for simultaneous localization and mapping (SLAM), along with theoretical and empirical comparisons. It is the first to extensively review DL methods as well as stereo datasets, even providing practical suggestions for creating new benchmarks to advance the field. The main advantages and challenges faced by event-based stereo depth estimation are also discussed. Despite significant progress, challenges remain in achieving optimal performance in not only accuracy but also efficiency, a cornerstone of event-based computing. We identify several gaps and propose future research directions. We hope this survey inspires future research in this area, by serving as an accessible entry point for newcomers, as well as a practical guide for seasoned researchers in the community.
comment: 28 pages, 20 figures, 7 tables
AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment
The increasing demand for intelligent assistants in human-populated environments has motivated significant research in autonomous robotic systems. Traditional service robots and virtual assistants, however, struggle with real-world task execution due to their limited capacity for dynamic reasoning and interaction, particularly when human collaboration is required. Recent developments in Large Language Models have opened new avenues for improving these systems, enabling more sophisticated reasoning and natural interaction capabilities. In this paper, we introduce AssistantX, an LLM-powered proactive assistant designed to operate autonomously in a physical office environment. Unlike conventional service robots, AssistantX leverages a novel multi-agent architecture, PPDR4X, which provides advanced inference capabilities and comprehensive collaboration awareness. By effectively bridging the gap between virtual operations and physical interactions, AssistantX demonstrates robust performance in managing complex real-world scenarios. Our evaluation highlights the architecture's effectiveness, showing that AssistantX can respond to clear instructions, actively retrieve supplementary information from memory, and proactively seek collaboration from team members to ensure successful task completion. More details and videos can be found at https://assistantx-agent.github.io/AssistantX/.
comment: 6 pages, 8 figures, 4 tables
FactorSim: Generative Simulation via Factorized Representation
Generating simulations to train intelligent agents in game-playing and robotics from natural language input, such as user input or task documentation, remains an open-ended challenge. Existing approaches focus on parts of this challenge, such as generating reward functions or task hyperparameters. Unlike previous work, we introduce FACTORSIM, which generates full simulations in code from language input that can be used to train agents. Exploiting the structural modularity specific to coded simulations, we propose to use a factored partially observable Markov decision process representation that allows us to reduce context dependence during each step of the generation. For evaluation, we introduce a generative simulation benchmark that assesses the generated simulation code's accuracy and effectiveness in facilitating zero-shot transfers in reinforcement learning settings. We show that FACTORSIM outperforms existing methods in generating simulations regarding prompt alignment (e.g., accuracy), zero-shot transfer abilities, and human evaluation. We also demonstrate its effectiveness in generating robotic tasks.
comment: NeurIPS 2024, project website: https://cs.stanford.edu/~sunfanyun/factorsim/
AP-VLM: Active Perception Enabled by Vision-Language Models
Active perception enables robots to dynamically gather information by adjusting their viewpoints, a crucial capability for interacting with complex, partially observable environments. In this paper, we present AP-VLM, a novel framework that combines active perception with a Vision-Language Model (VLM) to guide robotic exploration and answer semantic queries. Using a 3D virtual grid overlaid on the scene and orientation adjustments, AP-VLM allows a robotic manipulator to intelligently select optimal viewpoints and orientations to resolve challenging tasks, such as identifying objects in occluded or inclined positions. We evaluate our system on two robotic platforms: a 7-DOF Franka Panda and a 6-DOF UR5, across various scenes with differing object configurations. Our results demonstrate that AP-VLM significantly outperforms passive perception methods and baseline models, including Toward Grounded Common Sense Reasoning (TGCSR), particularly in scenarios where fixed camera views are inadequate. The adaptability of AP-VLM in real-world settings shows promise for enhancing robotic systems' understanding of complex environments, bridging the gap between high-level semantic reasoning and low-level control.
System-Level Safety Monitoring and Recovery for Perception Failures in Autonomous Vehicles
The safety-critical nature of autonomous vehicle (AV) operation necessitates development of task-relevant algorithms that can reason about safety at the system level and not just at the component level. To reason about the impact of a perception failure on the entire system performance, such task-relevant algorithms must contend with various challenges: complexity of AV stacks, high uncertainty in the operating environments, and the need for real-time performance. To overcome these challenges, in this work, we introduce a Q-network called SPARQ (abbreviation for Safety evaluation for Perception And Recovery Q-network) that evaluates the safety of a plan generated by a planning algorithm, accounting for perception failures that the planning process may have overlooked. This Q-network can be queried during system runtime to assess whether a proposed plan is safe for execution or poses potential safety risks. If a violation is detected, the network can then recommend a corrective plan while accounting for the perceptual failure. We validate our algorithm using the NuPlan-Vegas dataset, demonstrating its ability to handle cases where a perception failure compromises a proposed plan while the corrective plan remains safe. We observe an overall accuracy and recall of 90% while sustaining a frequency of 42Hz on the unseen testing dataset. We compare our performance to a popular reachability-based baseline and analyze some interesting properties of our approach in improving the safety properties of an AV pipeline.
HGS-Planner: Hierarchical Planning Framework for Active Scene Reconstruction Using 3D Gaussian Splatting
In complex missions such as search and rescue, robots must make intelligent decisions in unknown environments, relying on their ability to perceive and understand their surroundings. High-quality and real-time reconstruction enhances situational awareness and is crucial for intelligent robotics. Traditional methods often struggle with poor scene representation or are too slow for real-time use. Inspired by the efficacy of 3D Gaussian Splatting (3DGS), we propose a hierarchical planning framework for fast and high-fidelity active reconstruction. Our method evaluates completion and quality gain to adaptively guide reconstruction, integrating global and local planning for efficiency. Experiments in simulated and real-world environments show our approach outperforms existing real-time methods.
Leveraging Semantic and Geometric Information for Zero-Shot Robot-to-Human Handover
Human-robot interaction (HRI) encompasses a wide range of collaborative tasks, with handover being one of the most fundamental. As robots become more integrated into human environments, the potential for service robots to assist in handing objects to humans is increasingly promising. In robot-to-human (R2H) handover, selecting the optimal grasp is crucial for success, as it requires avoiding interference with the human's preferred grasp region and minimizing intrusion into their workspace. Existing methods either inadequately consider geometric information or rely on data-driven approaches, which often struggle to generalize across diverse objects. To address these limitations, we propose a novel zero-shot system that combines semantic and geometric information to generate optimal handover grasps. Our method first identifies grasp regions using semantic knowledge from vision-language models (VLMs) and, by incorporating customized visual prompts, achieves finer granularity in region grounding. A grasp is then selected based on grasp distance and approach angle to maximize human ease and avoid interference. We validate our approach through ablation studies and real-world comparison experiments. Results demonstrate that our system improves handover success rates and provides a more user-preferred interaction experience. Videos, appendixes and more are available at https://sites.google.com/view/vlm-handover/.
comment: 6 pages, 5 figures, conference
Learning Occlusion-aware Decision-making from Agent Interaction via Active Perception
Occlusion-aware decision-making is essential in autonomous driving due to the high uncertainty of various occlusions. Recent occlusion-aware decision-making methods encounter issues such as high computational complexity, scenario scalability challenges, or reliance on limited expert data. Benefiting from automatically generating data by exploration randomization, we uncover that reinforcement learning (RL) may show promise in occlusion-aware decision-making. However, previous occlusion-aware RL faces challenges in expanding to various dynamic and static occlusion scenarios, low learning efficiency, and lack of predictive ability. To address these issues, we introduce Pad-AI, a self-reinforcing framework to learn occlusion-aware decision-making through active perception. Pad-AI utilizes vectorized representation to represent occluded environments efficiently and learns over the semantic motion primitives to focus on high-level active perception exploration. Furthermore, Pad-AI integrates prediction and RL within a unified framework to provide risk-aware learning and security guarantees. Our framework was tested in challenging scenarios under both dynamic and static occlusions and demonstrated efficient and general perception-aware exploration performance compared to other strong baselines in closed-loop evaluations.
Software for the SpaceDREAM Robotic Arm
Impedance-controlled robots are widely used on Earth to perform interaction-rich tasks and will be a key enabler for In-Space Servicing, Assembly and Manufacturing (ISAM) activities. This paper introduces the software architecture used on the On-Board Computer (OBC) for the planned SpaceDREAM mission, which aims to validate such a robotic arm in Low Earth Orbit (LEO) and is conducted by the German Aerospace Center (DLR) in cooperation with KINETIK Space GmbH and the Technical University of Munich (TUM). During the mission, several free-motion and contact tasks are to be performed in order to verify proper functionality of the robot in position and impedance control on joint level as well as in Cartesian control. The tasks are selected to be representative of subsequent servicing missions, e.g. those requiring interface docking or precise manipulation. The software on the OBC commands the robot's joints via SpaceWire to perform those mission tasks, reads camera images and data from additional sensors, and sends telemetry data through an Ethernet link via the spacecraft down to Earth. It is set up to execute a predefined mission after receiving a start signal from the spacecraft, while it should be extendable to receive commands from Earth for later missions. The core design principle was to reuse as much existing software as possible and to stay as close as possible to existing robot software stacks at DLR. This allowed for a quick full operational start of the robot arm compared to a custom development of all robot software, a lower entry barrier for software developers, as well as a reuse of existing libraries. While not every line of code can be tested with this design, most of the software has already proven its functionality through daily execution on multiple robot systems.
Canonical Representation and Force-Based Pretraining of 3D Tactile for Dexterous Visuo-Tactile Policy Learning
Tactile sensing plays a vital role in enabling robots to perform fine-grained, contact-rich tasks. However, the high dimensionality of tactile data, due to the large coverage on dexterous hands, poses significant challenges for effective tactile feature learning, especially for 3D tactile data, as there are no large standardized datasets and no strong pretrained backbones. To address these challenges, we propose a novel canonical representation that reduces the difficulty of 3D tactile feature learning and further introduce a force-based self-supervised pretraining task to capture both local and net force features, which are crucial for dexterous manipulation. Our method achieves an average success rate of 78% across four fine-grained, contact-rich dexterous manipulation tasks in real-world experiments, demonstrating effectiveness and robustness compared to other methods. Further analysis shows that our method fully utilizes both spatial and force information from 3D tactile data to accomplish the tasks. The videos can be viewed at https://3dtacdex.github.io.
Robotic Environmental State Recognition with Pre-Trained Vision-Language Models and Black-Box Optimization
In order for robots to autonomously navigate and operate in diverse environments, it is essential for them to recognize the state of their environment. However, environmental state recognition has traditionally involved distinct methods tailored to each state to be recognized. In this study, we perform a unified environmental state recognition for robots through spoken language with pre-trained large-scale vision-language models. We apply Visual Question Answering and Image-to-Text Retrieval, which are tasks of Vision-Language Models. We show that with our method, it is possible to recognize not only whether a room door is open/closed, but also whether a transparent door is open/closed and whether water is running in a sink, without training neural networks or manual programming. In addition, the recognition accuracy can be improved by selecting appropriate texts from the set of prepared texts based on black-box optimization. For each state recognition, only the text set and its weighting need to be changed, eliminating the need to prepare multiple different models and programs, and facilitating the management of source code and computer resources. We experimentally demonstrate the effectiveness of our method and apply it to the recognition behavior on a mobile robot, Fetch.
comment: Accepted at Advanced Robotics, website - https://haraduka.github.io/vlm-bbo/
Precise Interception Flight Targets by Image-based Visual Servoing of Multicopter
Interception of low-altitude intruding targets with low-cost drones equipped with a strapdown camera presents a competitive option. However, the malicious maneuvers by the non-cooperative target and the coupling of the camera make the task challenging. To solve this problem, an Image-Based Visual Servoing (IBVS) control algorithm based on proportional navigation guidance with field-of-view holding capability is designed. The proposed controller reduces the miss distance while improving the stability of the visual servo system during interception. Software-in-the-loop (SITL) simulation experiments show a 72.8% reduction in the circular error probability (CEP) compared to the most recent study. This improvement enhances interception accuracy from the decimeter to the centimeter level. Real-world experiments further validate the effectiveness of the proposed algorithm.
comment: 9 pages, 15 figures, In the process of being submitted to the Journal of IEEE Transactions on Industrial Electronics
Traverse the Non-Traversable: Estimating Traversability for Wheeled Mobility on Vertically Challenging Terrain
Most traversability estimation techniques divide off-road terrain into traversable (e.g., pavement, gravel, and grass) and non-traversable (e.g., boulders, vegetation, and ditches) regions and then inform subsequent planners to produce trajectories on the traversable part. However, recent research demonstrated that wheeled robots can traverse vertically challenging terrain (e.g., extremely rugged boulders comparable in size to the vehicles themselves), which unfortunately would be deemed as non-traversable by existing techniques. Motivated by such limitations, this work aims at identifying the traversable from the seemingly non-traversable, vertically challenging terrain based on past kinodynamic vehicle-terrain interactions in a data-driven manner. Our new Traverse the Non-Traversable (TNT) traversability estimator can efficiently guide a down-stream sampling-based planner containing a high-precision 6-DoF kinodynamic model, which becomes deployable onboard a small-scale vehicle. Additionally, the estimated traversability can also be used as a costmap to plan global and local paths without sampling. Our experiment results show that TNT can improve planning performance, efficiency, and stability by 50%, 26.7%, and 9.2% respectively on a physical robot platform.
comment: for associated video file, see https://www.youtube.com/watch?v=Shcalb8sGcA
Tactile Probabilistic Contact Dynamics Estimation of Unknown Objects
We study the problem of rapidly identifying contact dynamics of unknown objects in partially known environments. The key innovation of our method is a novel formulation of the contact dynamics estimation problem as the joint estimation of contact geometries and physical parameters. We leverage DeepSDF, a compact and expressive neural-network-based geometry representation over a distribution of geometries, and adopt a particle filter to estimate both the geometries in contact and the physical parameters. In addition, we couple the estimator with an active exploration strategy that plans information-gathering moves to further expedite online estimation. Through simulation and physical experiments, we show that our method estimates accurate contact dynamics with fewer than 30 exploration moves for unknown objects touching partially known environments.
Verti-Selector: Automatic Curriculum Learning for Wheeled Mobility on Vertically Challenging Terrain
Reinforcement Learning (RL) has the potential to enable extreme off-road mobility by circumventing complex kinodynamic modeling, planning, and control by simulated end-to-end trial-and-error learning experiences. However, most RL methods are sample-inefficient when training in a large amount of manually designed simulation environments and struggle at generalizing to the real world. To address these issues, we introduce Verti-Selector (VS), an automatic curriculum learning framework designed to enhance learning efficiency and generalization by selectively sampling training terrain. VS prioritizes vertically challenging terrain with higher Temporal Difference (TD) errors when revisited, thereby allowing robots to learn at the edge of their evolving capabilities. By dynamically adjusting the sampling focus, VS significantly boosts sample efficiency and generalization within the VW-Chrono simulator built on the Chrono multi-physics engine. Furthermore, we provide simulation and physical results using VS on a Verti-4-Wheeler platform. These results demonstrate that VS can achieve 23.08% improvement in terms of success rate by efficiently sampling during training and robustly generalizing to the real world.
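The prioritization idea in the abstract above (sampling terrain with higher TD errors more often) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class name `TerrainSelector`, the use of the mean absolute TD error as a priority, the small constant `eps`, and sampling proportional to priority are all hypothetical choices.

```python
import random


class TerrainSelector:
    """Hypothetical curriculum sampler: terrains with larger recent
    TD errors are revisited more often."""

    def __init__(self, num_terrains, init_priority=1.0, eps=1e-3):
        # Every terrain starts with the same priority so all get visited.
        self.priorities = [init_priority] * num_terrains
        self.eps = eps  # keeps every terrain sampleable

    def update(self, terrain_id, td_errors):
        """Set the priority to the mean absolute TD error of the last visit."""
        mean_abs = sum(abs(e) for e in td_errors) / len(td_errors)
        self.priorities[terrain_id] = mean_abs + self.eps

    def probabilities(self):
        """Sampling distribution proportional to the stored priorities."""
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, rng=random):
        """Draw the index of the next training terrain."""
        ids = range(len(self.priorities))
        return rng.choices(ids, weights=self.priorities, k=1)[0]


if __name__ == "__main__":
    selector = TerrainSelector(num_terrains=3)
    selector.update(0, [0.05, -0.02])  # nearly mastered terrain
    selector.update(1, [1.5, -2.0])    # terrain at the edge of capability
    selector.update(2, [0.4, 0.3])
    print(selector.probabilities())
    print(selector.sample(random.Random(0)))
```

Sampling proportional to priority is one simple option; rank-based or temperature-scaled schemes are common alternatives in prioritized sampling.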
Cat-and-Mouse Satellite Dynamics: Divergent Adversarial Reinforcement Learning for Contested Multi-Agent Space Operations
As space becomes increasingly crowded and contested, robust autonomous capabilities for multi-agent environments are gaining critical importance. Current autonomous systems in space primarily rely on optimization-based path planning or long-range orbital maneuvers, which have not yet proven effective in adversarial scenarios where one satellite is actively pursuing another. We introduce Divergent Adversarial Reinforcement Learning (DARL), a two-stage Multi-Agent Reinforcement Learning (MARL) approach designed to train autonomous evasion strategies for satellites engaged with multiple adversarial spacecraft. Our method enhances exploration during training by promoting diverse adversarial strategies, leading to more robust and adaptable evader models. We validate DARL through a cat-and-mouse satellite scenario, modeled as a partially observable multi-agent capture-the-flag game where two adversarial 'cat' spacecraft pursue a single 'mouse' evader. DARL's performance is compared against several benchmarks, including an optimization-based satellite path planner, demonstrating its ability to produce highly robust models for adversarial multi-agent space environments.
Active Vision Might Be All You Need: Exploring Active Vision in Bimanual Robotic Manipulation
Imitation learning has demonstrated significant potential in performing high-precision manipulation tasks using visual feedback from cameras. However, it is common practice in imitation learning for cameras to be fixed in place, resulting in issues like occlusion and limited field of view. Furthermore, cameras are often placed in broad, general locations, without an effective viewpoint specific to the robot's task. In this work, we investigate the utility of active vision (AV) for imitation learning and manipulation, in which, in addition to the manipulation policy, the robot learns an AV policy from human demonstrations to dynamically change the robot's camera viewpoint to obtain better information about its environment and the given task. We introduce AV-ALOHA, a new bimanual teleoperation robot system with AV, an extension of the ALOHA 2 robot system, incorporating an additional 7-DoF robot arm that only carries a stereo camera and is solely tasked with finding the best viewpoint. This camera streams stereo video to an operator wearing a virtual reality (VR) headset, allowing the operator to control the camera pose using head and body movements. The system provides an immersive teleoperation experience, with bimanual first-person control, enabling the operator to dynamically explore and search the scene and simultaneously interact with the environment. We conduct imitation learning experiments of our system both in real-world and in simulation, across a variety of tasks that emphasize viewpoint planning. Our results demonstrate the effectiveness of human-guided AV for imitation learning, showing significant improvements over fixed cameras in tasks with limited visibility. Project website: https://soltanilara.github.io/av-aloha/
comment: 6 pages, 4 figures
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models ICRA 2025
We propose the use of latent space generative world models to address the covariate shift problem in autonomous driving. A world model is a neural network capable of predicting an agent's next state given past states and actions. By leveraging a world model during training, the driving policy effectively mitigates covariate shift without requiring an excessive amount of training data. During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations, so that at runtime it can recover from perturbations outside the training distribution. Additionally, we introduce a novel transformer-based perception encoder that employs multi-view cross-attention and a learned scene query. We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing in the CARLA simulator, as well as showing the ability to handle perturbations in both CARLA and NVIDIA's DRIVE Sim.
comment: 7 pages, 6 figures, for ICRA 2025 conference, for associated video file, see https://youtu.be/fO7RZ57gVxk
A Learning Framework for Diverse Legged Robot Locomotion Using Barrier-Based Style Rewards
This work introduces a model-free reinforcement learning framework that enables various modes of motion (quadruped, tripod, or biped) and diverse tasks for legged robot locomotion. We employ a motion-style reward based on a relaxed logarithmic barrier function as a soft constraint, to bias the learning process toward the desired motion style, such as gait, foot clearance, joint position, or body height. The predefined gait cycle is encoded in a flexible manner, facilitating gait adjustments throughout the learning process. Extensive experiments demonstrate that KAIST HOUND, a 45 kg robotic system, can achieve biped, tripod, and quadruped locomotion using the proposed framework; quadrupedal capabilities include traversing uneven terrain, galloping at 4.67 m/s, and overcoming obstacles up to 58 cm (67 cm for HOUND2); bipedal capabilities include running at 3.6 m/s, carrying a 7.5 kg object, and ascending stairs, all performed without exteroceptive input.
comment: 7 pages, 5 figures, Videos at https://youtu.be/JV2_HfTlOKI
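A relaxed logarithmic barrier, as mentioned in the abstract above, can be sketched in a few lines. The barrier below is the commonly used form that follows -ln(z) for z > delta and switches to a quadratic extension for z <= delta, so it stays finite when a constraint is violated. How the paper turns the barrier into a reward is not stated in the abstract; the `style_reward` function, its two-sided bounds, and the default `delta` are assumptions for illustration.

```python
import math


def relaxed_log_barrier(z, delta):
    """Relaxed log barrier: -ln(z) for z > delta, and a quadratic extension
    (matching value and slope at z = delta) for z <= delta."""
    if z > delta:
        return -math.log(z)
    return 0.5 * (((z - 2.0 * delta) / delta) ** 2 - 1.0) - math.log(delta)


def style_reward(value, lower, upper, delta=0.1):
    """Assumed soft-constraint reward keeping `value` inside [lower, upper].
    The reward is the negative barrier on both constraint margins."""
    return -(relaxed_log_barrier(value - lower, delta)
             + relaxed_log_barrier(upper - value, delta))


if __name__ == "__main__":
    # Example: a normalised body-height signal that should stay in [0, 1].
    for height in (0.5, 0.95, 1.2):
        print(height, style_reward(height, lower=0.0, upper=1.0))
```

Because the quadratic branch is finite for negative margins, a violated constraint yields a large but bounded penalty with a usable gradient, which is what makes the barrier act as a soft rather than hard constraint.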
Exploring Event-based Human Pose Estimation with 3D Event Representations
Human pose estimation is a fundamental and appealing task in computer vision. Although traditional cameras are commonly applied, their reliability decreases in scenarios under high dynamic range or heavy motion blur, where event cameras offer a robust solution. Predominant event-based methods accumulate events into frames, ignoring the asynchronous and high temporal resolution that is crucial for distinguishing distinct actions. To address this issue and to unlock the 3D potential of event information, we introduce two 3D event representations: the Rasterized Event Point Cloud (RasEPC) and the Decoupled Event Voxel (DEV). The RasEPC aggregates events within concise temporal slices at identical positions, preserving their 3D attributes along with statistical information, thereby significantly reducing memory and computational demands. Meanwhile, the DEV representation discretizes events into voxels and projects them across three orthogonal planes, utilizing decoupled event attention to retrieve 3D cues from the 2D planes. Furthermore, we develop and release EV-3DPW, a synthetic event-based dataset crafted to facilitate training and quantitative analysis in outdoor scenes. Our methods are tested on the DHP19 public dataset, MMHPSD dataset, and our EV-3DPW dataset, with further qualitative validation via a derived driving scene dataset EV-JAAD and an outdoor collection vehicle. Our code and dataset have been made publicly available at https://github.com/MasterHow/EventPointPose.
comment: Accepted to Computer Vision and Image Understanding (CVIU). Extended version of arXiv:2206.04511. The code and dataset are available at https://github.com/MasterHow/EventPointPose
Valeo4Cast: A Modular Approach to End-to-End Forecasting ECCV
Motion forecasting is crucial in autonomous driving systems to anticipate the future trajectories of surrounding agents such as pedestrians, vehicles, and traffic signals. In end-to-end forecasting, the model must jointly detect and track from sensor data (cameras or LiDARs) the past trajectories of the different elements of the scene and predict their future locations. We depart from the current trend of tackling this task via end-to-end training from perception to forecasting, and instead use a modular approach. We individually build and train detection, tracking and forecasting modules. We then only use consecutive finetuning steps to integrate the modules better and alleviate compounding errors. We conduct an in-depth study on the finetuning strategies, which reveals that our simple yet effective approach significantly improves performance on the end-to-end forecasting benchmark. Consequently, our solution ranks first in the Argoverse 2 End-to-end Forecasting Challenge, with 63.82 mAPf. We surpass forecasting results by +17.1 points over last year's winner and by +13.3 points over this year's runner-up. This remarkable performance in forecasting can be explained by our modular paradigm, which integrates finetuning strategies and significantly outperforms the end-to-end-trained counterparts. The code, model weights and results are made available at https://github.com/valeoai/valeo4cast.
comment: Winning solution of the Argoverse 2 "Unified Detection, Tracking, and Forecasting" challenge; work accepted at Road++ ECCVW 2024
TypeFly: Flying Drones with Large Language Model
Recent advancements in robot control using large language models (LLMs) have demonstrated significant potential, primarily due to LLMs' capabilities to understand natural language commands and generate executable plans in various languages. However, in real-time and interactive applications involving mobile robots, particularly drones, the sequential token generation process inherent to LLMs introduces substantial latency, i.e. response time, in control plan generation. In this paper, we present a system called ChatFly that tackles this problem using a combination of a novel programming language called MiniSpec and its runtime to reduce the plan generation time and drone response time. That is, instead of asking an LLM to write a program (robotic plan) in the popular but verbose Python, ChatFly has it write the program in MiniSpec, which is specially designed for token efficiency and stream interpretation. Using a set of challenging drone tasks, we show that design choices made by ChatFly can reduce response time by up to 62% and provide a more consistent user experience, enabling responsive and intelligent LLM-based drone control with efficient completion.
LingoQA: Visual Question Answering for Autonomous Driving ECCV 2024
We introduce LingoQA, a novel dataset and benchmark for visual question answering in autonomous driving. The dataset contains 28K unique short video scenarios, and 419K annotations. Evaluating state-of-the-art vision-language models on our benchmark shows that their performance is below human capabilities, with GPT-4V responding truthfully to 59.6% of the questions compared to 96.6% for humans. For evaluation, we propose a truthfulness classifier, called Lingo-Judge, that achieves a 0.95 Spearman correlation coefficient to human evaluations, surpassing existing techniques like METEOR, BLEU, CIDEr, and GPT-4. We establish a baseline vision-language model and run extensive ablation studies to understand its performance. We release our dataset and benchmark as an evaluation platform for vision-language models in autonomous driving.
comment: Accepted to ECCV 2024. Benchmark and dataset are available at https://github.com/wayveai/LingoQA/
An Active Perception Game for Robust Information Gathering
Active perception approaches select future viewpoints by using some estimate of the information gain. An inaccurate estimate can be detrimental in critical situations, e.g., locating a person in distress. However, the true information gained can only be calculated post hoc, i.e., after the observation is realized. We present an approach for estimating the discrepancy between the information gain (which is the average over putative future observations) and the true information gain. The key idea is to analyze the mathematical relationship between active perception and the estimation error of the information gain in a game-theoretic setting. Using this, we develop an online estimation approach that achieves sub-linear regret (in the number of time-steps) for the estimation of the true information gain and reduces the sub-optimality of active perception systems. We demonstrate our approach for active perception using a comprehensive set of experiments on: (a) different types of environments, including a quadrotor in a photorealistic simulation, real-world robotic data, and real-world experiments with ground robots exploring indoor and outdoor scenes; (b) different types of robotic perception data; and (c) different map representations. On average, our approach reduces information gain estimation errors by 42%, increases the information gain by 7%, PSNR by 5%, and semantic accuracy (measured as the number of objects that are localized correctly) by 6%. In real-world experiments with a Jackal ground robot, our approach demonstrated complex trajectories to explore occluded regions.
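The distinction drawn in the abstract above, between the information gain averaged over putative future observations and the gain actually realized after observing, can be made concrete with a toy example. The setup below is an assumption chosen for illustration and is not the paper's model: a single binary occupancy cell, a sensor with hypothetical detection probability `p_hit` and false-alarm probability `p_false`, and entropy reduction in bits as the measure of information gain.

```python
import math


def entropy(p):
    """Binary entropy in bits of a Bernoulli(p) belief."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def posterior(prior, z, p_hit, p_false):
    """Bayes update of P(occupied) after reading z (True = 'occupied').
    Returns the posterior and the probability of observing z."""
    like_occ = p_hit if z else 1.0 - p_hit
    like_free = p_false if z else 1.0 - p_false
    evidence = like_occ * prior + like_free * (1.0 - prior)
    return like_occ * prior / evidence, evidence


def expected_information_gain(prior, p_hit, p_false):
    """Entropy reduction averaged over both possible observations."""
    gain = entropy(prior)
    for z in (True, False):
        post, evidence = posterior(prior, z, p_hit, p_false)
        gain -= evidence * entropy(post)
    return gain


def realized_information_gain(prior, z, p_hit, p_false):
    """Entropy reduction for the observation that actually occurred."""
    post, _ = posterior(prior, z, p_hit, p_false)
    return entropy(prior) - entropy(post)


if __name__ == "__main__":
    prior, p_hit, p_false = 0.2, 0.9, 0.1
    print("expected:", expected_information_gain(prior, p_hit, p_false))
    for z in (True, False):
        print("realized for z =", z, ":",
              realized_information_gain(prior, z, p_hit, p_false))
```

For a given observation the realized value can lie above or below the expected value (and can even be negative); the gap between the two is the kind of discrepancy the abstract refers to.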
OmniColor: A Global Camera Pose Optimization Approach of LiDAR-360Camera Fusion for Colorizing Point Clouds ICRA
A colored point cloud, as a simple and efficient 3D representation, has many advantages in various fields, including robotic navigation and scene reconstruction. This representation is now commonly used in 3D reconstruction tasks relying on cameras and LiDARs. However, fusing data from these two types of sensors is poorly performed in many existing frameworks, leading to unsatisfactory mapping results, mainly due to inaccurate camera poses. This paper presents OmniColor, a novel and efficient algorithm to colorize point clouds using an independent 360-degree camera. Given a LiDAR-based point cloud and a sequence of panorama images with initial coarse camera poses, our objective is to jointly optimize the poses of all frames for mapping images onto geometric reconstructions. Our pipeline works in an off-the-shelf manner that does not require any feature extraction or matching process. Instead, we find optimal poses by directly maximizing the photometric consistency of LiDAR maps. In experiments, we show that our method can overcome the severe visual distortion of omnidirectional images and greatly benefit from the wide field of view (FOV) of 360-degree cameras to reconstruct various scenarios with accuracy and stability. The code will be released at https://github.com/liubonan123/OmniColor/.
comment: 2024 IEEE International Conference on Robotics and Automation (ICRA)
Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation
Given the high cost of collecting robotic data in the real world, sample efficiency is a consistently compelling pursuit in robotics. In this paper, we introduce SGRv2, an imitation learning framework that enhances sample efficiency through improved visual and action representations. Central to the design of SGRv2 is the incorporation of a critical inductive bias, action locality, which posits that a robot's actions are predominantly influenced by the target object and its interactions with the local environment. Extensive experiments in both simulated and real-world settings demonstrate that action locality is essential for boosting sample efficiency. SGRv2 excels in RLBench tasks with keyframe control using merely 5 demonstrations and surpasses the RVT baseline in 23 of 26 tasks. Furthermore, when evaluated on ManiSkill2 and MimicGen using dense control, SGRv2's success rate is 2.54 times that of SGR. In real-world environments, with only eight demonstrations, SGRv2 can perform a variety of tasks at a markedly higher success rate compared to baseline models. Project website: http://sgrv2-robot.github.io
comment: CoRL 2024. Project website: http://sgrv2-robot.github.io
Gaussian-LIC: Real-Time Photo-Realistic SLAM with Gaussian Splatting and LiDAR-Inertial-Camera Fusion
In this paper, we present a real-time photo-realistic SLAM method based on marrying Gaussian Splatting with LiDAR-Inertial-Camera SLAM. Most existing radiance-field-based SLAM systems mainly focus on bounded indoor environments, equipped with RGB-D or RGB sensors. However, they are prone to decline when expanding to unbounded scenes or encountering adverse conditions, such as violent motions and changing illumination. In contrast, oriented to general scenarios, our approach additionally tightly fuses LiDAR, IMU, and camera for robust pose estimation and photo-realistic online mapping. To compensate for regions unobserved by the LiDAR, we propose to integrate both the triangulated visual points from images and LiDAR points for initializing 3D Gaussians. In addition, the modeling of the sky and varying camera exposure have been realized for high-quality rendering. Notably, we implement our system purely with C++ and CUDA, and meticulously design a series of strategies to accelerate the online optimization of the Gaussian-based scene representation. Extensive experiments demonstrate that our method outperforms its counterparts while maintaining real-time capability. Impressively, regarding photo-realistic mapping, our method with our estimated poses even surpasses all the compared approaches that utilize privileged ground-truth poses for mapping. Our code will be released on project page https://xingxingzuo.github.io/gaussian_lic.
AnoVox: A Benchmark for Multimodal Anomaly Detection in Autonomous Driving ECCV 2024
The scale-up of autonomous vehicles depends heavily on their ability to deal with anomalies, such as rare objects on the road. In order to handle such situations, it is necessary to detect anomalies in the first place. Anomaly detection for autonomous driving has made great progress in the past years but suffers from poorly designed benchmarks with a strong focus on camera data. In this work, we propose AnoVox, the largest benchmark for ANOmaly detection in autonomous driving to date. AnoVox incorporates large-scale multimodal sensor data and spatial VOXel ground truth, allowing for the comparison of methods independent of their used sensor. We propose a formal definition of normality and provide a compliant training dataset. AnoVox is the first benchmark to contain both content and temporal anomalies.
comment: Daniel Bogdoll, Iramm Hamdard, and Lukas Namgyu Rößler contributed equally. Accepted for publication at ECCV 2024 W-CODA workshop
Humanoid Parkour Learning
Parkour is a grand challenge for legged locomotion, even for quadruped robots, requiring active perception and various maneuvers to overcome multiple challenging obstacles. Existing methods for humanoid locomotion either optimize a trajectory for a single parkour track or train a reinforcement learning policy only to walk with a significant amount of motion references. In this work, we propose a framework for learning an end-to-end vision-based whole-body-control parkour policy for humanoid robots that overcomes multiple parkour skills without any motion prior. Using the parkour policy, the humanoid robot can jump on a 0.42m platform, leap over hurdles, 0.8m gaps, and much more. It can also run at 1.8m/s in the wild and walk robustly on different terrains. We test our policy in indoor and outdoor environments to demonstrate that it can autonomously select parkour skills while following the rotation command of the joystick. We override the arm actions and show that this framework can easily transfer to humanoid mobile manipulation tasks. Videos can be found at https://humanoid4parkour.github.io
comment: Published at CoRL 2024
General-purpose Clothes Manipulation with Semantic Keypoints
Clothes manipulation is a critical skill for household robots. Recent advancements have been made in task-specific clothes manipulation, such as folding, flattening, and hanging. However, due to clothes' complex geometries and deformability, creating a general-purpose robot system that can manipulate a diverse range of clothes in many ways remains challenging. Since clothes are typically designed with specific structures, we propose identifying these specific features like "left sleeve" as semantic keypoints. Semantic keypoints can provide semantic cues for task planning and geometric cues for low-level action generation. With this insight, we develop a hierarchical learning framework using the large language model (LLM) for general-purpose CLothes mAnipulation with Semantic keyPoints (CLASP). Extensive simulation experiments show that CLASP outperforms baseline methods on both seen and unseen tasks across various clothes manipulation tasks. Real-world experiments show that CLASP can be directly deployed in the real world and applied to a wide variety of clothes.
Recursive Distillation for Open-Set Distributed Robot Localization
A typical assumption in state-of-the-art self-localization models is that an annotated training dataset is available for the target workspace. However, this is not necessarily true when a robot travels around the general open world. This work introduces a novel training scheme for open-world distributed robot systems. In our scheme, a robot ("student") can ask the other robots it meets at unfamiliar places ("teachers") for guidance. Specifically, a pseudo-training dataset is reconstructed from the teacher model and then used for continual learning of the student model under domain, class, and vocabulary incremental setup. Unlike typical knowledge transfer schemes, our scheme introduces only minimal assumptions on the teacher model, so that it can handle various types of open-set teachers, including those uncooperative, untrainable (e.g., image retrieval engines), or black-box teachers (i.e., data privacy). In this paper, we investigate a ranking function as an instance of such generic models, using a challenging data-free recursive distillation scenario, where a student once trained can recursively join the next-generation open teacher set.
comment: 5 pages, 4 figures, technical report
SliceIt! -- A Dual Simulator Framework for Learning Robot Food Slicing ICRA 2024
Cooking robots can enhance the home experience by reducing the burden of daily chores. However, these robots must perform their tasks dexterously and safely in shared human environments, especially when handling dangerous tools such as kitchen knives. This study focuses on enabling a robot to autonomously and safely learn food-cutting tasks. More specifically, our goal is to enable a collaborative robot or industrial robot arm to perform food-slicing tasks by adapting to varying material properties using compliance control. Our approach involves using Reinforcement Learning (RL) to train a robot to compliantly manipulate a knife, by reducing the contact forces exerted by the food items and by the cutting board. However, training the robot in the real world can be inefficient and dangerous, and can result in a lot of food waste. Therefore, we propose SliceIt!, a framework for safely and efficiently learning robot food-slicing tasks in simulation. Following a real2sim2real approach, our framework consists of collecting a small amount of real food-slicing data, calibrating our dual simulation environment (a high-fidelity cutting simulator and a robotic simulator), learning compliant control policies on the calibrated simulation environment, and finally, deploying the policies on the real robot.
comment: Accepted to ICRA 2024
Learning Variable Compliance Control From a Few Demonstrations for Bimanual Robot with Haptic Feedback Teleoperation System IROS 2024
Automating dexterous, contact-rich manipulation tasks using rigid robots is a significant challenge in robotics. Rigid robots, defined by their actuation through position commands, face issues of excessive contact forces due to their inability to adapt to contact with the environment, potentially causing damage. While compliance control schemes have been introduced to mitigate these issues by controlling forces via external sensors, they are hampered by the need for fine-tuning task-specific controller parameters. Learning from Demonstrations (LfD) offers an intuitive alternative, allowing robots to learn manipulations through observed actions. In this work, we introduce a novel system to enhance the teaching of dexterous, contact-rich manipulations to rigid robots. Our system is twofold: first, it incorporates a teleoperation interface utilizing Virtual Reality (VR) controllers, designed to provide an intuitive and cost-effective method for task demonstration with haptic feedback. Second, we present Comp-ACT (Compliance Control via Action Chunking with Transformers), a method that learns variable compliance control from a few demonstrations. Our methods have been validated across various complex contact-rich manipulation tasks using single-arm and bimanual robot setups in simulated and real-world environments, demonstrating the effectiveness of our system in teaching robots dexterous manipulations with enhanced adaptability and safety. Code available at: https://github.com/omron-sinicx/CompACT
comment: Accepted to IROS 2024
Plant Robots: Harnessing Growth Actuation of Plants for Locomotion and Object Manipulation
Plants display physical displacements during their growth due to photosynthesis, which converts light into chemical energy. This can be interpreted as plants acting as actuators with a built-in power source. This paper presents a method to create plant robots that move and perform tasks by harnessing the actuation output of plants: displacement and force generated from the growing process. As the target plant, radish sprouts are employed, and their displacement and force are characterized, followed by the calculation of power and energy densities. Based on the characterization, two different plant robots are designed and fabricated: a rotational robot and a gripper. The former demonstrates ground locomotion, achieving a travel distance of 14.6 mm with an average speed of 0.8 mm/h. The latter demonstrates the picking and placing of an object with a 0.1-g mass by the light-controlled open-close motion of plant fingers. A good agreement between the experimental and model values is observed in the specific data of the mobile robot, suggesting that obtaining the actuation characteristics of plants can enable the design and prediction of behavior in plant robots. These results pave the way for the realization of novel types of environmentally friendly and sustainable robots.
comment: 16 pages, 4 figures
Systems and Control (CS)
A Sim-to-Real Vision-based Lane Keeping System for a 1:10-scale Autonomous Vehicle
In recent years, several competitions have highlighted the need to investigate vision-based solutions to address scenarios with functional insufficiencies in perception, world modeling and localization. This article presents the Vision-based Lane Keeping System (VbLKS) developed by the DEI-Unipd Team within the context of the Bosch Future Mobility Challenge 2022. The main contribution lies in a Simulation-to-Reality (Sim2Real) GPS-denied VbLKS for a 1:10-scale autonomous vehicle. In this VbLKS, the input to a tailored Pure Pursuit (PP) based control strategy, namely the Lookahead Heading Error (LHE), is estimated at a constant lookahead distance employing a Convolutional Neural Network (CNN). A training strategy for a compact CNN is proposed, emphasizing data generation and augmentation on simulated camera images from a 3D Gazebo simulator, and enabling real-time operation on low-level hardware. A tailored PP-based lateral controller equipped with a derivative action and a PP-based velocity reference generation are implemented. Tuning ranges are established through a systematic time-delay stability analysis. Validation in a representative controlled laboratory setting is provided.
comment: 16 pages, 23 figures
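As a rough illustration of the control idea in the abstract above, the following is a minimal sketch of the textbook Pure Pursuit steering law for a kinematic bicycle model. It is not the paper's tailored controller: the derivative action and velocity reference generation are omitted, and treating the Lookahead Heading Error as the angle between the vehicle heading and the line to the lookahead point is an assumption. The function name and the numeric values are hypothetical.

```python
import math


def pure_pursuit_steering(lookahead_heading_error, wheelbase, lookahead_distance):
    """Textbook Pure Pursuit steering angle (rad) for a kinematic bicycle model.

    lookahead_heading_error: angle (rad) between the vehicle heading and the
        line to the lookahead point (assumed here to play the role of the LHE).
    wheelbase: distance between front and rear axles (m).
    lookahead_distance: constant distance to the lookahead point (m).
    """
    if wheelbase <= 0 or lookahead_distance <= 0:
        raise ValueError("wheelbase and lookahead_distance must be positive")
    # Curvature of the circular arc through the rear axle and the lookahead point.
    curvature = 2.0 * math.sin(lookahead_heading_error) / lookahead_distance
    # Bicycle-model steering angle that produces this curvature.
    return math.atan(wheelbase * curvature)


# Hypothetical values for a small-scale car: 0.26 m wheelbase, 0.5 m lookahead.
steering = pure_pursuit_steering(0.1, wheelbase=0.26, lookahead_distance=0.5)
```

In a pipeline like the one described, the heading error would come from the CNN estimate rather than from a known path; the steering law itself stays the same under that assumption.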
End-to-end guarantees for indirect data-driven control of bilinear systems with finite stochastic data
In this paper we propose an end-to-end algorithm for indirect data-driven control for bilinear systems with stability guarantees. We consider the case where the collected i.i.d. data is affected by probabilistic noise with possibly unbounded support and leverage tools from statistical learning theory to derive finite sample identification error bounds. To this end, we solve the bilinear identification problem by solving a set of linear and affine identification problems, by a particular choice of a control input during the data collection phase. We provide a priori as well as data-dependent finite sample identification error bounds on the individual matrices as well as ellipsoidal bounds, both of which are structurally suitable for control. Further, we integrate the structure of the derived identification error bounds in a robust controller design to obtain an exponentially stable closed-loop. By means of an extensive numerical study we showcase the interplay between the controller design and the derived identification error bounds. Moreover, we note appealing connections of our results to indirect data-driven control of general nonlinear systems through Koopman operator theory and discuss how our results may be applied in this setup.
Control Industrial Automation System with Large Language Models
Traditional industrial automation systems require specialized expertise to operate and complex reprogramming to adapt to new processes. Large language models offer the intelligence to make them more flexible and easier to use. However, LLMs' application in industrial settings is underexplored. This paper introduces a framework for integrating LLMs to achieve end-to-end control of industrial automation systems. At the core of the framework are an agent system designed for industrial tasks, a structured prompting method, and an event-driven information modeling mechanism that provides real-time data for LLM inference. The framework supplies LLMs with real-time events on different context semantic levels, allowing them to interpret the information, generate production plans, and control operations on the automation system. It also supports structured dataset creation for fine-tuning on this downstream application of LLMs. Our contribution includes a formal system design, proof-of-concept implementation, and a method for generating task-specific datasets for LLM fine-tuning and testing. This approach enables a more adaptive automation system that can respond to spontaneous events, while allowing easier operation and configuration through natural language for more intuitive human-machine interaction. We provide demo videos and detailed data on GitHub: https://github.com/YuchenXia/LLM4IAS
Distributed Invariant Unscented Kalman Filter based on Inverse Covariance Intersection with Intermittent Measurements
This paper studies the problem of distributed state estimation (DSE) over sensor networks on matrix Lie groups, which is crucial for applications where system states evolve on Lie groups rather than vector spaces. We propose a diffusion-based distributed invariant Unscented Kalman Filter using the inverse covariance intersection (DIUKF-ICI) method to address target tracking in 3D environments. Unlike existing distributed UKFs confined to vector spaces, our approach extends the distributed UKF framework to Lie groups, enabling local estimates to be fused with intermediate information from neighboring agents on Lie groups. To handle the unknown correlations across local estimates, we extend the ICI fusion strategy to matrix Lie groups for the first time and integrate it into the diffusion algorithm. We demonstrate that the estimation error of the proposed method is bounded. Additionally, the algorithm is fully distributed, robust against intermittent measurements, and adaptable to time-varying communication topologies. The effectiveness of the proposed method is validated through extensive Monte-Carlo simulations.
Deblur e-NeRF: NeRF from Motion-Blurred Events under High-speed or Low-light Conditions ECCV 2024
The stark contrast in the design philosophy of an event camera makes it particularly ideal for operating under high-speed, high dynamic range and low-light conditions, where standard cameras underperform. Nonetheless, event cameras still suffer from some amount of motion blur, especially under these challenging conditions, contrary to what most think. This is attributed to the limited bandwidth of the event sensor pixel, which is mostly proportional to the light intensity. Thus, to ensure that event cameras can truly excel in such conditions where they have an edge over standard cameras, it is crucial to account for event motion blur in downstream applications, especially reconstruction. However, none of the recent works on reconstructing Neural Radiance Fields (NeRFs) from events, nor event simulators, have considered the full effects of event motion blur. To this end, we propose Deblur e-NeRF, a novel method to directly and effectively reconstruct blur-minimal NeRFs from motion-blurred events generated under high-speed motion or low-light conditions. The core component of this work is a physically-accurate pixel bandwidth model proposed to account for event motion blur under arbitrary speed and lighting conditions. We also introduce a novel threshold-normalized total variation loss to improve the regularization of large textureless patches. Experiments on real and novel realistically simulated sequences verify the effectiveness of our method. Our code, event simulator and synthetic event dataset will be open-sourced.
comment: Accepted to ECCV 2024. Project website is accessible at https://wengflow.github.io/deblur-e-nerf. arXiv admin note: text overlap with arXiv:2006.07722 by other authors
Intelligent Energy Management: Remaining Useful Life Prediction and Charging Automation System Comprised of Deep Learning and the Internet of Things
The Remaining Useful Life (RUL) of a battery is an important parameter for estimating how much life the battery has left and when it needs recharging. The goal of this research project is to develop machine learning-based models for the battery RUL dataset. Different ML models are developed to classify the RUL of the vehicle, and the IoT (Internet of Things) concept is simulated for automating the charging system and handling any faults that arise. The plotted graphs depict the relationship between various vehicle parameters using the Blynk IoT platform. Results show that the CatBoost, Multi-Layer Perceptron (MLP), Gated Recurrent Unit (GRU), and hybrid models developed could classify RUL into three classes with an accuracy of about 99%. The data is fed in through a tkinter GUI to simulate artificial intelligence (AI)-based charging, and with a pyserial backend, data can be sent to the ESP32 microcontroller to charge or discharge according to the model's predictions. Also, with an IoT system, the charging can be disconnected, monitored, and analyzed for automation. The results show that an accuracy of 99% can be obtained with the MLP and CatBoost models and similar accuracy with the GRU model, and that relay-based triggering can be driven by the model's predictions to automate the charging and energy-saving mechanism. By showcasing an exemplary Blynk platform-based monitoring and automation setup, we further present innovative ways of monitoring parameters and automating the system.
Observer-Based Discontinuous Communication in the Secondary Control of AC Microgrids
This paper proposes an observer-based event-driven approach to decrease the overuse of communication networks. The suggested approach aims to estimate the data that units need to share while reducing communication as much as possible. In other words, the proposed approach effectively determines which state variables should be shared (observer concept) among the units during specific time intervals (event-triggered concept). This strategy significantly reduces the overall communication load. It is shown that the estimation error remains bounded and that Zeno behavior, characterized by an infinite number of transmissions occurring within a finite time interval, does not occur. The proposed methodology can be systematically applied to any communication-based secondary controller in alternating current (AC) microgrids. Simulation results demonstrate a high degree of precision in estimating the states under the proposed approach. Also, the secondary controller performance under the proposed method is evaluated in the MATLAB/Simulink environment.
comment: 2024 IEEE PES Innovative Smart Grid Technologies Europe (ISGT Europe)
PhantomLiDAR: Cross-modality Signal Injection Attacks against LiDAR
LiDAR (Light Detection and Ranging) is a pivotal sensor for autonomous driving, offering precise 3D spatial information. Previous signal attacks against LiDAR systems mainly exploit laser signals. In this paper, we investigate the possibility of cross-modality signal injection attacks, i.e., injecting intentional electromagnetic interference (IEMI) to manipulate LiDAR output. Our insight is that the internal modules of a LiDAR, i.e., the laser receiving circuit, the monitoring sensors, and the beam-steering modules, even with strict electromagnetic compatibility (EMC) testing, can still couple with the IEMI attack signals and result in the malfunction of LiDAR systems. Based on the above attack surfaces, we propose the PhantomLiDAR attack, which manipulates LiDAR output in terms of Points Interference, Points Injection, Points Removal, and even LiDAR Power-Off. We evaluate and demonstrate the effectiveness of PhantomLiDAR with both simulated and real-world experiments on five COTS LiDAR systems. We also conduct feasibility experiments in real-world moving scenarios. We provide potential defense measures that can be implemented at both the sensor level and the vehicle system level to mitigate the risks associated with IEMI attacks. Video demonstrations can be viewed at https://sites.google.com/view/phantomlidar.
Model-Free versus Model-Based Reinforcement Learning for Fixed-Wing UAV Attitude Control Under Varying Wind Conditions
This paper evaluates and compares the performance of model-free and model-based reinforcement learning for the attitude control of fixed-wing unmanned aerial vehicles using PID as a reference point. The comparison focuses on their ability to handle varying flight dynamics and wind disturbances in a simulated environment. Our results show that the Temporal Difference Model Predictive Control agent outperforms both the PID controller and other model-free reinforcement learning methods in terms of tracking accuracy and robustness over different reference difficulties, particularly in nonlinear flight regimes. Furthermore, we introduce actuation fluctuation as a key metric to assess energy efficiency and actuator wear, and we test two different approaches from the literature: action variation penalty and conditioning for action policy smoothness. We also evaluate all control methods when subject to stochastic turbulence and gusts separately, so as to measure their effects on tracking performance, observe their limitations and outline their implications on the Markov decision process formalism.
comment: Published at ICINCO 2024
Discontinuous Reception with Adjustable Inactivity Timer for IIoT
Discontinuous reception (DRX) is a key technology for reducing the energy consumption of industrial Internet of Things (IIoT) devices. Specifically, DRX allows the devices to operate in a low-power mode when no data reception is scheduled, and its effectiveness depends on the proper configuration of the DRX parameters. In this paper, we characterize the DRX process starting from a semi-Markov chain model. We detail two ways to set DRX parameters to minimize the device power consumption while meeting a mean delay constraint. The first method exhaustively searches for the optimal configuration. In contrast, the second method uses a low-complexity metaheuristic to find a sub-optimal configuration, thus considering ideal and practical DRX configurations. Notably, within the DRX parameters, the inactivity timer (IT) is a caution time that specifies how long a device remains active after the last information exchange. Traditionally, a device implementing DRX will restart the IT after each data reception as a precursor to entering a low-power mode. The usual approach is to restart the IT whenever new data is received during this caution period, which might sometimes needlessly extend the active time. Herein, we propose a more efficient method in which the transmitting base station (BS) explicitly indicates restarting the timer through the control channel only when appropriate. The decision is taken based on the BS's knowledge about its buffer status. We consider Poisson and bursty traffic models, which are typical in IIoT setups, and verify the suitability of our proposal for reducing the energy consumption of the devices without significantly compromising the communication latency through extensive numerical simulations. Specifically, energy-saving gains of up to 30% can be obtained regardless of the arrival rate and delay constraints.
comment: IEEE Transactions on Industrial Informatics (2024)
Scene Understanding in Pick-and-Place Tasks: Analyzing Transformations Between Initial and Final Scenes
With robots increasingly collaborating with humans in everyday tasks, it is important to take steps toward robotic systems capable of understanding the environment. This work focuses on scene understanding to detect pick-and-place tasks given initial and final images of the scene. To this end, a dataset is collected for object detection and pick-and-place task detection. A YOLOv5 network is subsequently trained to detect the objects in the initial and final scenes. Given the detected objects and their bounding boxes, two methods are proposed to detect the pick-and-place tasks that transform the initial scene into the final scene. The geometric method tracks objects' movements between the two scenes and relies on the intersection of the bounding boxes that moved between scenes. In contrast, the CNN-based method utilizes a Convolutional Neural Network to classify objects with intersecting bounding boxes into 5 classes, showing the spatial relationship between the involved objects. The performed pick-and-place tasks are then derived from analyzing the experiments with both scenes. Results show that the CNN-based method, using a VGG16 backbone, outscores the geometric method by roughly 12 percentage points in certain scenarios, with an overall success rate of 84.3%.
comment: Conference Paper, ICEE 2024, 7 pages, 5 figures
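The geometric method in the abstract above relies on intersections between detected bounding boxes. The abstract does not say how the intersection is scored, so the following is only a generic sketch of axis-aligned box intersection and intersection-over-union (IoU); the `(x_min, y_min, x_max, y_max)` box format and the function names are assumptions.

```python
def intersection_area(box_a, box_b):
    """Overlap area of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    width = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    height = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    # Negative width or height means the boxes do not overlap on that axis.
    return max(0.0, width) * max(0.0, height)


def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes; 0.0 if both are empty."""
    inter = intersection_area(box_a, box_b)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical detections of one object in the initial scene and another in the final scene.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A nonzero overlap between a moved object's final box and another object's box could then hint at a spatial relation such as "placed on" or "placed next to", although the exact rule used in the paper is not given in the abstract.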
On the Output Redundancy of LTI Systems: A Geometric Approach with Application to Privacy
This paper examines the properties of output-redundant systems, that is, systems possessing a larger number of outputs than inputs, through the lens of the geometric approach of Wonham et al. We begin by formulating a simple output allocation synthesis problem, which involves "concealing" input information from a malicious eavesdropper having access to the system output, while still allowing a legitimate user to reconstruct it. It is shown that the solvability of this problem requires the availability of a redundant set of outputs. This very problem is instrumental in unveiling the fundamental geometric properties of output-redundant systems, which form the basis for our subsequent constructions and results. As a direct application, we demonstrate how output allocation can be employed to effectively protect input information from certain output eavesdroppers with guaranteed results.
Semantic model for the description of energy data in the Module Type Package
Modular production systems that employ the Module Type Package (MTP) to describe module interfaces can, at present, only communicate energy data through proprietary solutions. Due to this limitation, users face additional effort when calculating energy KPIs for modules or determining the energy efficiency of modules. To address this issue, we present a model that allows energy data to be described semantically and uniformly in the MTP on the basis of an industrial standard (OPC 34100). MTPs incorporating this model can transmit semantically consistent energy data from modules to the process control system, making the data available for further applications, such as monitoring or optimization.
comment: 6 pages, 4 figures
Stereographic Projection of Probabilistic Frequency-Domain Uncertainty
This paper investigates the stereographic projection of points along the Nyquist plots of single-input single-output (SISO) linear time-invariant (LTI) systems subject to probabilistic uncertainty. At each frequency, there corresponds a complex-valued random variable with a given probability distribution in the complex plane. The chordal distance between the stereographic projections of this complex value and the corresponding value for a nominal model, as per the well-known Nu-Gap metric of Vinnicombe, is also a random quantity. The main result provides the cumulative distribution function (CDF) of the chordal distance at a given frequency. Such a stochastic distance framework opens up a fresh and fertile research direction in probabilistic robust control theory.
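To make the quantity in the abstract above concrete, here is a minimal sketch of the standard chordal distance between two finite points of the complex plane after stereographic projection onto the Riemann sphere, together with a Monte Carlo estimate of its CDF. The paper derives the CDF analytically; the sampling approach, the circular Gaussian perturbation model, and all function names below are assumptions for illustration only.

```python
import math
import random


def chordal_distance(z1: complex, z2: complex) -> float:
    """Chordal distance between the stereographic projections of z1 and z2.

    Standard formula |z1 - z2| / (sqrt(1 + |z1|^2) * sqrt(1 + |z2|^2)),
    which always lies in [0, 1] for finite points.
    """
    return abs(z1 - z2) / (math.sqrt(1 + abs(z1) ** 2) * math.sqrt(1 + abs(z2) ** 2))


def empirical_cdf(nominal: complex, sigma: float, threshold: float,
                  n_samples: int = 10000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(chordal distance <= threshold).

    Assumes (hypothetically) that the uncertain frequency response equals the
    nominal value plus independent Gaussian noise on the real and imaginary parts.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        z = complex(nominal.real + rng.gauss(0.0, sigma),
                    nominal.imag + rng.gauss(0.0, sigma))
        if chordal_distance(z, nominal) <= threshold:
            hits += 1
    return hits / n_samples


# Hypothetical nominal frequency response at one frequency.
p = empirical_cdf(1 + 1j, sigma=0.1, threshold=0.05)
```

Evaluating `empirical_cdf` over a grid of thresholds gives an empirical curve that an analytical CDF could be checked against under the same noise model.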
GLinSAT: The General Linear Satisfiability Neural Network Layer By Accelerated Gradient Descent
Ensuring that the outputs of neural networks satisfy specific constraints is crucial for applying neural networks to real-life decision-making problems. In this paper, we consider making a batch of neural network outputs satisfy bounded and general linear constraints. We first reformulate the neural network output projection problem as an entropy-regularized linear programming problem. We show that such a problem can be equivalently transformed into an unconstrained convex optimization problem with Lipschitz continuous gradient according to the duality theorem. Then, based on an accelerated gradient descent algorithm with numerical performance enhancement, we present our architecture, GLinSAT, to solve the problem. To the best of our knowledge, this is the first general linear satisfiability layer in which all the operations are differentiable and matrix-factorization-free. Although backpropagation can be performed explicitly through automatic differentiation, we also provide an alternative approach in GLinSAT to calculate the derivatives based on implicit differentiation of the optimality condition. Experimental results on constrained traveling salesman problems, partial graph matching with outliers, predictive portfolio allocation and power system unit commitment demonstrate the advantages of GLinSAT over existing satisfiability layers.
Optimal control of stochastic reaction networks with entropic control cost and emergence of mode-switching strategies
Controlling the stochastic dynamics of biological populations is a challenge that arises across various biological contexts. However, these dynamics are inherently nonlinear and involve a discrete state space, i.e., the number of molecules, cells, or organisms. Additionally, the possibility of extinction has a significant impact on both the dynamics and control strategies, particularly when the population size is small. These factors hamper the direct application of conventional control theories to biological systems. To address these challenges, we formulate the optimal control problem for stochastic population dynamics by utilizing a control cost function based on the Kullback-Leibler divergence. This approach naturally accounts for population-specific factors and simplifies the complex nonlinear Hamilton-Jacobi-Bellman equation into a linear form, facilitating efficient computation of optimal solutions. We demonstrate the effectiveness of our approach by applying it to the control of interacting random walkers, Moran processes, and SIR models, and observe the mode-switching phenomena in the control strategies. Our approach provides new opportunities for applying control theory to a wide range of biological problems.
comment: 12 pages, 4 figures
Survey of Moving Target Defense in Power Grids: Design Principles, Tradeoffs, and Future Directions
Moving target defense (MTD) in power grids is an emerging defense technique that has gained prominence in the recent past. It aims to solve the long-standing problem of securing the power grid against stealthy attacks. The key idea behind MTD is to introduce periodic/event-triggered controlled changes to the power grid's SCADA network/physical plant, thereby invalidating the knowledge attackers use for crafting stealthy attacks. In this paper, we provide a comprehensive overview of this topic and classify the different ways in which MTD is implemented in power grids. We further introduce the guiding principles behind the design of MTD, key performance metrics, and the associated trade-offs in MTD and identify the future development of MTD for power grid security.
comment: 10 pages, 3 figures, survey
Multi-platoon car-following models with flexible platoon sizes and communication levels
In this paper, we extend a single-platoon car-following (CF) model to multi-platoon CF models for connected and autonomous vehicles (CAVs) with flexible platoon sizes and communication levels. Specifically, we consider forward and backward communication methods between platoons with delays. Some general results on linear stability are mathematically proven, and numerical simulations are performed to illustrate the effects of platoon sizes and communication levels, as well as to demonstrate the potential for stabilizing human-driven vehicles (HDVs) in mixed traffic conditions. The simulation results are consistent with the theoretical analysis and demonstrate that, in the ring road scenario, CAV platoons can stabilize a certain percentage of HDVs. This paper can provide suggestions for the design of communication systems for autonomous vehicles (AVs) and for the management of mixed traffic flows of CAVs and HDVs.
comment: Preprint for IEEE
Causality-based Subject and Task Fingerprints using fMRI Time-series Data
Recently, there has been a revived interest in system neuroscience causation models due to their unique capability to unravel complex relationships in multi-scale brain networks. In this paper, our goal is to verify the feasibility and effectiveness of using a causality-based approach for fMRI fingerprinting. Specifically, we propose an innovative method that utilizes the causal dynamics of brain activity to identify the unique cognitive patterns of individuals (i.e., subject fingerprint) and fMRI tasks (i.e., task fingerprint). The key novelty of our approach stems from the development of a two-timescale linear state-space model to extract 'spatio-temporal' (aka causal) signatures from an individual's fMRI time series data. To the best of our knowledge, we pioneer and subsequently quantify, in this paper, the concept of 'causal fingerprint.' Our method is distinct from other fingerprint studies in that we quantify fingerprints from a cause-and-effect perspective; these fingerprints are then combined with a modal decomposition and projection method to perform subject identification and with a GNN-based (Graph Neural Network) model to perform task identification. Finally, experimental results and comparisons with non-causality-based methods demonstrate the effectiveness of the proposed methods. We visualize the obtained causal signatures and discuss their biological relevance in light of the existing understanding of brain functionalities. Collectively, our work paves the way for further studies on causal fingerprints with potential applications in both healthy controls and neurodegenerative diseases.
Criticality and Safety Margins for Reinforcement Learning
State-of-the-art reinforcement learning methods sometimes encounter unsafe situations. Identifying when these situations occur is of interest both for post-hoc analysis and during deployment, where it might be advantageous to call out to a human overseer for help. Methods for gauging the criticality of different points in time have been developed, but their accuracy is not well established due to a lack of ground truth, and they are not designed to be easily interpretable by end users. Therefore, we seek to define a criticality framework with both a quantifiable ground truth and a clear significance to users. We introduce true criticality as the expected drop in reward when an agent deviates from its policy for n consecutive random actions. We also introduce the concept of proxy criticality, a low-overhead metric that has a statistically monotonic relationship to true criticality. Safety margins make these interpretable, when defined as the number of random actions for which performance loss will not exceed some tolerance with high confidence. We demonstrate this approach in several environment-agent combinations; for an A3C agent in an Atari Beamrider environment, the lowest 5% of safety margins contain 47% of agent losses; i.e., supervising only 5% of decisions could potentially prevent roughly half of an agent's errors. This criticality framework measures the potential impacts of bad decisions, even before those decisions are made, allowing for more effective debugging and oversight of autonomous agents.
comment: 17 pages, 10 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
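The definition of true criticality in the abstract above (the expected drop in reward when an agent takes n consecutive random actions instead of following its policy) can be estimated by Monte Carlo rollouts. The sketch below uses a deliberately trivial toy environment and policy, both hypothetical and unrelated to the paper's experiments: the reward equals the action taken, and the policy always chooses +1, so the expected drop should be close to n.

```python
import random

ACTIONS = (-1, +1)  # hypothetical discrete action set


def policy(state):
    """Toy policy: always move forward."""
    return +1


def step(state, action):
    """Toy dynamics: the state is a position and the reward equals the action."""
    return state + action, float(action)


def rollout_return(state, n_random, horizon, rng):
    """Return of one episode: n_random uniformly random actions, then the policy."""
    total = 0.0
    for t in range(horizon):
        action = rng.choice(ACTIONS) if t < n_random else policy(state)
        state, reward = step(state, action)
        total += reward
    return total


def true_criticality(state, n_random, horizon, n_samples=20000, seed=0):
    """Monte Carlo estimate of the expected return drop caused by n_random random actions."""
    if not 0 <= n_random <= horizon:
        raise ValueError("n_random must be between 0 and horizon")
    rng = random.Random(seed)
    baseline = sum(rollout_return(state, 0, horizon, rng) for _ in range(n_samples)) / n_samples
    perturbed = sum(rollout_return(state, n_random, horizon, rng) for _ in range(n_samples)) / n_samples
    return baseline - perturbed


estimate = true_criticality(state=0, n_random=3, horizon=10)
```

A safety margin in the sense of the abstract could then be read off as the largest n for which such an estimate stays below a chosen tolerance, although the confidence-based definition in the paper is more careful than this point estimate.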
Optimizing Downlink C-NOMA Transmission with Movable Antennas: A DDPG-based Approach
This paper analyzes a downlink C-NOMA scenario where a base station (BS) is deployed to serve a pair of users equipped with movable antenna (MA) technology. The user with better channel conditions to the BS is able to relay the signal to the other user, providing an extra transmission resource and enhancing performance. Each user is equipped with a receiving MA, and the relaying user additionally has a transmitting MA. In this regard, we formulate an optimization problem with the objective of maximizing the achievable sum rate by jointly determining the beamforming vector at the BS, the transmit power at the device, and the positions of the MAs while meeting the quality of service (QoS) constraints. Due to the non-convex structure of the formulated problem and the randomness in the channels, we adopt a deep deterministic policy gradient (DDPG) approach, a reinforcement learning (RL) algorithm capable of dealing with continuous state and action spaces. Numerical results demonstrate the superiority of the presented model over the benchmark schemes, with gains reaching 45% compared to the NOMA-enabled MA scheme and 60% compared to the C-NOMA model with fixed antennas. The solution approach showed 93% accuracy compared to the optimal solution.
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models ICRA 2025
We propose the use of latent space generative world models to address the covariate shift problem in autonomous driving. A world model is a neural network capable of predicting an agent's next state given past states and actions. By leveraging a world model during training, the driving policy effectively mitigates covariate shift without requiring an excessive amount of training data. During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations, so that at runtime it can recover from perturbations outside the training distribution. Additionally, we introduce a novel transformer-based perception encoder that employs multi-view cross-attention and a learned scene query. We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing in the CARLA simulator, as well as showing the ability to handle perturbations in both CARLA and NVIDIA's DRIVE Sim.
comment: 7 pages, 6 figures, for ICRA 2025 conference, for associated video file, see https://youtu.be/fO7RZ57gVxk
The Top Manifold Connectedness of Quantum Control Landscapes
The control of quantum systems has been proven to possess trap-free optimization landscapes under the satisfaction of proper assumptions. However, many details of the landscape geometry and their influence on search efficiency still need to be fully understood. This paper numerically explores the path-connectedness of globally optimal control solutions forming the top manifold of the landscape. We randomly sample a plurality of optimal controls in the top manifold to assess the existence of a continuous path at the top of the landscape that connects two arbitrary optimal solutions. It is shown that for different quantum control objectives including state-to-state transition probabilities, observable expectation values and unitary transformations, such a continuous path can be readily found, implying that these top manifolds are fundamentally path-connected. The significance of the latter conjecture lies in seeking locations in the top manifold where an ancillary objective can also be optimized while maintaining the full optimality of the original objective that defined the landscape.
comment: 34 pages, 10 figures
Safe stabilization using generalized Lyapunov barrier function
This paper addresses the safe stabilization problem, focusing on controlling the system state to the origin while avoiding entry into unsafe state sets. The current methods for solving this issue rely on smooth Lyapunov and barrier functions, which do not always ensure the existence of an effective controller even when such smooth functions are created. To tackle this challenge, we introduce the concept of a generalized (nonsmooth) Lyapunov barrier function (GenLBF), which guarantees the existence of a safe and stable controller. We outline a systematic approach for constructing a GenLBF, including a technique for efficiently calculating the upper generalized derivative of the GenLBF. Using the constructed GenLBF, we propose a method for certifying safe stabilization of autonomous systems and design a piecewise continuous feedback control to achieve safe stabilization of non-autonomous systems. A general controller refinement strategy is further proposed to help the state trajectory escape from undesired local points occurring in systems with special physical structure. A thorough theoretical analysis demonstrates the effectiveness of our method in addressing the safe stabilization problem for systems with single or multiple bounded unsafe state sets. Extensive simulations of linear and nonlinear systems further illustrate the efficacy of the proposed method and its superiority over the smooth control Lyapunov barrier function method.
comment: 19 pages, 14 figures, under review by a journal
Network-aware Recommender System via Online Feedback Optimization
Personalized content on social platforms can exacerbate negative phenomena such as polarization, partly due to the feedback interactions between recommendations and the users. In this paper, we present a control-theoretic recommender system that explicitly accounts for this feedback loop to mitigate polarization. Our approach extends online feedback optimization - a control paradigm for steady-state optimization of dynamical systems - to develop a recommender system that trades off user engagement and polarization reduction, while relying solely on online click data. We establish theoretical guarantees for optimality and stability of the proposed design and validate its effectiveness via numerical experiments with a user population governed by Friedkin-Johnsen dynamics. Our results show these "network-aware" recommendations can significantly reduce polarization while maintaining high levels of user engagement.
Data-based approaches to learning and control by similarity between heterogeneous systems
This paper proposes basic definitions of similarity and similarity indexes between admissible behaviors of heterogeneous host and guest systems and further presents a similarity-based learning control framework by exploiting the offline sampled data. By exploring helpful geometric properties of the admissible behavior and decomposing it into the subspace and offset components, the similarity indexes between two admissible behaviors are defined as the principal angles between their corresponding subspace components. By reconstructing the admissible behaviors leveraging sampled data, an efficient strategy for calculating the similarity indexes is developed, based on which a similarity-based learning control framework is proposed. It is shown that, with the application of similarity-based learning control, the host system can directly accomplish the same control tasks by utilizing the successful experience provided by the guest system, without having to undergo the trial-and-error process. All results in this paper are supported by simulation examples.
Data-Driven Abstractions for Control Systems via Random Exploration
At the intersection of dynamical systems, control theory, and formal methods lies the construction of symbolic abstractions: these typically represent simpler, finite-state models whose behavior mimics that of an underlying concrete system but are easier to analyse. Building an abstraction usually requires an accurate knowledge of the underlying model: this knowledge may be costly to gather, especially in real-life applications. We aim to bridge this gap by building abstractions based on sampling finite length trajectories. To refine a controller built for the abstraction to one for the concrete system, we newly define a notion of probabilistic alternating simulation, and provide Probably Approximately Correct (PAC) guarantees that the constructed abstraction includes all behaviors of the concrete system and that it is suitable for control design, for arbitrarily long time horizons, leveraging scenario theory. Our method is then tested on several numerical benchmarks.
Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method
Inverted pendulums constitute one of the popular systems for benchmarking control algorithms. Several methods have been proposed for the control of this system, the majority of which rely on the availability of a mathematical model. However, deriving a mathematical model using physical parameters or system identification techniques requires manual effort. Moreover, the designed controllers may perform poorly if system parameters change. To mitigate these problems, recently, some studies used Reinforcement Learning (RL) based approaches for the control of inverted pendulum systems. Unfortunately, these methods suffer from slow convergence and local minimum problems. Moreover, they may require hyperparameter tuning which complicates the design process significantly. To alleviate these problems, the present study proposes an LQR-based RL method for adaptive balancing control of an inverted pendulum. As shown by numerical experiments, the algorithm stabilizes the system very fast without requiring a mathematical model or extensive hyperparameter tuning. In addition, it can adapt to parametric changes online.
Convection-Enabled Boundary Control of a 2D Channel Flow
Nonlinear convection, the source of turbulence in fluid flows, may hold the key to stabilizing turbulence by solving a specific cubic polynomial equation. We consider the incompressible Navier-Stokes equations in a two-dimensional channel. The tangential and normal velocities are assumed to be periodic in the streamwise direction. The pressure difference between the left and right ends of the channel is constant. Moreover, we consider no-slip boundary conditions, that is, zero tangential velocity, at the top and bottom walls of the channel, and normal velocity actuation at the top and bottom walls. We design the boundary control inputs to achieve global exponential stabilization, in the L2 sense, of a chosen Poiseuille equilibrium profile for an arbitrarily large Reynolds number. The key idea behind our approach is to select the boundary controllers such that they have zero spatial mean (to guarantee mass conservation) but non-zero spatial cubic mean. We reveal that, because of convection, the time derivative of the L2 energy of the regulation error is a cubic polynomial in the cubic mean of the boundary inputs. Regulation is then achieved by solving a specific cubic equation, using the Cardano root formula. The results are illustrated via a numerical example.
comment: To be presented at the 63rd IEEE Conference on Decision and Control (CDC 2024)
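The entry above obtains regulation by solving a cubic equation with the Cardano root formula. As a rough illustration of that step only, the following Python sketch returns one real root of a generic cubic a x^3 + b x^2 + c x + d = 0. The function names and the example coefficients are hypothetical; the specific cubic arising from the L2 energy derivative is not given in the abstract and is not reproduced here.

```python
import math


def _cbrt(x: float) -> float:
    """Real cube root that also handles negative inputs."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def real_root_cubic(a: float, b: float, c: float, d: float) -> float:
    """Return one real root of a*x^3 + b*x^2 + c*x + d = 0 (a != 0)."""
    if a == 0:
        raise ValueError("leading coefficient must be non-zero")
    # Depress the cubic with x = t - b/(3a), giving t^3 + p*t + q = 0.
    p = (3 * a * c - b * b) / (3 * a * a)
    q = (2 * b ** 3 - 9 * a * b * c + 27 * a * a * d) / (27 * a ** 3)
    disc = (q / 2) ** 2 + (p / 3) ** 3
    if disc >= 0:
        # Cardano's formula: one real root (or repeated real roots).
        s = math.sqrt(disc)
        t = _cbrt(-q / 2 + s) + _cbrt(-q / 2 - s)
    else:
        # Three distinct real roots: use the trigonometric form (here p < 0).
        r = 2 * math.sqrt(-p / 3)
        arg = (3 * q / (2 * p)) * math.sqrt(-3 / p)
        arg = max(-1.0, min(1.0, arg))  # guard against rounding error
        t = r * math.cos(math.acos(arg) / 3)
    return t - b / (3 * a)


def cubic_value(a: float, b: float, c: float, d: float, x: float) -> float:
    """Evaluate the cubic at x, useful for checking a computed root."""
    return ((a * x + b) * x + c) * x + d
```

For instance, `real_root_cubic(1, 0, -6, -9)` solves x^3 - 6x - 9 = 0, whose real root is x = 3; `cubic_value` can be used to confirm that the residual at the returned root is near zero.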
Distributed Quasi-Newton Method for Multi-Agent Optimization
We present a distributed quasi-Newton (DQN) method, which enables a group of agents to compute an optimal solution of a separable multi-agent optimization problem locally using an approximation of the curvature of the aggregate objective function. Each agent computes a descent direction from its local estimate of the aggregate Hessian, obtained from quasi-Newton approximation schemes using the gradient of its local objective function. Moreover, we introduce a distributed quasi-Newton method for equality-constrained optimization (EC-DQN), where each agent takes Karush-Kuhn-Tucker-like update steps to compute an optimal solution. In our algorithms, each agent communicates with its one-hop neighbors over a peer-to-peer communication network to compute a common solution. We prove convergence of our algorithms to a stationary point of the optimization problem. In addition, we demonstrate the competitive empirical convergence of our algorithm in both well-conditioned and ill-conditioned optimization problems, in terms of the computation time and communication cost incurred by each agent for convergence, compared to existing distributed first-order and second-order methods. Particularly, in ill-conditioned problems, our algorithms achieve a faster computation time for convergence, while requiring a lower communication cost, across a range of communication networks with different degrees of connectedness.
Systems and Control (EESS)
A Sim-to-Real Vision-based Lane Keeping System for a 1:10-scale Autonomous Vehicle
In recent years, several competitions have highlighted the need to investigate vision-based solutions to address scenarios with functional insufficiencies in perception, world modeling and localization. This article presents the Vision-based Lane Keeping System (VbLKS) developed by the DEI-Unipd Team within the context of the Bosch Future Mobility Challenge 2022. The main contribution lies in a Simulation-to-Reality (Sim2Real) GPS-denied VbLKS for a 1:10-scale autonomous vehicle. In this VbLKS, the input to a tailored Pure Pursuit (PP) based control strategy, namely the Lookahead Heading Error (LHE), is estimated at a constant lookahead distance employing a Convolutional Neural Network (CNN). A training strategy for a compact CNN is proposed, emphasizing data generation and augmentation on simulated camera images from a 3D Gazebo simulator, and enabling real-time operation on low-level hardware. A tailored PP-based lateral controller equipped with a derivative action and a PP-based velocity reference generation are implemented. Tuning ranges are established through a systematic time-delay stability analysis. Validation in a representative controlled laboratory setting is provided.
comment: 16 pages, 23 figures
End-to-end guarantees for indirect data-driven control of bilinear systems with finite stochastic data
In this paper we propose an end-to-end algorithm for indirect data-driven control for bilinear systems with stability guarantees. We consider the case where the collected i.i.d. data is affected by probabilistic noise with possibly unbounded support and leverage tools from statistical learning theory to derive finite sample identification error bounds. To this end, we solve the bilinear identification problem by solving a set of linear and affine identification problems, by a particular choice of a control input during the data collection phase. We provide a priori as well as data-dependent finite sample identification error bounds on the individual matrices as well as ellipsoidal bounds, both of which are structurally suitable for control. Further, we integrate the structure of the derived identification error bounds in a robust controller design to obtain an exponentially stable closed-loop. By means of an extensive numerical study we showcase the interplay between the controller design and the derived identification error bounds. Moreover, we note appealing connections of our results to indirect data-driven control of general nonlinear systems through Koopman operator theory and discuss how our results may be applied in this setup.
Control Industrial Automation System with Large Language Models
Traditional industrial automation systems require specialized expertise to operate and complex reprogramming to adapt to new processes. Large language models offer the intelligence to make them more flexible and easier to use. However, LLMs' application in industrial settings is underexplored. This paper introduces a framework for integrating LLMs to achieve end-to-end control of industrial automation systems. At the core of the framework are an agent system designed for industrial tasks, a structured prompting method, and an event-driven information modeling mechanism that provides real-time data for LLM inference. The framework supplies LLMs with real-time events on different context semantic levels, allowing them to interpret the information, generate production plans, and control operations on the automation system. It also supports structured dataset creation for fine-tuning on this downstream application of LLMs. Our contribution includes a formal system design, proof-of-concept implementation, and a method for generating task-specific datasets for LLM fine-tuning and testing. This approach enables a more adaptive automation system that can respond to spontaneous events, while allowing easier operation and configuration through natural language for more intuitive human-machine interaction. We provide demo videos and detailed data on GitHub: https://github.com/YuchenXia/LLM4IAS
Distributed Invariant Unscented Kalman Filter based on Inverse Covariance Intersection with Intermittent Measurements
This paper studies the problem of distributed state estimation (DSE) over sensor networks on matrix Lie groups, which is crucial for applications where system states evolve on Lie groups rather than vector spaces. We propose a diffusion-based distributed invariant Unscented Kalman Filter using the inverse covariance intersection (DIUKF-ICI) method to address target tracking in 3D environments. Unlike existing distributed UKFs confined to vector spaces, our approach extends the distributed UKF framework to Lie groups, enabling local estimates to be fused with intermediate information from neighboring agents on Lie groups. To handle the unknown correlations across local estimates, we extend the ICI fusion strategy to matrix Lie groups for the first time and integrate it into the diffusion algorithm. We demonstrate that the estimation error of the proposed method is bounded. Additionally, the algorithm is fully distributed, robust against intermittent measurements, and adaptable to time-varying communication topologies. The effectiveness of the proposed method is validated through extensive Monte-Carlo simulations.
Deblur e-NeRF: NeRF from Motion-Blurred Events under High-speed or Low-light Conditions ECCV 2024
The stark contrast in the design philosophy of an event camera makes it particularly well suited to operating under high-speed, high dynamic range and low-light conditions, where standard cameras underperform. Nonetheless, contrary to common belief, event cameras still suffer from some amount of motion blur, especially under these challenging conditions. This is attributed to the limited bandwidth of the event sensor pixel, which is mostly proportional to the light intensity. Thus, to ensure that event cameras can truly excel in the conditions where they have an edge over standard cameras, it is crucial to account for event motion blur in downstream applications, especially reconstruction. However, none of the recent works on reconstructing Neural Radiance Fields (NeRFs) from events, nor event simulators, have considered the full effects of event motion blur. To this end, we propose Deblur e-NeRF, a novel method to directly and effectively reconstruct blur-minimal NeRFs from motion-blurred events generated under high-speed motion or low-light conditions. The core component of this work is a physically-accurate pixel bandwidth model proposed to account for event motion blur under arbitrary speed and lighting conditions. We also introduce a novel threshold-normalized total variation loss to improve the regularization of large textureless patches. Experiments on real and novel realistically simulated sequences verify the effectiveness of our method. Our code, event simulator and synthetic event dataset will be open-sourced.
comment: Accepted to ECCV 2024. Project website is accessible at https://wengflow.github.io/deblur-e-nerf. arXiv admin note: text overlap with arXiv:2006.07722 by other authors
Intelligent Energy Management: Remaining Useful Life Prediction and Charging Automation System Comprised of Deep Learning and the Internet of Things
The Remaining Useful Life (RUL) of a battery is an important parameter for determining the battery's remaining life and its need for recharging. The goal of this research project is to develop machine learning-based models for a battery RUL dataset. Different ML models are developed to classify the RUL of the vehicle, and the Internet of Things (IoT) concept is simulated for automating the charging system and managing any faults that arise. The plotted graphs depict the relationship between various vehicle parameters using the Blynk IoT platform. Results show that the CatBoost, Multi-Layer Perceptron (MLP), Gated Recurrent Unit (GRU), and hybrid models developed could classify RUL into three classes with 99% or higher accuracy. The data is fed in through a Tkinter GUI to simulate artificial intelligence (AI)-based charging, and with a pySerial backend, data can be sent to the ESP32 microcontroller so that charging and discharging follow the model's predictions. Also, with an IoT system, the charging can be disconnected, monitored, and analyzed for automation. The results show that an accuracy of 99% can be obtained with the MLP and CatBoost models and a similar accuracy with the GRU model, and that relay-based triggering driven by the model's predictions can automate the charging and energy-saving mechanism. By showcasing an exemplary Blynk platform-based monitoring and automation setup, we further present innovative ways of monitoring parameters and automating the system.
Observer-Based Discontinuous Communication in the Secondary Control of AC Microgrids
This paper proposes an observer-based event-driven approach to decrease the overuse of communication networks. The suggested approach aims to estimate the data that units need to share while reducing communication as much as possible. In other words, the proposed approach effectively determines which state variables should be shared (observer concept) among the units during specific time intervals (event-triggered concept). This strategy significantly reduces the overall communication load. It is shown that the estimation error remains bounded and that Zeno behavior, characterized by an infinite number of transmissions occurring within a finite time interval, does not occur. The proposed methodology can be systematically applied to any communication-based secondary controller in alternating current (AC) microgrids. Simulation results demonstrate a high degree of precision in estimating the states under the proposed approach. Also, the secondary controller performance under the proposed method is evaluated in the MATLAB/Simulink environment.
comment: 2024 IEEE PES Innovative Smart Grid Technologies Europe (ISGT Europe)
PhantomLiDAR: Cross-modality Signal Injection Attacks against LiDAR
LiDAR (Light Detection and Ranging) is a pivotal sensor for autonomous driving, offering precise 3D spatial information. Previous signal attacks against LiDAR systems mainly exploit laser signals. In this paper, we investigate the possibility of cross-modality signal injection attacks, i.e., injecting intentional electromagnetic interference (IEMI) to manipulate LiDAR output. Our insight is that the internal modules of a LiDAR, i.e., the laser receiving circuit, the monitoring sensors, and the beam-steering modules, even with strict electromagnetic compatibility (EMC) testing, can still couple with the IEMI attack signals and result in the malfunction of LiDAR systems. Based on the above attack surfaces, we propose the PhantomLiDAR attack, which manipulates LiDAR output in terms of Points Interference, Points Injection, Points Removal, and even LiDAR Power-Off. We evaluate and demonstrate the effectiveness of PhantomLiDAR with both simulated and real-world experiments on five COTS LiDAR systems. We also conduct feasibility experiments in real-world moving scenarios. We provide potential defense measures that can be implemented at both the sensor level and the vehicle system level to mitigate the risks associated with IEMI attacks. Video demonstrations can be viewed at https://sites.google.com/view/phantomlidar.
Model-Free versus Model-Based Reinforcement Learning for Fixed-Wing UAV Attitude Control Under Varying Wind Conditions
This paper evaluates and compares the performance of model-free and model-based reinforcement learning for the attitude control of fixed-wing unmanned aerial vehicles using PID as a reference point. The comparison focuses on their ability to handle varying flight dynamics and wind disturbances in a simulated environment. Our results show that the Temporal Difference Model Predictive Control agent outperforms both the PID controller and other model-free reinforcement learning methods in terms of tracking accuracy and robustness over different reference difficulties, particularly in nonlinear flight regimes. Furthermore, we introduce actuation fluctuation as a key metric to assess energy efficiency and actuator wear, and we test two different approaches from the literature: action variation penalty and conditioning for action policy smoothness. We also evaluate all control methods when subject to stochastic turbulence and gusts separately, so as to measure their effects on tracking performance, observe their limitations and outline their implications on the Markov decision process formalism.
comment: Published at ICINCO 2024
Discontinuous Reception with Adjustable Inactivity Timer for IIoT
Discontinuous reception (DRX) is a key technology for reducing the energy consumption of industrial Internet of Things (IIoT) devices. Specifically, DRX allows the devices to operate in a low-power mode when no data reception is scheduled, and its effectiveness depends on the proper configuration of the DRX parameters. In this paper, we characterize the DRX process starting from a semi-Markov chain model. We detail two ways to set DRX parameters to minimize the device power consumption while meeting a mean delay constraint. The first method exhaustively searches for the optimal configuration. In contrast, the second method uses a low-complexity metaheuristic to find a sub-optimal configuration, so that both ideal and practical DRX configurations are considered. Notably, among the DRX parameters, the inactivity timer (IT) is a guard time that specifies how long a device remains active after the last information exchange. Traditionally, a device implementing DRX restarts the IT after each data reception before entering a low-power mode. The usual approach is to restart the IT whenever new data is received during this guard period, which might sometimes needlessly extend the active time. Herein, we propose a more efficient method in which the transmitting base station (BS) explicitly indicates restarting the timer through the control channel only when appropriate. The decision is taken based on the BS's knowledge about its buffer status. We consider Poisson and bursty traffic models, which are typical in IIoT setups, and verify through extensive numerical simulations the suitability of our proposal for reducing the energy consumption of the devices without significantly compromising the communication latency. Specifically, energy-saving gains of up to 30% can be obtained regardless of the arrival rate and delay constraints.
comment: IEEE Transactions on Industrial Informatics (2024)
Scene Understanding in Pick-and-Place Tasks: Analyzing Transformations Between Initial and Final Scenes
With robots increasingly collaborating with humans in everyday tasks, it is important to take steps toward robotic systems capable of understanding the environment. This work focuses on scene understanding to detect pick-and-place tasks given initial and final images of the scene. To this end, a dataset is collected for object detection and pick-and-place task detection. A YOLOv5 network is subsequently trained to detect the objects in the initial and final scenes. Given the detected objects and their bounding boxes, two methods are proposed to detect the pick-and-place tasks which transform the initial scene into the final scene. A geometric method is proposed which tracks objects' movements between the two scenes and works based on the intersection of the bounding boxes of the objects that moved. In contrast, the CNN-based method utilizes a Convolutional Neural Network to classify objects with intersecting bounding boxes into 5 classes, showing the spatial relationship between the involved objects. The performed pick-and-place tasks are then derived from analyzing the experiments with both scenes. Results show that the CNN-based method, using a VGG16 backbone, outperforms the geometric method by roughly 12 percentage points in certain scenarios, with an overall success rate of 84.3%.
comment: Conference Paper, ICEE 2024, 7 pages, 5 figures
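The geometric method in the entry above relies on intersections between detected bounding boxes. A minimal Python sketch of that primitive follows. The (x_min, y_min, x_max, y_max) box format, the function names, and the use of intersection-over-union as an overlap score are illustrative assumptions, not details taken from the paper.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes have zero area."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two axis-aligned boxes (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    if width <= 0 or height <= 0:
        return 0.0
    return width * height


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, a value in [0, 1]."""
    inter = intersection_area(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0
```

Under these assumptions, two detections could be treated as spatially related when `intersection_area` is positive, with `iou` giving a normalized measure of how strongly they overlap.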
On the Output Redundancy of LTI Systems: A Geometric Approach with Application to Privacy
This paper examines the properties of output-redundant systems, that is, systems possessing a larger number of outputs than inputs, through the lens of the geometric approach of Wonham et al. We begin by formulating a simple output allocation synthesis problem, which involves "concealing" input information from a malicious eavesdropper having access to the system output, while still allowing a legitimate user to reconstruct it. It is shown that the solvability of this problem requires the availability of a redundant set of outputs. This very problem is instrumental to unveiling the fundamental geometric properties of output-redundant systems, which form the basis for our subsequent constructions and results. As a direct application, we demonstrate how output allocation can be employed to effectively protect input information from certain output eavesdroppers with guaranteed results.
Semantic model for the description of energy data in the Module Type Package
Modular production systems that employ the Module Type Package (MTP) to describe module interfaces can, at present, only communicate energy data through proprietary solutions. Due to this limitation, users face additional effort when calculating energy KPIs for modules or determining the energy efficiency of modules. To address this issue, we present a model that enables energy data to be described semantically and uniformly in the MTP on the basis of an industrial standard (OPC 34100). MTPs incorporating this model can transmit semantically consistent energy data from modules to the process control system, making the data available for further applications, such as monitoring or optimization.
comment: 6 pages, 4 figures
Stereographic Projection of Probabilistic Frequency-Domain Uncertainty
This paper investigates the stereographic projection of points along the Nyquist plots of single-input single-output (SISO) linear time-invariant (LTI) systems subject to probabilistic uncertainty. To each frequency, there corresponds a complex-valued random variable with a given probability distribution in the complex plane. The chordal distance between the stereographic projections of this complex value and the corresponding value for a nominal model, as per the well-known Nu-Gap metric of Vinnicombe, is also a random quantity. The main result provides the cumulative distribution function (CDF) of the chordal distance at a given frequency. Such a stochastic distance framework opens up a fresh and fertile research direction in probabilistic robust control theory.
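As a rough numerical companion to the entry above, the Python sketch below evaluates the standard chordal distance between the stereographic projections of two complex numbers, |z1 - z2| / (sqrt(1 + |z1|^2) sqrt(1 + |z2|^2)), and estimates its CDF by Monte Carlo sampling. The circular Gaussian perturbation, the nominal value, and all function names are assumptions made for illustration; the paper's closed-form CDF is not reproduced here.

```python
import math
import random
from typing import List


def chordal_distance(z1: complex, z2: complex) -> float:
    """Chordal distance between the stereographic projections of z1 and z2."""
    return abs(z1 - z2) / (
        math.sqrt(1.0 + abs(z1) ** 2) * math.sqrt(1.0 + abs(z2) ** 2)
    )


def sample_distances(nominal: complex, sigma: float, n: int, seed: int = 0) -> List[float]:
    """Sample chordal distances between a nominal frequency-response value
    and perturbed values (assumed: independent Gaussian real/imag noise)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        perturbed = nominal + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        samples.append(chordal_distance(nominal, perturbed))
    return samples


def empirical_cdf(samples: List[float], x: float) -> float:
    """Fraction of samples less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)
```

For example, `empirical_cdf(sample_distances(1 + 1j, 0.2, 10000), 0.1)` estimates the probability that the chordal distance at one frequency stays below 0.1 under the assumed noise model.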
GLinSAT: The General Linear Satisfiability Neural Network Layer By Accelerated Gradient Descent
Ensuring that the outputs of neural networks satisfy specific constraints is crucial for applying neural networks to real-life decision-making problems. In this paper, we consider making a batch of neural network outputs satisfy bounded and general linear constraints. We first reformulate the neural network output projection problem as an entropy-regularized linear programming problem. We show that, by the duality theorem, such a problem can be equivalently transformed into an unconstrained convex optimization problem with Lipschitz continuous gradient. Then, based on an accelerated gradient descent algorithm with numerical performance enhancement, we present our architecture, GLinSAT, to solve the problem. To the best of our knowledge, this is the first general linear satisfiability layer in which all the operations are differentiable and matrix-factorization-free. Although backpropagation can be performed explicitly via automatic differentiation, we also provide an alternative approach in GLinSAT to calculate the derivatives based on implicit differentiation of the optimality condition. Experimental results on constrained traveling salesman problems, partial graph matching with outliers, predictive portfolio allocation and power system unit commitment demonstrate the advantages of GLinSAT over existing satisfiability layers.
Optimal control of stochastic reaction networks with entropic control cost and emergence of mode-switching strategies
Controlling the stochastic dynamics of biological populations is a challenge that arises across various biological contexts. However, these dynamics are inherently nonlinear and involve a discrete state space, i.e., the number of molecules, cells, or organisms. Additionally, the possibility of extinction has a significant impact on both the dynamics and control strategies, particularly when the population size is small. These factors hamper the direct application of conventional control theories to biological systems. To address these challenges, we formulate the optimal control problem for stochastic population dynamics by utilizing a control cost function based on the Kullback-Leibler divergence. This approach naturally accounts for population-specific factors and simplifies the complex nonlinear Hamilton-Jacobi-Bellman equation into a linear form, facilitating efficient computation of optimal solutions. We demonstrate the effectiveness of our approach by applying it to the control of interacting random walkers, Moran processes, and SIR models, and observe the mode-switching phenomena in the control strategies. Our approach provides new opportunities for applying control theory to a wide range of biological problems.
comment: 12 pages, 4 figures
Survey of Moving Target Defense in Power Grids: Design Principles, Tradeoffs, and Future Directions
Moving target defense (MTD) in power grids is an emerging defense technique that has gained prominence in the recent past. It aims to solve the long-standing problem of securing the power grid against stealthy attacks. The key idea behind MTD is to introduce periodic/event-triggered controlled changes to the power grid's SCADA network/physical plant, thereby invalidating the knowledge attackers use for crafting stealthy attacks. In this paper, we provide a comprehensive overview of this topic and classify the different ways in which MTD is implemented in power grids. We further introduce the guiding principles behind the design of MTD, key performance metrics, and the associated trade-offs in MTD and identify the future development of MTD for power grid security.
comment: 10 pages, 3 figures, survey
Multi-platoon car-following models with flexible platoon sizes and communication levels
In this paper, we extend a single-platoon car-following (CF) model to several multi-platoon CF models for connected and autonomous vehicles (CAVs) with flexible platoon sizes and communication levels. Specifically, we consider forward and backward communication methods between platoons with delays. Some general results on linear stability are mathematically proven, and numerical simulations are performed to illustrate the effects of platoon sizes and communication levels, as well as to demonstrate the potential for stabilizing human-driven vehicles (HDVs) in mixed traffic conditions. The simulation results are consistent with the theoretical analysis and demonstrate that, in the ring road scenario, CAV platoons can stabilize a certain percentage of HDVs. This paper can provide suggestions for the design of communication systems for autonomous vehicles (AVs) and for the management of mixed traffic flows of CAVs and HDVs.
comment: Preprint for IEEE
Causality-based Subject and Task Fingerprints using fMRI Time-series Data
Recently, there has been a revived interest in system neuroscience causation models due to their unique capability to unravel complex relationships in multi-scale brain networks. In this paper, our goal is to verify the feasibility and effectiveness of using a causality-based approach for fMRI fingerprinting. Specifically, we propose an innovative method that utilizes the causal dynamics of brain activity to identify the unique cognitive patterns of individuals (i.e., subject fingerprint) and fMRI tasks (i.e., task fingerprint). The key novelty of our approach stems from the development of a two-timescale linear state-space model to extract 'spatio-temporal' (aka causal) signatures from an individual's fMRI time series data. To the best of our knowledge, we pioneer and subsequently quantify, in this paper, the concept of 'causal fingerprint.' Our method differs from other fingerprint studies in that we quantify fingerprints from a cause-and-effect perspective; these fingerprints are then combined with a modal decomposition and projection method to perform subject identification and with a GNN-based (Graph Neural Network) model to perform task identification. Finally, experimental results and comparisons with non-causality-based methods demonstrate the effectiveness of the proposed methods. We visualize the obtained causal signatures and discuss their biological relevance in light of the existing understanding of brain functionalities. Collectively, our work paves the way for further studies on causal fingerprints with potential applications in both healthy controls and neurodegenerative diseases.
Criticality and Safety Margins for Reinforcement Learning
State-of-the-art reinforcement learning methods sometimes encounter unsafe situations. Identifying when these situations occur is of interest both for post-hoc analysis and during deployment, where it might be advantageous to call out to a human overseer for help. Methods to gauge the criticality of different points in time have been developed, but their accuracy is not well established due to a lack of ground truth, and they are not designed to be easily interpretable by end users. Therefore, we seek to define a criticality framework with both a quantifiable ground truth and a clear significance to users. We introduce true criticality as the expected drop in reward when an agent deviates from its policy for n consecutive random actions. We also introduce the concept of proxy criticality, a low-overhead metric that has a statistically monotonic relationship to true criticality. Safety margins make these interpretable when defined as the number of random actions for which performance loss will not exceed some tolerance with high confidence. We demonstrate this approach in several environment-agent combinations; for an A3C agent in an Atari Beamrider environment, the lowest 5% of safety margins contain 47% of agent losses; i.e., supervising only 5% of decisions could potentially prevent roughly half of an agent's errors. This criticality framework measures the potential impacts of bad decisions, even before those decisions are made, allowing for more effective debugging and oversight of autonomous agents.
comment: 17 pages, 10 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
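The definition of true criticality quoted in the abstract above (expected drop in reward when the agent takes n consecutive random actions instead of following its policy) can be illustrated with a small Monte Carlo estimate. This is only a sketch under stated assumptions: the toy environment `CliffWalk`, the fixed `policy`, and all function names are hypothetical and do not come from the paper.

```python
import random

class CliffWalk:
    """Hypothetical 1-D toy environment: a cliff at position 0, a goal at `goal`."""
    def __init__(self, start, goal=6, horizon=20):
        self.pos, self.goal, self.horizon, self.t = start, goal, horizon, 0

    def step(self, action):
        """Apply an action in {-1, +1}; return (reward, done)."""
        self.pos += action
        self.t += 1
        if self.pos <= 0:
            return -10.0, True          # fell off the cliff
        if self.pos >= self.goal:
            return 10.0, True           # reached the goal
        return -1.0, self.t >= self.horizon

def policy(env):
    """Hypothetical fixed policy: always walk toward the goal."""
    return 1

def rollout(start, n_random, rng):
    """Episode return when the first n_random actions are uniformly random."""
    env = CliffWalk(start)
    total, done, k = 0.0, False, 0
    while not done:
        action = rng.choice([-1, 1]) if k < n_random else policy(env)
        reward, done = env.step(action)
        total += reward
        k += 1
    return total

def true_criticality(start, n_random, samples=2000, seed=0):
    """Monte Carlo estimate of the expected return lost to n_random random actions."""
    rng = random.Random(seed)
    baseline = rollout(start, 0, rng)   # policy and environment are deterministic
    perturbed = sum(rollout(start, n_random, rng) for _ in range(samples)) / samples
    return baseline - perturbed

near_cliff = true_criticality(start=1, n_random=1)
near_goal = true_criticality(start=4, n_random=1)
```

In this toy setting a single random action next to the cliff costs far more expected return than one taken near the goal, which is the kind of ordering a criticality measure is meant to expose.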
Optimizing Downlink C-NOMA Transmission with Movable Antennas: A DDPG-based Approach
This paper analyzes a downlink C-NOMA scenario in which a base station (BS) serves a pair of users equipped with movable antenna (MA) technology. The user with the better channel conditions to the BS can transmit the signal to the other user, providing an extra transmission resource and enhancing performance. Each user is equipped with one receiving MA, and the relaying user additionally has a transmitting MA. We formulate an optimization problem that maximizes the achievable sum rate by jointly determining the beamforming vector at the BS, the transmit power at the device, and the positions of the MAs while meeting the quality of service (QoS) constraints. Because the formulated problem is non-convex and the channels are random, we adopt a deep deterministic policy gradient (DDPG) approach, a reinforcement learning (RL) algorithm capable of dealing with continuous state and action spaces. Numerical results demonstrate the superiority of the presented model over the benchmark schemes, with gains reaching 45% compared to the NOMA-enabled MA scheme and 60% compared to the C-NOMA model with fixed antennas. The solution approach achieved 93% accuracy relative to the optimal solution.
Deblur e-NeRF: NeRF from Motion-Blurred Events under High-speed or Low-light Conditions ECCV 2024
The stark contrast in the design philosophy of an event camera makes it particularly well suited to operating under high-speed, high dynamic range and low-light conditions, where standard cameras underperform. Nonetheless, contrary to common belief, event cameras still suffer from some amount of motion blur, especially under these challenging conditions. This is attributed to the limited bandwidth of the event sensor pixel, which is mostly proportional to the light intensity. Thus, to ensure that event cameras can truly excel in the conditions where they have an edge over standard cameras, it is crucial to account for event motion blur in downstream applications, especially reconstruction. However, none of the recent works on reconstructing Neural Radiance Fields (NeRFs) from events, nor event simulators, have considered the full effects of event motion blur. To this end, we propose Deblur e-NeRF, a novel method to directly and effectively reconstruct blur-minimal NeRFs from motion-blurred events generated under high-speed motion or low-light conditions. The core component of this work is a physically-accurate pixel bandwidth model proposed to account for event motion blur under arbitrary speed and lighting conditions. We also introduce a novel threshold-normalized total variation loss to improve the regularization of large textureless patches. Experiments on real and novel realistically simulated sequences verify the effectiveness of our method. Our code, event simulator and synthetic event dataset will be open-sourced.
comment: Accepted to ECCV 2024. Project website is accessible at https://wengflow.github.io/deblur-e-nerf
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models ICRA 2025
We propose the use of latent space generative world models to address the covariate shift problem in autonomous driving. A world model is a neural network capable of predicting an agent's next state given past states and actions. By leveraging a world model during training, the driving policy effectively mitigates covariate shift without requiring an excessive amount of training data. During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations, so that at runtime it can recover from perturbations outside the training distribution. Additionally, we introduce a novel transformer-based perception encoder that employs multi-view cross-attention and a learned scene query. We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing in the CARLA simulator, as well as showing the ability to handle perturbations in both CARLA and NVIDIA's DRIVE Sim.
comment: 7 pages, 6 figures, for ICRA 2025 conference, for associated video file, see https://youtu.be/fO7RZ57gVxk
The Top Manifold Connectedness of Quantum Control Landscapes
The control of quantum systems has been proven to possess trap-free optimization landscapes under the satisfaction of proper assumptions. However, many details of the landscape geometry and their influence on search efficiency still need to be fully understood. This paper numerically explores the path-connectedness of globally optimal control solutions forming the top manifold of the landscape. We randomly sample a plurality of optimal controls in the top manifold to assess the existence of a continuous path at the top of the landscape that connects two arbitrary optimal solutions. It is shown that for different quantum control objectives including state-to-state transition probabilities, observable expectation values and unitary transformations, such a continuous path can be readily found, implying that these top manifolds are fundamentally path-connected. The significance of the latter conjecture lies in seeking locations in the top manifold where an ancillary objective can also be optimized while maintaining the full optimality of the original objective that defined the landscape.
comment: 34 pages, 10 figures
Safe stabilization using generalized Lyapunov barrier function
This paper addresses the safe stabilization problem, focusing on controlling the system state to the origin while avoiding entry into unsafe state sets. The current methods for solving this issue rely on smooth Lyapunov and barrier functions, which do not always ensure the existence of an effective controller even when such smooth functions are created. To tackle this challenge, we introduce the concept of a generalized (nonsmooth) Lyapunov barrier function (GenLBF), which guarantees the existence of a safe and stable controller. We outline a systematic approach for constructing a GenLBF, including a technique for efficiently calculating the upper generalized derivative of the GenLBF. Using the constructed GenLBF, we propose a method for certifying safe stabilization of autonomous systems and design a piecewise continuous feedback control to achieve safe stabilization of non-autonomous systems. A general controller refinement strategy is further proposed to help the state trajectory escape from undesired local points occurring in systems with special physical structure. A thorough theoretical analysis demonstrates the effectiveness of our method in addressing the safe stabilization problem for systems with single or multiple bounded unsafe state sets. Extensive simulations of linear and nonlinear systems further illustrate the efficacy of the proposed method and its superiority over the smooth control Lyapunov barrier function method.
comment: 19 pages, 14 figures, under review by a journal
Network-aware Recommender System via Online Feedback Optimization
Personalized content on social platforms can exacerbate negative phenomena such as polarization, partly due to the feedback interactions between recommendations and the users. In this paper, we present a control-theoretic recommender system that explicitly accounts for this feedback loop to mitigate polarization. Our approach extends online feedback optimization - a control paradigm for steady-state optimization of dynamical systems - to develop a recommender system that trades off user engagement and polarization reduction, while relying solely on online click data. We establish theoretical guarantees for optimality and stability of the proposed design and validate its effectiveness via numerical experiments with a user population governed by Friedkin-Johnsen dynamics. Our results show that these "network-aware" recommendations can significantly reduce polarization while maintaining high levels of user engagement.
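For readers unfamiliar with Friedkin-Johnsen dynamics, the standard form of the model can be sketched as follows. This is a minimal illustration of the textbook opinion update x(k+1) = Lam W x(k) + (I - Lam) x(0), not the paper's recommender design: the network `W`, the susceptibilities `lam`, the initial opinions, and the use of opinion variance as a polarization proxy are all assumptions made for this example.

```python
import numpy as np

def fj_step(x, x0, W, lam):
    """One Friedkin-Johnsen update: x+ = Lam W x + (I - Lam) x0."""
    return lam * (W @ x) + (1.0 - lam) * x0

def fj_steady_state(x0, W, lam):
    """Closed-form equilibrium x* = (I - Lam W)^{-1} (I - Lam) x0."""
    n = len(x0)
    Lam = np.diag(lam)
    return np.linalg.solve(np.eye(n) - Lam @ W, (np.eye(n) - Lam) @ x0)

# Hypothetical 3-user network: row-stochastic influence weights.
W = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
lam = np.array([0.8, 0.5, 0.8])      # susceptibility to social influence
x0 = np.array([-1.0, 0.0, 1.0])      # initial (innate) opinions

x = x0.copy()
for _ in range(200):
    x = fj_step(x, x0, W, lam)

x_star = fj_steady_state(x0, W, lam)
polarization = np.var(x)             # assumed proxy: variance of opinions
```

Because every susceptibility is below one, the iteration contracts and the simulated opinions settle at the closed-form steady state, which is the operating point a steady-state optimization scheme would act on.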
Data-based approaches to learning and control by similarity between heterogeneous systems
This paper proposes basic definitions of similarity and similarity indexes between admissible behaviors of heterogeneous host and guest systems and further presents a similarity-based learning control framework by exploiting the offline sampled data. By exploring helpful geometric properties of the admissible behavior and decomposing it into the subspace and offset components, the similarity indexes between two admissible behaviors are defined as the principal angles between their corresponding subspace components. By reconstructing the admissible behaviors leveraging sampled data, an efficient strategy for calculating the similarity indexes is developed, based on which a similarity-based learning control framework is proposed. It is shown that, with the application of similarity-based learning control, the host system can directly accomplish the same control tasks by utilizing the successful experience provided by the guest system, without having to undergo the trial-and-error process. All results in this paper are supported by simulation examples.
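The abstract above defines similarity indexes as principal angles between subspace components. A common way to compute principal angles between two subspaces is via orthonormal bases and a singular value decomposition; the sketch below assumes each subspace is given by a full-column-rank basis matrix and is not the paper's data-driven procedure. The function name and example matrices are hypothetical.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B.

    Assumes A and B have full column rank.
    """
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # The singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Two planes in R^3 that share the x-axis; the second is tilted by 45 degrees.
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
angles = principal_angles(A, B)
```

Here the first angle is zero (the shared direction) and the second is pi/4 (the tilt), so small angles indicate closely aligned behaviors.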
Data-Driven Abstractions for Control Systems via Random Exploration
At the intersection of dynamical systems, control theory, and formal methods lies the construction of symbolic abstractions: these typically represent simpler, finite-state models whose behavior mimics that of an underlying concrete system but are easier to analyse. Building an abstraction usually requires accurate knowledge of the underlying model, which may be costly to gather, especially in real-life applications. We aim to bridge this gap by building abstractions based on sampling finite-length trajectories. To refine a controller built for the abstraction to one for the concrete system, we define a new notion of probabilistic alternating simulation and, leveraging scenario theory, provide Probably Approximately Correct (PAC) guarantees that the constructed abstraction includes all behaviors of the concrete system and that it is suitable for control design for arbitrarily long time horizons. Our method is then tested on several numerical benchmarks.
Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method
Inverted pendulums are among the most popular systems for benchmarking control algorithms. Several methods have been proposed for the control of this system, the majority of which rely on the availability of a mathematical model. However, deriving a mathematical model using physical parameters or system identification techniques requires manual effort. Moreover, the designed controllers may perform poorly if system parameters change. To mitigate these problems, some recent studies have used Reinforcement Learning (RL) based approaches for the control of inverted pendulum systems. Unfortunately, these methods suffer from slow convergence and local minimum problems. Moreover, they may require hyperparameter tuning, which complicates the design process significantly. To alleviate these problems, the present study proposes an LQR-based RL method for adaptive balancing control of an inverted pendulum. As shown by numerical experiments, the algorithm stabilizes the system quickly without requiring a mathematical model or extensive hyperparameter tuning. In addition, it can adapt to parametric changes online.
Convection-Enabled Boundary Control of a 2D Channel Flow
Nonlinear convection, the source of turbulence in fluid flows, may hold the key to stabilizing turbulence by solving a specific cubic polynomial equation. We consider the incompressible Navier-Stokes equations in a two-dimensional channel. The tangential and normal velocities are assumed to be periodic in the streamwise direction. The pressure difference between the left and right ends of the channel is constant. Moreover, we consider no-slip boundary conditions, that is, zero tangential velocity, at the top and bottom walls of the channel, and normal velocity actuation at the top and bottom walls. We design the boundary control inputs to achieve global exponential stabilization, in the L2 sense, of a chosen Poiseuille equilibrium profile for an arbitrarily large Reynolds number. The key idea behind our approach is to select the boundary controllers such that they have zero spatial mean (to guarantee mass conservation) but non-zero spatial cubic mean. We reveal that, because of convection, the time derivative of the L2 energy of the regulation error is a cubic polynomial in the cubic mean of the boundary inputs. Regulation is then achieved by solving a specific cubic equation, using the Cardano root formula. The results are illustrated via a numerical example.
comment: To be presented at the 63rd IEEE Conference on Decision and Control (CDC 2024)
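The abstract above states that regulation is achieved by solving a cubic equation with the Cardano root formula. The sketch below shows how a real root of a general cubic can be obtained this way; the coefficients of the paper's specific cubic are not given in the abstract, so the function name and test polynomials here are purely illustrative assumptions.

```python
import math

def _cbrt(v):
    """Real cube root that also handles negative arguments."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def cardano_real_root(a, b, c, d):
    """Return one real root of a*x^3 + b*x^2 + c*x + d = 0, with a != 0."""
    # Substituting x = t - b/(3a) gives the depressed cubic t^3 + p*t + q = 0.
    p = (3.0 * a * c - b * b) / (3.0 * a * a)
    q = (2.0 * b ** 3 - 9.0 * a * b * c + 27.0 * a * a * d) / (27.0 * a ** 3)
    shift = -b / (3.0 * a)
    disc = (q / 2.0) ** 2 + (p / 3.0) ** 3
    if disc >= 0.0:
        # One real root (or repeated roots): Cardano's formula with real cube roots.
        s = math.sqrt(disc)
        t = _cbrt(-q / 2.0 + s) + _cbrt(-q / 2.0 - s)
    else:
        # Three distinct real roots (p < 0): use the trigonometric form.
        r = 2.0 * math.sqrt(-p / 3.0)
        arg = (3.0 * q / (2.0 * p)) * math.sqrt(-3.0 / p)
        t = r * math.cos(math.acos(max(-1.0, min(1.0, arg))) / 3.0)
    return t + shift

def cubic(a, b, c, d, x):
    """Evaluate the cubic polynomial at x."""
    return ((a * x + b) * x + c) * x + d

root_single = cardano_real_root(1.0, 0.0, 1.0, -2.0)    # x^3 + x - 2
root_triple = cardano_real_root(1.0, -6.0, 11.0, -6.0)  # (x-1)(x-2)(x-3)
```

A real cubic always has at least one real root, which is why a closed-form selection of this kind can be evaluated at every time step without an iterative solver.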
Distributed Quasi-Newton Method for Multi-Agent Optimization
We present a distributed quasi-Newton (DQN) method, which enables a group of agents to compute an optimal solution of a separable multi-agent optimization problem locally using an approximation of the curvature of the aggregate objective function. Each agent computes a descent direction from its local estimate of the aggregate Hessian, obtained from quasi-Newton approximation schemes using the gradient of its local objective function. Moreover, we introduce a distributed quasi-Newton method for equality-constrained optimization (EC-DQN), where each agent takes Karush-Kuhn-Tucker-like update steps to compute an optimal solution. In our algorithms, each agent communicates with its one-hop neighbors over a peer-to-peer communication network to compute a common solution. We prove convergence of our algorithms to a stationary point of the optimization problem. In addition, we demonstrate the competitive empirical convergence of our algorithm in both well-conditioned and ill-conditioned optimization problems, in terms of the computation time and communication cost incurred by each agent for convergence, compared to existing distributed first-order and second-order methods. Particularly, in ill-conditioned problems, our algorithms achieve a faster computation time for convergence, while requiring a lower communication cost, across a range of communication networks with different degrees of connectedness.
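As background for the quasi-Newton approximation mentioned in the abstract above, the following sketch shows the classical BFGS update of a Hessian estimate from a step and the corresponding change in gradient. The abstract does not say which quasi-Newton scheme the authors use, and the distributed communication between agents is omitted entirely, so this is an assumed single-agent illustration with hypothetical names and data.

```python
import numpy as np

def bfgs_update(B, s, y):
    """BFGS update of the Hessian estimate B from step s and gradient change y."""
    if y @ s <= 1e-12:
        return B                      # skip the update to keep B positive definite
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

# Hypothetical local objective f(x) = 0.5 * x^T H x, whose gradient is H x.
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])

def grad(x):
    """Gradient of the quadratic test objective."""
    return H @ x

x_old = np.array([1.0, 1.0])
x_new = np.array([0.5, 0.2])
s = x_new - x_old                     # step taken
y = grad(x_new) - grad(x_old)         # observed change in gradient
B_new = bfgs_update(np.eye(2), s, y)
```

The updated estimate satisfies the secant condition B_new s = y, meaning it reproduces the curvature observed along the most recent step using gradient information only.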
Multiagent Systems
Explaining Explaining
Explanation is key to people having confidence in high-stakes AI systems. However, machine-learning-based systems - which account for almost all current AI - can't explain because they are usually black boxes. The explainable AI (XAI) movement hedges this problem by redefining "explanation". The human-centered explainable AI (HCXAI) movement identifies the explanation-oriented needs of users but can't fulfill them because of its commitment to machine learning. In order to achieve the kinds of explanations needed by real people operating in critical domains, we must rethink how to approach AI. We describe a hybrid approach to developing cognitive agents that uses a knowledge-based infrastructure supplemented by data obtained through machine learning when applicable. These agents will serve as assistants to humans who will bear ultimate responsibility for the decisions and actions of the human-robot team. We illustrate the explanatory potential of such agents using the under-the-hood panels of a demonstration system in which a team of simulated robots collaborates on a search task assigned by a human.
HARMONIC: Cognitive and Control Collaboration in Human-Robotic Teams ICRA 2025
This paper presents a novel approach to multi-robot planning and collaboration. We demonstrate a cognitive strategy for robots in human-robot teams that incorporates metacognition, natural language communication, and explainability. The system is embodied using the HARMONIC architecture, which flexibly integrates cognitive and control capabilities across the team. We evaluate our approach through simulation experiments involving a joint search task by a team of heterogeneous robots (a UGV and a drone) and a human. We detail the system's handling of complex, real-world scenarios, effective action coordination between robots with different capabilities, and natural human-robot communication. This work demonstrates that the robots' abilities to reason about plans, goals, and attitudes, and to provide explanations for actions and decisions, are essential prerequisites for realistic human-robot teaming.
comment: Submitted to ICRA 2025 Conference, Atlanta, GA, USA
HARMONIC: A Framework for Explanatory Cognitive Robots ICRA
We present HARMONIC, a framework for implementing cognitive robots that transforms general-purpose robots into trusted teammates capable of complex decision-making, natural communication and human-level explanation. The framework supports interoperability between a strategic (cognitive) layer for high-level decision-making and a tactical (robot) layer for low-level control and execution. We describe the core features of the framework and our initial implementation, in which HARMONIC was deployed on a simulated UGV and drone involved in a multi-robot search and retrieval task.
comment: Accepted for presentation at ICRA@40. 23-26 September 2024, Rotterdam, Netherlands
Control Industrial Automation System with Large Language Models
Traditional industrial automation systems require specialized expertise to operate and complex reprogramming to adapt to new processes. Large language models (LLMs) offer the intelligence to make them more flexible and easier to use. However, the application of LLMs in industrial settings is underexplored. This paper introduces a framework for integrating LLMs to achieve end-to-end control of industrial automation systems. At the core of the framework are an agent system designed for industrial tasks, a structured prompting method, and an event-driven information modeling mechanism that provides real-time data for LLM inference. The framework supplies LLMs with real-time events on different context semantic levels, allowing them to interpret the information, generate production plans, and control operations on the automation system. It also supports structured dataset creation for fine-tuning on this downstream application of LLMs. Our contributions include a formal system design, a proof-of-concept implementation, and a method for generating task-specific datasets for LLM fine-tuning and testing. This approach enables a more adaptive automation system that can respond to spontaneous events, while allowing easier operation and configuration through natural language for more intuitive human-machine interaction. We provide demo videos and detailed data on GitHub: https://github.com/YuchenXia/LLM4IAS
Modular Autonomous Vehicle in Heterogeneous Traffic Flow: Modeling, Simulation, and Implication
Modular autonomous vehicles (MAVs) represent a groundbreaking concept that integrates modularity into the ongoing development of autonomous vehicles. This innovative design introduces unique features to traffic flow, allowing multiple modules to seamlessly join together and operate collectively. To understand the traffic flow characteristics involving these vehicles and their collective operations, this study established a modeling framework specifically designed to simulate their behavior within traffic flow. The mixed traffic flow, incorporating arbitrarily formed trains of various modular sizes, is modeled and studied. Simulations are conducted under varying levels of traffic demand and penetration rates to examine the traffic flow dynamics in the presence of these vehicles and their operations. The microscopic trajectories, MAV train compositions, and macroscopic fundamental diagrams of the mixed traffic flow are analyzed. The simulation findings indicate that integrating MAVs and their collective operations can substantially enhance capacity, with the extent of improvement depending on the penetration rate in mixed traffic flow. Notably, the capacity nearly doubles when the penetration rate exceeds 75%. Furthermore, their presence significantly influences and regulates the free-flow speed of the mixed traffic. Particularly, when variations in operational speed limits exist between the MAVs and the background traffic, the mixed traffic adjusts to the operating velocity of these vehicles. This study provides insights into potential future traffic flow systems incorporating emerging MAV technologies.
Multi-UAV Enabled MEC Networks: Optimizing Delay through Intelligent 3D Trajectory Planning and Resource Allocation
Mobile Edge Computing (MEC) reduces the computational burden on terminal devices by shortening the distance between these devices and computing nodes. Integrating Unmanned Aerial Vehicles (UAVs) with enhanced MEC networks can leverage the high mobility of UAVs to flexibly adjust network topology, further expanding the applicability of MEC. However, in highly dynamic and complex real-world environments, it is crucial to balance task offloading effectiveness with algorithm performance. This paper investigates a multi-UAV communication network equipped with edge computing nodes to assist terminal users in task computation. Our goal is to reduce the task processing delay for users through the joint optimization of discrete computation modes, continuous 3D trajectories, and resource assignment. To address the challenges posed by the mixed action space, we propose a Multi-UAV Edge Computing Resource Scheduling (MUECRS) algorithm, which comprises two key components: 1) trajectory optimization, and 2) computation mode and resource management. Experimental results demonstrate our method effectively designs the 3D flight trajectories of UAVs, enabling rapid terminal coverage. Furthermore, the proposed algorithm achieves efficient resource deployment and scheduling, outperforming comparative algorithms by at least 16.7%, demonstrating superior adaptability and robustness.
AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment
The increasing demand for intelligent assistants in human-populated environments has motivated significant research in autonomous robotic systems. Traditional service robots and virtual assistants, however, struggle with real-world task execution due to their limited capacity for dynamic reasoning and interaction, particularly when human collaboration is required. Recent developments in Large Language Models have opened new avenues for improving these systems, enabling more sophisticated reasoning and natural interaction capabilities. In this paper, we introduce AssistantX, an LLM-powered proactive assistant designed to operate autonomously in a physical office environment. Unlike conventional service robots, AssistantX leverages a novel multi-agent architecture, PPDR4X, which provides advanced inference capabilities and comprehensive collaboration awareness. By effectively bridging the gap between virtual operations and physical interactions, AssistantX demonstrates robust performance in managing complex real-world scenarios. Our evaluation highlights the architecture's effectiveness, showing that AssistantX can respond to clear instructions, actively retrieve supplementary information from memory, and proactively seek collaboration from team members to ensure successful task completion. More details and videos can be found at https://assistantx-agent.github.io/AssistantX/.
comment: 6 pages, 8 figures, 4 tables
CollaMamba: Efficient Collaborative Perception with Cross-Agent Spatial-Temporal State Space Model AAAI 2025
By sharing complementary perceptual information, multi-agent collaborative perception fosters a deeper understanding of the environment. Recent studies on collaborative perception mostly utilize CNNs or Transformers to learn feature representation and fusion in the spatial dimension, which struggle to handle long-range spatial-temporal features under limited computing and communication resources. Holistically modeling the dependencies over extensive spatial areas and extended temporal frames is crucial to enhancing feature quality. To this end, we propose a resource-efficient cross-agent spatial-temporal collaborative state space model (SSM), named CollaMamba. First, we construct a foundational backbone network based on spatial SSM. This backbone adeptly captures positional causal dependencies from both single-agent and cross-agent views, yielding compact and comprehensive intermediate features while maintaining linear complexity. Furthermore, we devise a history-aware feature boosting module based on temporal SSM, extracting contextual cues from extended historical frames to refine vague features while preserving low overhead. Extensive experiments across several datasets demonstrate that CollaMamba outperforms state-of-the-art methods, achieving higher model accuracy while reducing computational and communication overhead by up to 71.9% and 1/64, respectively. This work pioneers the exploration of Mamba's potential in collaborative perception. The source code will be made available.
comment: Submitted to AAAI 2025
Opponent Shaping for Antibody Development
Anti-viral therapies are typically designed to target the current strains of a virus. Game theoretically, this corresponds to a short-sighted, or myopic, response. However, therapy-induced selective pressures act on viral antigens to drive the emergence of mutated strains, against which initial therapies have reduced efficacy. Building on a computational model of binding between antibodies and viral antigens (the Absolut! framework), we design and implement a genetic simulation of such viral evolutionary escape. Crucially, this allows our antibody optimisation algorithm to consider and influence the entire escape curve of the virus, i.e. to guide (or "shape") the viral evolution. This is inspired by opponent shaping which, in general-sum learning, accounts for the adaptation of the co-player rather than playing a myopic best response. Hence we call the optimised antibodies shapers. Within our simulations, we demonstrate that our shapers target both current and simulated future viral variants, outperforming the antibodies chosen in a myopic way. Furthermore, we show that shapers exert specific evolutionary pressure on the virus compared to myopic antibodies. Altogether, shapers modify the evolutionary trajectories of viral strains and minimise the viral escape compared to their myopic counterparts. While this is a simplified model, we hope that our proposed paradigm will enable the discovery of better long-lived vaccines and antibody therapies in the future, enabled by rapid advancements in the capabilities of simulation tools. Our code is available at https://github.com/olakalisz/antibody-shapers.
comment: Preprint
ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination NeurIPS 2024
Zero-shot coordination (ZSC) is a new cooperative multi-agent reinforcement learning (MARL) challenge that aims to train an ego agent to work with diverse, unseen partners during deployment. The significant difference between the deployment-time partners' distribution and the training partners' distribution determined by the training algorithm makes ZSC a unique out-of-distribution (OOD) generalization challenge. The potential distribution gap between evaluation and deployment-time partners leads to inadequate evaluation, which is exacerbated by the lack of appropriate evaluation metrics. In this paper, we present ZSC-Eval, the first evaluation toolkit and benchmark for ZSC algorithms. ZSC-Eval consists of: 1) Generation of evaluation partner candidates through behavior-preferring rewards to approximate deployment-time partners' distribution; 2) Selection of evaluation partners by Best-Response Diversity (BR-Div); 3) Measurement of generalization performance with various evaluation partners via the Best-Response Proximity (BR-Prox) metric. We use ZSC-Eval to benchmark ZSC algorithms in Overcooked and Google Research Football environments and obtain novel empirical findings. We also conduct a human experiment on current ZSC algorithms to verify ZSC-Eval's consistency with human evaluation. ZSC-Eval is now available at https://github.com/sjtu-marl/ZSC-Eval.
comment: Accepted in NeurIPS 2024 Dataset and Benchmark Track
Distributed Quasi-Newton Method for Multi-Agent Optimization
We present a distributed quasi-Newton (DQN) method, which enables a group of agents to compute an optimal solution of a separable multi-agent optimization problem locally using an approximation of the curvature of the aggregate objective function. Each agent computes a descent direction from its local estimate of the aggregate Hessian, obtained from quasi-Newton approximation schemes using the gradient of its local objective function. Moreover, we introduce a distributed quasi-Newton method for equality-constrained optimization (EC-DQN), where each agent takes Karush-Kuhn-Tucker-like update steps to compute an optimal solution. In our algorithms, each agent communicates with its one-hop neighbors over a peer-to-peer communication network to compute a common solution. We prove convergence of our algorithms to a stationary point of the optimization problem. In addition, we demonstrate the competitive empirical convergence of our algorithm in both well-conditioned and ill-conditioned optimization problems, in terms of the computation time and communication cost incurred by each agent for convergence, compared to existing distributed first-order and second-order methods. Particularly, in ill-conditioned problems, our algorithms achieve a faster computation time for convergence, while requiring a lower communication cost, across a range of communication networks with different degrees of connectedness.
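A minimal sketch of the kind of curvature estimate the abstract refers to: a single agent refreshing its local Hessian approximation from the change in its local gradient. The abstract does not say which quasi-Newton scheme is used, so the BFGS formula here is an assumption, and the function name `bfgs_update` is hypothetical. The inter-agent communication step is not shown.

```python
def bfgs_update(B, s, y):
    """BFGS update of a symmetric Hessian estimate B.

    s is the step taken (x_new - x_old) and y is the change in the local
    gradient (g_new - g_old). The update is skipped when the curvature
    condition y.s > 0 fails, so B stays positive definite.
    """
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    sBs = sum(s[i] * Bs[i] for i in range(n))
    ys = sum(y[i] * s[i] for i in range(n))
    if ys <= 1e-12 or sBs <= 1e-12:
        return [row[:] for row in B]
    # B_new = B - (B s)(B s)^T / (s^T B s) + y y^T / (y^T s), using B = B^T
    return [
        [B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys for j in range(n)]
        for i in range(n)
    ]


# Illustrative values only: start from the identity and apply one update.
B0 = [[1.0, 0.0], [0.0, 1.0]]
step = [1.0, 0.5]
grad_change = [2.0, 0.5]
B1 = bfgs_update(B0, step, grad_change)
# The updated estimate maps the step onto the gradient change (secant equation).
B1_step = [sum(B1[i][j] * step[j] for j in range(2)) for i in range(2)]
```

After the update, `B1_step` matches `grad_change` up to rounding, which is the secant condition that quasi-Newton estimates are built to satisfy.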
Robotics
Enhancing robot reliability for health-care facilities by means of Human-Aware Navigation Planning
With the aim of enabling robots to cooperate with humans, carry out human-like tasks, or navigate among humans, we need to ensure that they are equipped with the ability to comprehend human behaviors and use the extracted knowledge for intelligent decision-making. This ability is particularly important in the safety-critical and human-centred environment of health-care institutions. In the field of robotic navigation, the most cutting-edge approaches to enhancing robot reliability in the application domain of healthcare facilities and in general pertain to augmenting navigation systems with human-aware properties. To implement this in our work, the Co-operative Human-Aware Navigation planner has been integrated into the ROS-based differential-drive robot MARRtina and exhaustively challenged within various simulated contexts and scenarios (mainly modelling the situations relevant in the medical domain) to draw attention to the integrated system's benefits and identify its drawbacks or instances of poor performance while exploring the scope of system capabilities and creating a full characterization of its applicability. The simulation results are then presented to medical experts, and the enhanced robot acceptability within the domain is validated with them as the robot is further planned for deployment.
Blox-Net: Generative Design-for-Robot-Assembly Using VLM Supervision, Physics Simulation, and a Robot with Reset
Generative AI systems have shown impressive capabilities in creating text, code, and images. Inspired by the rich history of research in industrial "Design for Assembly", we introduce a novel problem: Generative Design-for-Robot-Assembly (GDfRA). The task is to generate an assembly based on a natural language prompt (e.g., "giraffe") and an image of available physical components, such as 3D-printed blocks. The output is an assembly, a spatial arrangement of these components, and instructions for a robot to build this assembly. The output must 1) resemble the requested object and 2) be reliably assembled by a 6 DoF robot arm with a suction gripper. We then present Blox-Net, a GDfRA system that combines generative vision language models with well-established methods in computer vision, simulation, perturbation analysis, motion planning, and physical robot experimentation to solve a class of GDfRA problems with minimal human supervision. Blox-Net achieved a Top-1 accuracy of 63.5% in the "recognizability" of its designed assemblies (e.g., resembling a giraffe as judged by a VLM). These designs, after automated perturbation redesign, were reliably assembled by a robot, achieving near-perfect success across 10 consecutive assembly iterations with human intervention only during reset prior to assembly. Surprisingly, this entire design process from textual word ("giraffe") to reliable physical assembly is performed with zero human intervention.
comment: 8 pages, 7 Figures
PokeFlex: Towards a Real-World Dataset of Deformable Objects for Robotic Manipulation ICRA
Advancing robotic manipulation of deformable objects can enable automation of repetitive tasks across multiple industries, from food processing to textiles and healthcare. Yet robots struggle with the high dimensionality of deformable objects and their complex dynamics. While data-driven methods have shown potential for solving manipulation tasks, their application in the domain of deformable objects has been constrained by the lack of data. To address this, we propose PokeFlex, a pilot dataset featuring real-world 3D mesh data of actively deformed objects, together with the corresponding forces and torques applied by a robotic arm, using a simple poking strategy. Deformations are captured with a professional volumetric capture system that allows for complete 360-degree reconstruction. The PokeFlex dataset consists of five deformable objects with varying stiffness and shapes. Additionally, we leverage the PokeFlex dataset to train a vision model for online 3D mesh reconstruction from a single image and a template mesh. We refer readers to the supplementary material and to our website ( https://pokeflex-dataset.github.io/ ) for demos and examples of our dataset.
comment: Extended Abstract, 40th Anniversary of the IEEE International Conference on Robotics and Automation. (ICRA@40 Rotterdam 2024)
Hierarchical Tri-manual Planning for Vision-assisted Fruit Harvesting with Quadrupedal Robots
This paper addresses the challenge of developing a multi-arm quadrupedal robot capable of efficiently harvesting fruit in complex, natural environments. To overcome the inherent limitations of traditional bimanual manipulation, we introduce the first three-arm quadrupedal robot LocoHarv-3 and propose a novel hierarchical tri-manual planning approach, enabling automated fruit harvesting with collision-free trajectories. Our comprehensive semi-autonomous framework integrates teleoperation, supported by LiDAR-based odometry and mapping, with learning-based visual perception for accurate fruit detection and pose estimation. Validation is conducted through a series of controlled indoor experiments using motion capture and extensive field tests in natural settings. Results demonstrate a 90% success rate in in-lab settings with a single attempt, and field trials further verify the system's robustness and efficiency in more challenging real-world environments.
comment: 7 pages, 8 figures
Towards human-like kinematics in industrial robotic arms: a case study on a UR3 robot
Safety in industrial robotic environments is a hot research topic in the area of human-robot interaction (HRI). Until now, robotic arms on assembly lines have interacted with other machines, away from human workers. Nowadays, robotic arm manufacturers aim for their robots to increasingly perform tasks in collaboration with humans. One of the ways to improve this collaboration is by making the movement of robots more human-like. This way, it would be easier for a human to foresee the movement of the robot and approach it without fear of contact. The main difference between the movement of a human and that of a robotic arm is that the former has a bell-shaped speed profile while the latter has a uniform one. To generate the bell-shaped speed profile, the kinematic theory of rapid human movements and its Sigma-Lognormal model have been used. This model is widely used to explain most of the basic phenomena related to the control of human movements. Both human-like and robotic-like movements are transferred to the UR3 robot. In this paper we detail how the UR3 robot was programmed to produce both kinds of movement. The resulting dissimilarities between the motion input to the robot and its output motion confirm the possibility of developing human-like velocities in the UR3 robot.
comment: 6 pages, 5 figures
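As a rough illustration of the bell-shaped speed profile mentioned in the abstract above, the sketch below evaluates a single lognormal stroke, the building block of the Sigma-Lognormal model. The parameter values (`D`, `t0`, `mu`, `sigma`) are illustrative assumptions, not values from the paper. A full Sigma-Lognormal trajectory sums several such strokes vectorially, which is not shown.

```python
import math


def lognormal_speed(t, D=0.2, t0=0.0, mu=-1.5, sigma=0.3):
    """Speed of one lognormal stroke at time t (seconds).

    D is the distance covered by the stroke, t0 its onset time, and
    mu, sigma the log-time delay and log-response time. Illustrative values.
    """
    if t <= t0:
        return 0.0
    x = t - t0
    return (D / (sigma * math.sqrt(2.0 * math.pi) * x)) * math.exp(
        -((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)
    )


dt = 0.001
times = [i * dt for i in range(2001)]            # 0 s to 2 s
speeds = [lognormal_speed(t) for t in times]

distance = sum(speeds) * dt                      # numerical integral of the speed
peak_index = max(range(len(speeds)), key=speeds.__getitem__)
peak_time = times[peak_index]                    # analytically exp(mu - sigma**2)
```

The profile rises from zero, peaks once, and decays back to zero, and its integral recovers the stroke distance `D`. A uniform-speed robot motion would have a flat profile instead.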
Self-Sensing for Proprioception and Contact Detection in Soft Robots Using Shape Memory Alloy Artificial Muscles
Estimating a soft robot's pose and applied forces, also called proprioception, is crucial for safe interaction of the robot with its environment. However, most solutions for soft robot proprioception use dedicated sensors, particularly for external forces, which introduce design trade-offs, rigidity, and risk of failure. This work presents an approach for pose estimation and contact detection for soft robots actuated by shape memory alloy (SMA) artificial muscles, using no dedicated force sensors. Our framework uses the unique material properties of SMAs to self-sense their internal stress, via offboard measurements of their electrical resistance and in-situ temperature readings, in an existing fully-soft limb design. We demonstrate that a simple polynomial regression model on these measurements is sufficient to predict the robot's pose under no-contact conditions. Then, we show that if an additional measurement of the true pose is available (e.g. from an already-in-place bending sensor), it is possible to predict a binary contact/no-contact state using multiple combinations of self-sensing signals. Our hardware tests verify our hypothesis via a contact detection test with a human operator. This proof-of-concept validates that self-sensing signals in SMA-actuated soft robots can be used for proprioception and contact detection, and suggests a direction for integrating proprioception into soft robots without design compromises. Future work could employ machine learning for enhanced accuracy.
comment: 6 pages, 7 figures
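To make the "simple polynomial regression" idea concrete, here is a small sketch that fits a quadratic polynomial in two normalized inputs (resistance and temperature) to a bending angle. Everything in it is an assumption for illustration: the polynomial degree, the feature set, the normalization, and the synthetic data, which stands in for real measurements and is generated from a made-up polynomial.

```python
import random


def features(r, temp):
    """Quadratic features of normalized resistance r and temperature temp."""
    return [1.0, r, temp, r * r, r * temp, temp * temp]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit(samples, targets):
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    X = [features(r, t) for r, t in samples]
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * y for row, y in zip(X, targets)) for i in range(k)]
    return solve(XtX, Xty)


def predict(w, r, temp):
    return sum(wi * fi for wi, fi in zip(w, features(r, temp)))


# Synthetic stand-in data (NOT measurements): angles from a made-up polynomial.
rng = random.Random(0)
true_w = [5.0, -12.0, 3.0, 4.0, -1.5, 0.5]
samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
angles = [predict(true_w, r, t) for r, t in samples]
w = fit(samples, angles)
```

On noise-free synthetic data the fit recovers the generating coefficients. With real sensor data one would expect residual error and would need held-out validation.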
Collision-free time-optimal path parameterization for multi-robot teams
Coordinating the motion of multiple robots in cluttered environments remains a computationally challenging task. We study the problem of minimizing the execution time of a set of geometric paths by a team of robots with state-dependent actuation constraints. We propose a Time-Optimal Path Parameterization (TOPP) algorithm for multiple car-like agents, where the modulation of the timing of every robot along its assigned path is employed to ensure collision avoidance and dynamic feasibility. This is achieved through the use of a priority queue to determine the order of trajectory execution for each robot while taking into account all possible collisions with higher priority robots in a spatiotemporal graph. We show a 10-20% reduction in makespan against existing state-of-the-art methods and validate our approach through simulations and hardware experiments.
Semantically-Driven Disambiguation for Human-Robot Interaction
Ambiguities are common in human-robot interaction, especially when a robot follows user instructions in a large collocated space. For instance, when the user asks the robot to find an object in a home environment, the object might be in several places depending on its varying semantic properties (e.g., a bowl can be in the kitchen cabinet or on the dining room table, depending on whether it is clean/dirty, full/empty and the other objects around it). Previous works on object semantics have predicted such relationships using one shot-inferences which are likely to fail for ambiguous or partially understood instructions. This paper focuses on this gap and suggests a semantically-driven disambiguation approach by utilizing follow-up clarifications to handle such uncertainties. To achieve this, we first obtain semantic knowledge embeddings, and then these embeddings are used to generate clarifying questions by following an iterative process. The evaluation of our method shows that our approach is model agnostic, i.e., applicable to different semantic embedding models, and follow-up clarifications improve the performance regardless of the embedding model. Additionally, our ablation studies show the significance of informative clarifications and iterative predictions to enhance system accuracies.
WasteGAN: Data Augmentation for Robotic Waste Sorting through Generative Adversarial Networks IROS 2024
Robotic waste sorting poses significant challenges in both perception and manipulation, given the extreme variability of objects that should be recognized on a cluttered conveyor belt. While deep learning has proven effective in solving complex tasks, the necessity for extensive data collection and labeling limits its applicability in real-world scenarios like waste sorting. To tackle this issue, we introduce a data augmentation method based on a novel GAN architecture called wasteGAN. The proposed method makes it possible to increase the performance of semantic segmentation models starting from a very limited set of labeled examples, as few as 100. The key innovations of wasteGAN include a novel loss function, a novel activation function, and a larger generator block. Overall, these innovations help the network to learn from a limited number of examples and synthesize data that better mirrors real-world distributions. We then leverage the higher-quality segmentation masks predicted by models trained on the wasteGAN synthetic data to compute semantic-aware grasp poses, enabling a robotic arm to effectively recognize contaminants and separate waste in a real-world scenario. Through comprehensive evaluation encompassing dataset-based assessments and real-world experiments, our methodology demonstrated promising potential for robotic waste sorting, yielding performance gains of up to 5.8% in picking contaminants. The project page is available at https://github.com/bach05/wasteGAN.git
comment: Accepted at 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
Hydraulic Volumetric Soft Everting Vine Robot Steering Mechanism for Underwater Exploration
Despite a significant proportion of the Earth being covered in water, exploration of what lies below has been limited due to the challenges and difficulties inherent in the process. Current state-of-the-art robots such as Remotely Operated Vehicles (ROVs) and Autonomous Underwater Vehicles (AUVs) are bulky, rigid and unable to conform to their environment. Soft robotics offers solutions to this issue. Fluid-actuated eversion or growing robots, in particular, are a good example. While current eversion robots have found many applications on land, their inherent properties make them particularly well suited to underwater environments. An important factor when considering underwater eversion robots is the establishment of a suitable steering mechanism that can enable the robot to change direction as required. This project proposes a design for an eversion robot that is capable of steering while underwater, through the use of bending pouches, a design commonly seen in the literature on land-based eversion robots. These bending pouches contract to enable directional change. Similar to its land-based counterparts, the underwater eversion robot uses the fluid of the medium it operates in to achieve extension and bending, and additionally to aid in neutral buoyancy. The actuation method of bending pouches meant that robots needed to fully extend before steering was possible. Three robots with the same design and dimensions were constructed from polyethylene tubes and tested. Our research shows that although the soft eversion robot design in this paper was not capable of consistently generating the same amount of bending for a given inflation volume, it still achieved suitable bending at a range of inflation volumes and was observed to bend to a maximum angle of 68 degrees at 2000 ml, which is in line with the bending angles reported for land-based eversion robots in the literature.
Efficient Submap-based Autonomous MAV Exploration using Visual-Inertial SLAM Configurable for LiDARs or Depth Cameras
Autonomous exploration of unknown space is an essential component for the deployment of mobile robots in the real world. Safe navigation is crucial for all robotics applications and requires accurate and consistent maps of the robot's surroundings. To achieve full autonomy and allow deployment in a wide variety of environments, the robot must rely on on-board state estimation which is prone to drift over time. We propose a Micro Aerial Vehicle (MAV) exploration framework based on local submaps to allow retaining global consistency by applying loop-closure corrections to the relative submap poses. To enable large-scale exploration we efficiently compute global, environment-wide frontiers from the local submap frontiers and use a sampling-based next-best-view exploration planner. Our method seamlessly supports using either a LiDAR sensor or a depth camera, making it suitable for different kinds of MAV platforms. We perform comparative evaluations in simulation against a state-of-the-art submap-based exploration framework to showcase the efficiency and reconstruction quality of our approach. Finally, we demonstrate the applicability of our method to real-world MAVs, one equipped with a LiDAR and the other with a depth camera. Video available at https://youtu.be/Uf5fwmYcuq4 .
comment: 7 pages, 8 figures, for the accompanying video see https://youtu.be/Uf5fwmYcuq4
Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning
Autonomous robots are being employed in several mapping and data collection tasks due to their efficiency and low labor costs. In these tasks, the robots are required to map targets-of-interest in an unknown environment while constrained to a given resource budget such as path length or mission time. This is a challenging problem, as each robot not only has to detect and avoid collisions with static obstacles in the environment but also has to model other robots' trajectories to avoid inter-robot collisions. We propose a novel deep reinforcement learning approach for multi-robot informative path planning to map targets-of-interest in an unknown 3D environment. A key aspect of our approach is an augmented graph that models other robots' trajectories to enable planning for communication and inter-robot collision avoidance. We train our decentralized reinforcement learning policy via the centralized training and decentralized execution paradigm. Once trained, our policy is also scalable to varying numbers of robots and does not require re-training. Our approach outperforms other state-of-the-art multi-robot target mapping approaches by 33.75% in terms of the number of discovered targets-of-interest. We open-source our code and model at: https://github.com/AccGen99/marl_ipp
comment: arXiv admin note: text overlap with arXiv:2402.04894
DualLQR: Efficient Grasping of Oscillating Apples using Task Parameterized Learning from Demonstration ICRA2025
Learning from Demonstration offers great potential for robots to learn to perform agricultural tasks, specifically selective harvesting. One of the challenges is that the target fruit can be oscillating while the robot approaches it. Grasping oscillating targets has two requirements: 1) close tracking of the target during the final approach for damage-free grasping, and 2) the complete path should be as short as possible for improved efficiency. We propose a new method called DualLQR. In this method, we use a finite horizon Linear Quadratic Regulator (LQR) on a moving target, without the need for refitting the LQR. To make this possible, we use a dual LQR setup, with an LQR running in two separate reference frames. Through extensive simulation testing, it was found that the state-of-the-art method barely meets the required final accuracy without oscillations and drops below the required accuracy with an oscillating target. DualLQR was found to be able to meet the required final accuracy even with high oscillations, with an accuracy increase of 60% for high orientation oscillations. Further testing on a real-world apple grasping task showed that DualLQR was able to successfully grasp oscillating apples, with a success rate of 99%.
comment: Submitted to ICRA2025
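For readers unfamiliar with the finite-horizon LQR named in the abstract above, the sketch below computes time-varying gains by backward Riccati recursion for a scalar system and reuses them, unchanged, to track a moving target. This is not the DualLQR method: the scalar dynamics, cost weights, and sinusoidal target are all illustrative assumptions, and the dual reference frames are not modeled.

```python
import math


def finite_horizon_lqr(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for x[k+1] = a*x[k] + b*u[k] (scalar case).

    Returns gains K[0..horizon-1] for the control law u[k] = -K[k] * error[k].
    """
    p = q_final
    gains = [0.0] * horizon
    for k in reversed(range(horizon)):
        gain = (b * p * a) / (r + b * p * b)
        p = q + a * p * a - a * p * b * gain
        gains[k] = gain
    return gains


# Gains are computed once, before the target motion is known.
gains = finite_horizon_lqr(a=1.0, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=50)

# Track a slowly oscillating target with the precomputed gains (a = b = 1).
x = 1.0
errors = []
for k, gain in enumerate(gains):
    target = 0.05 * math.sin(0.2 * k)   # hypothetical oscillating target
    error = x - target
    errors.append(error)
    x = x + (-gain * error)
```

For these weights the early gains settle at the steady-state value `(sqrt(5) - 1) / 2`. The controller is expressed in error coordinates, so a moving target changes only the error fed to the gains, not the gains themselves.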
Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion
By framing reinforcement learning as a sequence modeling problem, recent work has enabled the use of generative models, such as diffusion models, for planning. While these models are effective in predicting long-horizon state trajectories in deterministic environments, they face challenges in dynamic settings with moving obstacles. Effective collision avoidance demands continuous monitoring and adaptive decision-making. While replanning at every timestep could ensure safety, it introduces substantial computational overhead due to the repetitive prediction of overlapping state sequences -- a process that is particularly costly with diffusion models, known for their intensive iterative sampling procedure. We propose an adaptive generative planning approach that dynamically adjusts replanning frequency based on the uncertainty of action predictions. Our method minimizes the need for frequent, computationally expensive, and redundant replanning while maintaining robust collision avoidance performance. In experiments, we obtain a 13.5% increase in the mean trajectory length and a 12.7% increase in mean reward over long-horizon planning, indicating a reduction in collision rates and an improved ability to navigate the environment safely.
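A toy sketch of uncertainty-triggered replanning in the spirit of the abstract above. The abstract does not state how uncertainty is estimated, so using the spread of several sampled actions is an assumption, and `toy_sampler`, the threshold, and the horizon are hypothetical stand-ins for a diffusion planner and its settings.

```python
import statistics


def action_uncertainty(action_samples):
    """Spread of sampled actions for the next step (assumed uncertainty proxy)."""
    return statistics.pstdev(action_samples)


def adaptive_replanning(sample_actions, num_steps, horizon, threshold):
    """Return the time steps at which a new plan would be generated.

    A new plan is requested when the current plan is used up or when the
    sampled actions for the current step disagree by more than threshold.
    """
    replan_steps = []
    steps_since_plan = horizon  # forces a plan at t = 0
    for t in range(num_steps):
        plan_exhausted = steps_since_plan >= horizon
        uncertain = action_uncertainty(sample_actions(t)) > threshold
        if plan_exhausted or uncertain:
            replan_steps.append(t)
            steps_since_plan = 0
        steps_since_plan += 1
    return replan_steps


def toy_sampler(t):
    """Hypothetical sampler: actions disagree only for t = 5..7 (obstacle nearby)."""
    spread = 0.5 if 5 <= t <= 7 else 0.01
    return [-spread, 0.0, spread]


replans = adaptive_replanning(toy_sampler, num_steps=12, horizon=4, threshold=0.1)
```

In this toy run the planner is invoked at 6 of 12 steps instead of at every step, with the extra calls concentrated where the sampled actions disagree.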
Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM
We introduce Go-SLAM, a novel framework that utilizes 3D Gaussian Splatting SLAM to reconstruct dynamic environments while embedding object-level information within the scene representations. This framework employs advanced object segmentation techniques, assigning a unique identifier to each Gaussian splat that corresponds to the object it represents. Consequently, our system facilitates open-vocabulary querying, allowing users to locate objects using natural language descriptions. Furthermore, the framework features an optimal path generation module that calculates efficient navigation paths for robots toward queried objects, considering obstacles and environmental uncertainties. Comprehensive evaluations in various scene settings demonstrate the effectiveness of our approach in delivering high-fidelity scene reconstructions, precise object segmentation, flexible object querying, and efficient robot path planning. This work represents an additional step forward in bridging the gap between 3D scene reconstruction, semantic object understanding, and real-time environment interactions.
Performance assessment of ADAS in a representative subset of critical traffic situations
As a variety of automated collision prevention systems gain presence within personal vehicles, rating and differentiating the automated safety performance of car models has become increasingly important for consumers, manufacturers, and insurers. In 2023, Swiss Re and partners initiated an eight-month long vehicle testing campaign conducted on a recognized UNECE type approval authority and Euro NCAP accredited proving ground in Germany. The campaign exposed twelve mass-produced vehicle models and one prototype vehicle fitted with collision prevention systems to a selection of safety-critical traffic scenarios representative of the United States and European Union accident landscapes. In this paper, we compare and evaluate the relative safety performance of these thirteen collision prevention systems (hardware and software stack) as demonstrated by this testing campaign. We first introduce a new scoring system which represents a test system's predicted impact on overall real-world collision frequency and reduction of collision impact energy, weighted based on the real-world relevance of the test scenario. Next, we introduce a novel metric that quantifies the realism of the protocol and confirm that our test protocol is a plausible representation of real-world driving. Finally, we find that the prototype system in its pre-release state outperforms the mass-produced (post-consumer-release) vehicles in the majority of the tested scenarios on the test track.
Let's Make a Splan: Risk-Aware Trajectory Optimization in a Normalized Gaussian Splat
Neural Radiance Fields and Gaussian Splatting have transformed the field of computer vision by enabling photo-realistic representation of complex scenes. Despite this success, they have seen only limited use in real-world robotics tasks such as trajectory optimization. Two key factors have contributed to this limited success. First, it is challenging to reason about collisions in radiance models. Second, it is difficult to perform inference of radiance models fast enough for real-time trajectory synthesis. This paper addresses these challenges by proposing SPLANNING, a risk-aware trajectory optimizer that operates in a Gaussian Splatting model. This paper first derives a method for rigorously upper-bounding the probability of collision between a robot and a radiance field. Second, this paper introduces a normalized reformulation of Gaussian Splatting that enables the efficient computation of the collision bound in a Gaussian Splat. Third, a method is presented to optimize trajectories while avoiding collisions with a scene represented by a Gaussian Splat. Experiments demonstrate that SPLANNING outperforms state-of-the-art methods in generating collision-free trajectories in highly cluttered environments. The proposed system is also tested on a real-world robot manipulator. A project page is available at https://roahmlab.github.io/splanning.
comment: First two authors contributed equally. Project Page: https://roahmlab.github.io/splanning
A Roadmap for Embodied and Social Grounding in LLMs
The fusion of Large Language Models (LLMs) and robotic systems has led to a transformative paradigm in the robotic field, offering unparalleled capabilities not only in the communication domain but also in skills like multimodal input handling, high-level reasoning, and plan generation. The grounding of LLMs' knowledge into the empirical world has been considered a crucial pathway to exploit the efficiency of LLMs in robotics. Nevertheless, connecting LLMs' representations to the external world with multimodal approaches or with robots' bodies is not enough to let them understand the meaning of the language they are manipulating. Taking inspiration from humans, this work draws attention to three necessary elements for an agent to grasp and experience the world. The roadmap for LLM grounding is envisaged in an active bodily system as the reference point for experiencing the environment, a temporally structured experience for a coherent, self-related interaction with the external world, and social skills to acquire a common-grounded shared experience.
comment: Accepted Version of a conference paper presented at Robophilosophy Conference 2024
Robotic Backchanneling in Online Conversation Facilitation: A Cross-Generational Study
Japan faces many challenges related to its aging society, including increasing rates of cognitive decline in the population and a shortage of caregivers. Efforts have begun to explore solutions using artificial intelligence (AI), especially socially embodied intelligent agents and robots that can communicate with people. Yet, there has been little research on the compatibility of these agents with older adults in various everyday situations. To this end, we conducted a user study to evaluate a robot that functions as a facilitator for a group conversation protocol designed to prevent cognitive decline. We modified the robot to use backchanneling, a natural human way of speaking, to increase receptiveness of the robot and enjoyment of the group conversation experience. We conducted a cross-generational study with young adults and older adults. Qualitative analyses indicated that younger adults perceived the backchanneling version of the robot as kinder, more trustworthy, and more acceptable than the non-backchanneling robot. Finally, we found that the robot's backchanneling elicited nonverbal backchanneling in older participants.
comment: Published at Proceedings of the 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN 2023)
Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous
This research introduces a novel application of a masked Proximal Policy Optimization (PPO) algorithm from the field of deep reinforcement learning (RL), for determining the most efficient sequence of space debris visitation, utilizing the Lambert solver as per Izzo's adaptation for individual rendezvous. The aim is to optimize the sequence in which all the given debris should be visited to get the least total time for rendezvous for the entire mission. A neural network (NN) policy is developed, trained on simulated space missions with varying debris fields. After training, the neural network calculates approximately optimal paths using Izzo's adaptation of Lambert maneuvers. Performance is evaluated against standard heuristics in mission planning. The reinforcement learning approach demonstrates a significant improvement in planning efficiency by optimizing the sequence for debris rendezvous, reducing the total mission time by an average of approximately 10.96% and 13.66% compared to the Genetic and Greedy algorithms, respectively. The model on average identifies the most time-efficient sequence for debris visitation across various simulated scenarios with the fastest computational speed. This approach signifies a step forward in enhancing mission planning strategies for space debris clearance.
comment: Accepted for publication at the 2024 International Conference on Space Robotics (iSpaRo)
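The "masked" part of a masked PPO policy can be illustrated with a short sketch: invalid actions are removed from the policy's distribution before sampling. Treating already-visited debris as the invalid actions is an assumption based on the sequencing task described above. The logits are made-up numbers, and `masked_probabilities` is a hypothetical helper, not code from the paper.

```python
import math


def masked_probabilities(logits, visited):
    """Softmax over policy logits with visited debris masked out.

    Masked entries get probability exactly 0, so they can never be sampled.
    """
    if all(visited):
        raise ValueError("no unvisited debris left to choose from")
    masked = [-math.inf if v else l for l, v in zip(logits, visited)]
    m = max(masked)
    exps = [math.exp(x - m) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative values: four debris objects, two already visited.
logits = [2.0, 1.0, 0.5, 3.0]
visited = [False, True, False, True]
probs = masked_probabilities(logits, visited)
next_debris = max(range(len(probs)), key=probs.__getitem__)
```

Although the last debris object has the highest raw logit, it is masked out, so the greedy choice falls on index 0, the best of the remaining unvisited objects.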
GRACE: Generating Socially Appropriate Robot Actions Leveraging LLMs and Human Explanations ICRA
When operating in human environments, robots need to handle complex tasks while both adhering to social norms and accommodating individual preferences. For instance, based on common sense knowledge, a household robot can predict that it should avoid vacuuming during a social gathering, but it may still be uncertain whether it should vacuum before or after having guests. In such cases, integrating common-sense knowledge with human preferences, often conveyed through human explanations, is fundamental yet a challenge for existing systems. In this paper, we introduce GRACE, a novel approach addressing this while generating socially appropriate robot actions. GRACE leverages common sense knowledge from Large Language Models (LLMs), and it integrates this knowledge with human explanations through a generative network architecture. The bidirectional structure of GRACE enables robots to refine and enhance LLM predictions by utilizing human explanations and makes robots capable of generating such explanations for human-specified actions. Our experimental evaluations show that integrating human explanations boosts GRACE's performance, where it outperforms several baselines and provides sensible explanations.
comment: Under review for 2025 IEEE International Conference on Robotics & Automation (ICRA), Supplementary video: https://youtu.be/3gP3euwNBjQ
Behavior evolution-inspired approach to walking gait reinforcement training for quadruped robots
Reinforcement learning is highly competitive among gait generation techniques for quadruped robots, mainly because stochastic exploration during training helps achieve an autonomous gait. Nevertheless, although incremental reinforcement learning is employed to improve training success and movement smoothness by relying on the continuity inherent in limb movements, challenges remain in adapting the gait policy to diverse terrain and external disturbances. Inspired by the association between reinforcement learning and the evolution of animal motion behavior, this paper introduces a self-improvement mechanism for the reference gait, so that incremental learning of actions and self-improvement of the reference action proceed together, imitating the evolution of animal motion behavior. Further, a new framework for reinforcement training of quadruped gait is proposed. In this framework, a genetic algorithm performs a global probabilistic search for the initial value of an arbitrary foot trajectory to update the reference trajectory with better fitness. Subsequently, the improved reference gait is used for incremental reinforcement learning of the gait. These two steps are executed repeatedly and alternately to train the final gait policy. A detailed simulation-based analysis considering terrain, model dimensions, and locomotion conditions is presented, and the results show that the framework is significantly more adaptive to terrain than regular incremental reinforcement learning.
Communication Backbone Reconfiguration with Connectivity Maintenance
The exchange of information is key in applications that involve multiple agents, such as search and rescue, military operations, and disaster response. In this work, we propose a simple and effective trajectory planning framework that tackles the design, deployment, and reconfiguration of a communication backbone by reframing the problem of networked multi-agent motion planning as a manipulator motion planning problem. Our approach works for backbones of variable configurations, both in terms of the number of robots used and the distance limit between robots. While research has been conducted on connection-restricted navigation for multi-robot systems in recent years, the field of manipulators is arguably more developed both in theory and practice. Hence, our methodology facilitates practical applications built on top of widely available motion planning algorithms and frameworks for manipulators.
comment: Submitted to IEEE Latin America Transactions
CREVE: An Acceleration-based Constraint Approach for Robust Radar Ego-Velocity Estimation
Ego-velocity estimation from point cloud measurements of a millimeter-wave frequency-modulated continuous wave (mmWave FMCW) radar has become a crucial component of radar-inertial odometry (RIO) systems. Conventional approaches often perform poorly when the number of point cloud outliers exceeds that of inliers. In this paper, we propose CREVE, an acceleration-based inequality constraints filter that leverages additional measurements from an inertial measurement unit (IMU) to achieve robust ego-velocity estimations. To further enhance accuracy and robustness against sensor errors, we introduce a practical accelerometer bias estimation method and a parameter adaptation rule. The effectiveness of the proposed method is evaluated using five open-source drone datasets. Experimental results demonstrate that our algorithm significantly outperforms three existing state-of-the-art methods, achieving reductions in absolute trajectory error of approximately 53%, 84%, and 35% compared to them.
comment: 7 pages, conference
Conditional Generative Denoiser for Nighttime UAV Tracking
State-of-the-art (SOTA) visual object tracking methods have significantly enhanced the autonomy of unmanned aerial vehicles (UAVs). However, in low-light conditions, irregular real noise from the environment severely degrades the performance of these SOTA methods. Moreover, existing SOTA denoising techniques often fail to meet real-time processing requirements when deployed as plug-and-play denoisers for UAV tracking. To address this challenge, this work proposes a novel conditional generative denoiser (CGDenoiser), which breaks free from the limitations of traditional deterministic paradigms: it generates the noise conditioned on the input and subsequently removes it. To better align the input dimensions and accelerate inference, a novel nested residual Transformer conditionalizer is developed. Furthermore, an innovative multi-kernel conditional refiner is designed to refine the denoised output. Extensive experiments show that CGDenoiser improves the tracking precision of the SOTA tracker by 18.18% on DarkTrack2021 while running 5.8 times faster than the second-best denoiser. Real-world tests with complex challenges also demonstrate the effectiveness and practicality of CGDenoiser. Code, video demo and supplementary proof for CGDenoiser are now available at: https://github.com/vision4robotics/CGDenoiser.
OffRIPP: Offline RL-based Informative Path Planning ICRA 2025
Informative path planning (IPP) is a crucial task in robotics, where agents must design paths to gather valuable information about a target environment while adhering to resource constraints. Reinforcement learning (RL) has been shown to be effective for IPP; however, it requires environment interactions, which are risky and expensive in practice. To address this problem, we propose an offline RL-based IPP framework that optimizes information gain without requiring real-time interaction during training, offering safety and cost-efficiency by avoiding interaction, as well as superior performance and fast computation during execution -- key advantages of RL. Our framework leverages batch-constrained reinforcement learning to mitigate extrapolation errors, enabling the agent to learn from pre-collected datasets generated by arbitrary algorithms. We validate the framework through extensive simulations and real-world experiments. The numerical results show that our framework outperforms the baselines, demonstrating the effectiveness of the proposed approach.
comment: 7 pages, 6 figures, submitted to ICRA 2025
On the role of Artificial Intelligence methods in modern force-controlled manufacturing robotic tasks
This position paper explores the integration of Artificial Intelligence (AI) into force-controlled robotic tasks within the scope of advanced manufacturing, a cornerstone of Industry 4.0. AI's role in enhancing robotic manipulators - key drivers in the Fourth Industrial Revolution - is rapidly leading to significant innovations in smart manufacturing. The objective of this article is to frame these innovations in practical force-controlled applications - e.g. deburring, polishing, and assembly tasks like peg-in-hole (PiH) - highlighting their necessity for maintaining high-quality production standards. By reporting on recent AI-based methodologies, this article contrasts them and identifies current challenges to be addressed in future research. The analysis concludes with a perspective on future research directions, emphasizing the need for common performance metrics to validate AI techniques, integration of various enhancements for performance optimization, and the importance of validating them in relevant scenarios. These future directions aim to provide consistency with already adopted approaches, so as to be compatible with manufacturing standards, increasing the relevance of AI-driven methods in both academic and industrial contexts.
comment: To be published in Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics (ICINCO)
Inline Photometrically Calibrated Hybrid Visual SLAM
This paper presents an integrated approach to Visual SLAM, merging online sequential photometric calibration within a Hybrid direct-indirect visual SLAM (H-SLAM). Photometric calibration helps normalize pixel intensity values under different lighting conditions, and thereby improves the direct component of our H-SLAM. The indirect component of H-SLAM also benefits, since the detected features are more stable across variable lighting conditions. Our proposed photometrically calibrated H-SLAM is tested on several datasets, including the TUM monoVO dataset and a dataset we created. Calibrated H-SLAM outperforms other state-of-the-art direct, indirect, and hybrid Visual SLAM systems in all the experiments. Furthermore, in online SLAM tested at our site, it also significantly outperformed the other SLAM systems.
Do We Need iPhone Moment or Xiaomi Moment for Robots? Design of Affordable Home Robots for Health Monitoring
In this paper, we study cost-effective home robot solutions designed for home health monitoring. Recent advancements in Artificial Intelligence (AI) have significantly advanced the capabilities of robots, enabling them to understand and interact with their surroundings better and more efficiently. The most common robots currently used in homes are toy robots and cleaning robots. While these are relatively affordable, their functionalities are very limited. On the other hand, humanoid and quadruped robots offer more sophisticated features and capabilities, albeit at a much higher cost. Another category is educational robots, which provide educators with the flexibility to attach various sensors and integrate different design methods with the integrated operating systems. However, the challenge of bridging the gap between affordability and functionality remains. Our research aims to address this by exploring the potential of developing advanced yet affordable and accessible home robots for health monitoring, using edge computing techniques and taking advantage of computing resources already available in the home, such as mobile phones.
Programming of Skill-based Robots
Manufacturing is facing ever-changing market demands, with faster innovation cycles resulting in growing agility and flexibility requirements. Industry 4.0 has been transforming the manufacturing world towards digital automation, and the importance of software has increased drastically. Easy and fast task programming and execution in robot-sensor systems has become a prerequisite for agile and flexible automation, and in this paper we propose such a system. Our solution relies on a robot skill library, which provides the user with high-level, parametrized operations, i.e., robot skills, for task programming and execution. Programming actions result in a control recipe in a neutral product context and are based on the use of product CAD models or, alternatively, the collaborative use of pointers and a tracking sensor with real parts. Practical tests are also reported to show the feasibility of our approach.
comment: IEEE ICIEA 2024
World Model-based Perception for Visual Legged Locomotion
Legged locomotion over various terrains is challenging and requires precise perception of the robot and its surroundings from both proprioception and vision. However, learning directly from high-dimensional visual input is often data-inefficient and intricate. To address this issue, traditional methods attempt to learn a teacher policy with access to privileged information first and then learn a student policy to imitate the teacher's behavior with visual input. Despite some progress, this imitation framework prevents the student policy from achieving optimal performance due to the information gap between inputs. Furthermore, the learning process is unnatural since animals intuitively learn to traverse different terrains based on their understanding of the world without privileged knowledge. Inspired by this natural ability, we propose a simple yet effective method, World Model-based Perception (WMP), which builds a world model of the environment and learns a policy based on the world model. We illustrate that though completely trained in simulation, the world model can make accurate predictions of real-world trajectories, thus providing informative signals for the policy controller. Extensive simulated and real-world experiments demonstrate that WMP outperforms state-of-the-art baselines in traversability and robustness. Videos and Code are available at: https://wmp-loco.github.io/.
comment: under review
Dashing for the Golden Snitch: Multi-Drone Time-Optimal Motion Planning with Multi-Agent Reinforcement Learning
Recent innovations in autonomous drones have facilitated time-optimal flight in single-drone configurations and enhanced maneuverability in multi-drone systems through the application of optimal control and learning-based methods. However, few studies have achieved time-optimal motion planning for multi-drone systems, particularly during highly agile maneuvers or in dynamic scenarios. This paper presents a decentralized policy network for time-optimal multi-drone flight using multi-agent reinforcement learning. To strike a balance between flight efficiency and collision avoidance, we introduce a soft collision penalty inspired by optimization-based methods. By customizing PPO in a centralized training, decentralized execution (CTDE) fashion, we unlock higher efficiency and stability in training, while ensuring lightweight implementation. Extensive simulations show that, despite slight performance trade-offs compared to single-drone systems, our multi-drone approach maintains near-time-optimal performance with low collision rates. Real-world experiments validate our method, with two quadrotors using the same network as simulation achieving a maximum speed of 13.65 m/s and a maximum body rate of 13.4 rad/s in a 5.5 m × 5.5 m × 2.0 m space across various tracks, relying entirely on onboard computation.
comment: 7 pages, 6 figures
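To make the idea of a soft collision penalty in the abstract above concrete, here is a minimal sketch that uses a quadratic hinge on inter-drone distance. The functional form, the `safe_distance` and `weight` values, and the function names are all assumptions chosen for illustration; the abstract does not specify the penalty actually used.

```python
def soft_collision_penalty(distance, safe_distance=0.5, weight=10.0):
    """Quadratic hinge: zero beyond safe_distance, growing smoothly as drones get closer."""
    violation = max(0.0, safe_distance - distance)
    return -weight * violation ** 2

def step_reward(progress, distances_to_others, safe_distance=0.5, weight=10.0):
    """Progress reward plus the summed soft penalty over all neighbouring drones."""
    penalty = sum(
        soft_collision_penalty(d, safe_distance, weight) for d in distances_to_others
    )
    return progress + penalty
```

Unlike a hard termination on contact, a penalty of this kind leaves the reward continuous in the distance, so the policy can trade a small safety margin against flight time instead of avoiding other drones at any cost.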
Vision-Language Model Fine-Tuning via Simple Parameter-Efficient Modification EMNLP 2024
Recent advances in fine-tuning Vision-Language Models (VLMs) have witnessed the success of prompt tuning and adapter tuning, while the classic model fine-tuning on inherent parameters seems to be overlooked. It is believed that fine-tuning the parameters of VLMs with few-shot samples corrupts the pre-trained knowledge since fine-tuning the CLIP model even degrades performance. In this paper, we revisit this viewpoint and propose a new perspective: fine-tuning specific parameters instead of all of them will uncover the power of classic model fine-tuning on VLMs. Through our meticulous study, we propose CLIPFit, a simple yet effective method to fine-tune CLIP without introducing any overhead of extra parameters. We demonstrate that by only fine-tuning the specific bias terms and normalization layers, CLIPFit can improve the performance of zero-shot CLIP by 7.27% average harmonic mean accuracy. Lastly, to understand how fine-tuning in CLIPFit affects the pre-trained models, we conducted extensive experimental analyses w.r.t. changes in internal parameters and representations. We found that low-level text bias layers and the first layer normalization layer change much more than other layers. The code is available at https://github.com/minglllli/CLIPFit.
comment: EMNLP 2024 Main Conference
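The selective fine-tuning described in the abstract above (updating only certain bias terms and normalization layers) can be sketched as a filter over parameter names. The naming scheme, the `bias_scopes` and `norm_markers` defaults, and the function itself are hypothetical; the abstract does not say which bias terms are chosen, so this only shows the general pattern.

```python
def select_trainable(param_names, bias_scopes=("mlp",), norm_markers=("ln_", "layer_norm")):
    """Map each parameter name to True if it should be fine-tuned.

    A parameter is trainable when it belongs to a normalization layer, or when
    it is a bias term inside one of the chosen scopes. Everything else stays frozen.
    """
    trainable = {}
    for name in param_names:
        is_norm = any(marker in name for marker in norm_markers)
        is_selected_bias = name.endswith(".bias") and any(s in name for s in bias_scopes)
        trainable[name] = is_norm or is_selected_bias
    return trainable
```

In a PyTorch setting, a map like this would typically be applied by iterating over `model.named_parameters()` and setting each parameter's `requires_grad` flag, so that no extra parameters are introduced.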
Online 6DoF Pose Estimation in Forests using Cross-View Factor Graph Optimisation and Deep Learned Re-localisation ICRA2025
This paper presents a novel approach for robust global localisation and 6DoF pose estimation of ground robots in forest environments by leveraging cross-view factor graph optimisation and deep-learned re-localisation. The proposed method addresses the challenges of aligning aerial and ground data for pose estimation, which is crucial for accurate point-to-point navigation in GPS-denied environments. By integrating information from both perspectives into a factor graph framework, our approach effectively estimates the robot's global position and orientation. We validate the performance of our method through extensive experiments in diverse forest scenarios, demonstrating its superiority over existing baselines in terms of accuracy and robustness in these challenging environments. Experimental results show that our proposed localisation system can achieve drift-free localisation with bounded positioning errors, ensuring reliable and safe robot navigation under canopies.
comment: 7 pages, 4 figures, Submitted to ICRA2025
Multirotor Nonlinear Model Predictive Control based on Visual Servoing of Evolving Features
This article presents a Visual Servoing Nonlinear Model Predictive Control (NMPC) scheme for autonomously tracking a moving target using multirotor Unmanned Aerial Vehicles (UAVs). The scheme is developed for surveillance and tracking of contour-based areas with evolving features. NMPC is used to manage input and state constraints, while additional barrier functions are incorporated in order to ensure system safety and optimal performance. The proposed control scheme is designed based on the extraction and implementation of the full dynamic model of the features describing the target and the state variables. Real-time simulations and experiments using a quadrotor UAV equipped with a camera demonstrate the effectiveness of the proposed strategy.
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models ICRA 2025
We propose the use of latent space generative world models to address the covariate shift problem in autonomous driving. A world model is a neural network capable of predicting an agent's next state given past states and actions. By leveraging a world model during training, the driving policy effectively mitigates covariate shift without requiring an excessive amount of training data. During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations, so that at runtime it can recover from perturbations outside the training distribution. Additionally, we introduce a novel transformer-based perception encoder that employs multi-view cross-attention and a learned scene query. We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing in the CARLA simulator, as well as showing the ability to handle perturbations in both CARLA and NVIDIA's DRIVE Sim.
comment: 7 pages, 6 figures, for ICRA 2025 conference, for associated video file, see https://youtu.be/9FpDFD9aiFU
Achieving Stable High-Speed Locomotion for Humanoid Robots with Deep Reinforcement Learning
Humanoid robots offer significant versatility for performing a wide range of tasks, yet their basic ability to walk and run, especially at high velocities, remains a challenge. This letter presents a novel method that combines deep reinforcement learning with kinodynamic priors to achieve stable locomotion control (KSLC). KSLC promotes coordinated arm movements to counteract destabilizing forces, enhancing overall stability. Compared to the baseline method, KSLC provides more accurate tracking of commanded velocities and better generalization in velocity control. In simulation tests, the KSLC-enabled humanoid robot successfully tracked a target velocity of 3.5 m/s with reduced fluctuations. Sim-to-sim validation in a high-fidelity environment further confirmed its robust performance, highlighting its potential for real-world applications.
comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Robo-Platform: A Robotic System for Recording Sensors and Controlling Robots
Mobile smartphones compactly provide sensors such as cameras, IMUs, GNSS measurement units, and wireless and wired communication channels required for robotics projects. They are affordable, portable, and programmable, which makes them ideal for testing, data acquisition, controlling mobile robots, and many other robotic applications. A robotic system is proposed in this paper, consisting of an Android phone, a microcontroller board attached to the phone via USB, and a remote wireless controller station. In the data acquisition mode, the Android device can record a dataset of a diverse configuration of multiple cameras, IMUs, GNSS units, and external USB ADC channels in the rawest format used for, but not limited to, pose estimation and scene reconstruction applications. In robot control mode, the Android phone, a microcontroller board, and other peripherals constitute the mobile or stationary robotic system. This system is controlled using a remote server connected over Wi-Fi or Bluetooth. Experiments show that although the SLAM and AR applications can utilize the acquired data, the proposed system can pave the way for more advanced algorithms for processing these noisy and sporadic measurements. Moreover, the characteristics of the communication media are studied, and two example robotic projects, which involve controlling a toy car and a quadcopter, are included.
comment: Project repository: https://github.com/m-dayani/robo-platform Youtube Video: https://youtu.be/BTQ4yLB1bak Dataset: https://drive.google.com/drive/folders/1OZqdA1xa-SyJ64qL_TibqhtwhR1fWWrx?usp=sharing
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
In recent years, the Robotics field has initiated several efforts toward building generalist robot policies through large-scale multi-task Behavior Cloning. However, direct deployments of these policies have led to unsatisfactory performance, where the policy struggles with unseen states and tasks. How can we break through the performance plateau of these models and elevate their capabilities to new heights? In this paper, we propose FLaRe, a large-scale Reinforcement Learning fine-tuning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques. Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance both on previously demonstrated and on entirely novel tasks and embodiments. Specifically, on a set of long-horizon mobile manipulation tasks, FLaRe achieves an average success rate of 79.5% in unseen environments, with absolute improvements of +23.6% in simulation and +30.7% on real robots over prior SoTA methods. By utilizing only sparse rewards, our approach can enable generalizing to new capabilities beyond the pretraining data with minimal human effort. Moreover, we demonstrate rapid adaptation to new embodiments and behaviors with less than a day of fine-tuning. Videos can be found on the project website at https://robot-flare.github.io/
Reactive Multi-Robot Navigation in Outdoor Environments Through Uncertainty-Aware Active Learning of Human Preference Landscape
Compared with single robots, Multi-Robot Systems (MRS) can perform missions more efficiently due to the presence of multiple members with diverse capabilities. However, deploying an MRS in wide real-world environments is still challenging due to uncertain and varied obstacles (e.g., building clusters and trees). With a limited understanding of how environmental uncertainty affects performance, an MRS cannot flexibly adjust its behaviors (e.g., teaming, load sharing, trajectory planning) to ensure both environment adaptation and task accomplishment. In this work, a novel joint preference landscape learning and behavior adjusting framework (PLBA) is designed. PLBA efficiently integrates real-time human guidance into MRS coordination and utilizes Sparse Variational Gaussian Processes with Varying Output Noise to quickly assess human preferences by leveraging spatial correlations between environment characteristics. An optimization-based behavior-adjusting method then safely adapts MRS behaviors to environments. To validate PLBA's effectiveness in MRS behavior adaptation, a flood disaster search and rescue task was designed. Twenty human users provided 1764 feedback items based on human preferences obtained from MRS behaviors related to "task quality", "task progress", and "robot safety". The prediction accuracy and adaptation speed results show the effectiveness of PLBA in preference learning and MRS behavior adaptation.
Task-driven SLAM Benchmarking ICRA2025
For assistive robots, one critical use case of SLAM is to support localization as they navigate through an environment completing tasks. Current SLAM benchmarks do not consider task-based deployments where repeatability (precision) is more critical than accuracy. To address this gap, we propose a task-driven benchmarking framework for evaluating SLAM methods. The framework accounts for SLAM's mapping capabilities, employs precision as a key metric, and has low resource requirements to implement. Testing of state-of-the-art SLAM methods in both simulated and real-world scenarios provides insights into the performance properties of modern SLAM solutions. In particular, it shows that passive stereo SLAM operates at a level of precision comparable to LiDAR-based SLAM in typical indoor environments. The benchmarking approach offers a more relevant and accurate assessment of SLAM performance in task-driven applications.
comment: 7 pages, 7 figures, 1 table. Submitted to ICRA2025
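The distinction the abstract above draws between accuracy and repeatability (precision) can be shown with a small sketch over repeated arrivals at the same goal. This uses the textbook definitions (offset of the mean from the target versus spread around the mean) and is not necessarily the metric defined in the paper; the function name and 2D setting are assumptions.

```python
import math

def accuracy_and_precision(arrivals, target):
    """Return (accuracy_error, precision_error) for repeated 2D arrivals at one goal.

    accuracy_error: distance from the mean arrival position to the target.
    precision_error: RMS distance of the arrivals from their own mean.
    """
    n = len(arrivals)
    mean = tuple(sum(p[k] for p in arrivals) / n for k in range(2))
    accuracy_error = math.dist(mean, target)
    precision_error = math.sqrt(sum(math.dist(p, mean) ** 2 for p in arrivals) / n)
    return accuracy_error, precision_error
```

A robot that always stops at the same spot one metre from the goal has poor accuracy but perfect precision, and a constant offset of that kind can be compensated for in a task, whereas scatter cannot.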
PANOS: Payload-Aware Navigation in Offroad Scenarios
Nature has evolved humans to walk on different terrains by developing a detailed understanding of their physical characteristics. Similarly, legged robots need to develop their capability to walk on complex terrains with a variety of task-dependent payloads to achieve their goals. However, conventional terrain adaptation methods are susceptible to failure with varying payloads. In this work, we introduce PANOS, a weakly supervised approach that integrates proprioception and exteroception from onboard sensing to achieve a stable gait for a legged robot walking over various terrains. Our work also provides evidence of its adaptability over varying payloads. We evaluate our method on multiple terrains and payloads using a legged robot. PANOS improves stability by up to 44% without any payload and 53% with a 15 lb payload. We also observe a 20% reduction in vibration cost with the payload across various terrain types when compared to state-of-the-art methods.
Real-World Data Inspired Interactive Connected Traffic Scenario Generation
Simulation is a crucial step in ensuring accurate, efficient, and realistic Connected and Autonomous Vehicles (CAVs) testing and validation. As the adoption of CAVs accelerates, the integration of real-world data into simulation environments becomes increasingly critical. Among various technologies utilized by CAVs, Vehicle-to-Everything (V2X) communication plays a crucial role in ensuring a seamless transmission of information between CAVs, infrastructure, and other road users. However, most existing studies have focused on developing and testing communication protocols, resource allocation strategies, and data dissemination techniques in V2X. A gap remains in integrating real-world V2X data into simulations to generate diverse and high-fidelity traffic scenarios. To fill this research gap, we leverage real-world Signal Phase and Timing (SPaT) data from Roadside Units (RSUs) to enhance the fidelity of CAV simulations. Moreover, we developed an algorithm that enables Autonomous Vehicles (AVs) to respond dynamically to real-time traffic signal data, simulating realistic V2X communication scenarios. Such high-fidelity simulation environments can generate multimodal data, including trajectory, semantic camera, depth camera, and bird's eye view data for various traffic scenarios. The generated scenarios and data provide invaluable insights into AVs' interactions with traffic infrastructure and other road users. This work aims to bridge the gap between theoretical research and practical deployment of CAVs, facilitating the development of smarter and safer transportation systems.
An Anatomy-Aware Shared Control Approach for Assisted Teleoperation of Lung Ultrasound Examinations
The introduction of artificial intelligence and robotics in telehealth is enabling personalised treatment and supporting teleoperated procedures such as lung ultrasound, which has gained attention during the COVID-19 pandemic. Although fully autonomous systems face challenges due to anatomical variability, teleoperated systems appear to be more practical in current healthcare settings. This paper presents an anatomy-aware control framework for teleoperated lung ultrasound. Using biomechanically accurate 3D models such as SMPL and SKEL, the system provides real-time visual feedback and applies virtual constraints to assist in precise probe placement tasks. Evaluations on five subjects show the accuracy of the biomechanical models and the efficiency of the system in improving probe placement and reducing procedure time compared to traditional teleoperation. The results demonstrate that the proposed framework enhances the physician's capabilities in executing remote lung ultrasound examinations, towards more objective and repeatable acquisitions.
Safe Leaf Manipulation for Accurate Shape and Pose Estimation of Occluded Fruits ICRA 2025
Fruit monitoring plays an important role in crop management, and rising global fruit consumption combined with labor shortages necessitates automated monitoring with robots. However, occlusions from plant foliage often hinder accurate shape and pose estimation. Therefore, we propose an active fruit shape and pose estimation method that physically manipulates occluding leaves to reveal hidden fruits. This paper introduces a framework that plans robot actions to maximize visibility and minimize leaf damage. We developed a novel scene-consistent shape completion technique to improve fruit estimation under heavy occlusion and utilize a perception-driven deformation graph model to predict leaf deformation during planning. Experiments on artificial and real sweet pepper plants demonstrate that our method enables robots to safely move leaves aside, exposing fruits for accurate shape and pose estimation, outperforming baseline methods. Project page: https://shaoxiongyao.github.io/lmap-ssc/.
comment: Shaoxiong Yao and Sicong Pan have equal contributions. Submitted to ICRA 2025
Decentralized Nonlinear Model Predictive Control for Safe Collision Avoidance in Quadrotor Teams with Limited Detection Range ICRA
Multi-quadrotor systems face significant challenges in decentralized control, particularly with safety and coordination under sensing and communication limitations. State-of-the-art methods leverage Control Barrier Functions (CBFs) to provide safety guarantees but often neglect actuation constraints and limited detection range. To address these gaps, we propose a novel decentralized Nonlinear Model Predictive Control (NMPC) that integrates Exponential CBFs (ECBFs) to enhance safety and optimality in multi-quadrotor systems. We provide both conservative and practical minimum bounds of the range that preserve the safety guarantees of the ECBFs. We validate our approach through extensive simulations with up to 10 quadrotors and 20 obstacles, as well as real-world experiments with 3 quadrotors. Results demonstrate the effectiveness of the proposed framework in realistic settings, highlighting its potential for reliable quadrotor team operations.
comment: 7 pages, 5 figures, Submitted to the IEEE International Conference on Robotics and Automation (ICRA) 2025
Data-driven Probabilistic Trajectory Learning with High Temporal Resolution in Terminal Airspace
Predicting flight trajectories is a research area that holds significant merit. In this paper, we propose a data-driven learning framework that leverages the predictive and feature extraction capabilities of mixture models and seq2seq-based neural networks while addressing prevalent challenges caused by error propagation and dimensionality reduction. After training with this framework, the learned model can improve long-step prediction accuracy significantly given the past trajectories and the context information. The accuracy and effectiveness of the approach are evaluated by comparing the predicted trajectories with the ground truth. The results indicate that the proposed method outperforms the state-of-the-art prediction methods on a terminal airspace flight trajectory dataset. The trajectories generated by the proposed method have a higher temporal resolution (1 timestep per second vs 0.1 timestep per second) and are closer to the ground truth.
comment: Submitted to AIAA-JAIS
SeaSplat: Representing Underwater Scenes with 3D Gaussian Splatting and a Physically Grounded Image Formation Model
We introduce SeaSplat, a method to enable real-time rendering of underwater scenes leveraging recent advances in 3D radiance fields. Underwater scenes are challenging visual environments, as rendering through a medium such as water introduces both range- and color-dependent effects on image capture. We constrain 3D Gaussian Splatting (3DGS), a recent advance in radiance fields enabling rapid training and real-time rendering of full 3D scenes, with a physically grounded underwater image formation model. Applying SeaSplat to the real-world scenes from the SeaThru-NeRF dataset, a scene collected by an underwater vehicle in the US Virgin Islands, and simulation-degraded real-world scenes, we not only see increased quantitative performance on rendering novel viewpoints from the scene with the medium present, but are also able to recover the underlying true color of the scene and restore renders to be without the presence of the intervening medium. We show that the underwater image formation model helps learn scene structure, with better depth maps, and that our improvements maintain the significant computational improvements afforded by leveraging a 3D Gaussian representation.
comment: Project page here: https://seasplat.github.io
Koopman-driven grip force prediction through EMG sensing
Loss of hand function due to conditions like stroke or multiple sclerosis significantly impacts daily activities. Robotic rehabilitation provides tools to restore hand function, while novel methods based on surface electromyography (sEMG) enable the adaptation of the device's force output according to the user's condition, thereby improving rehabilitation outcomes. This study aims to achieve accurate force estimations during medium wrap grasps using a single sEMG sensor pair, thereby addressing the challenge of escalating sensor requirements for precise predictions. We conducted sEMG measurements on 13 subjects at two forearm positions, validating results with a hand dynamometer. We established flexible signal-processing steps, yielding high peak cross-correlations between the processed sEMG signal (representing meaningful muscle activity) and grip force. Influential parameters were subsequently identified through sensitivity analysis. Leveraging a novel data-driven Koopman operator theory-based approach and problem-specific data lifting techniques, we devised a methodology for the estimation and short-term prediction of grip force from processed sEMG signals. A weighted mean absolute percentage error (wMAPE) of approx. 5.5% was achieved for the estimated grip force, whereas predictions with a 0.5-second prediction horizon resulted in a wMAPE of approx. 17.9%. The methodology proved robust regarding precise electrode positioning, as the effect of sensing position on error metrics was non-significant. The algorithm executes exceptionally fast, processing, estimating, and predicting a 0.5-second sEMG signal batch in just approx. 30 ms, facilitating real-time implementation.
comment: 11 pages, 8 figures, journal
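The abstract above reports errors as wMAPE without defining it. A minimal sketch of one common definition follows, assuming wMAPE is the sum of absolute errors divided by the sum of absolute reference values; the function name `wmape` and the sample numbers are illustrative and not taken from the paper.

```python
def wmape(actual, predicted):
    """Weighted mean absolute percentage error, in percent.

    Assumed definition: 100 * sum(|a - p|) / sum(|a|).
    The paper may use a different weighting.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total_abs_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    total_abs_actual = sum(abs(a) for a in actual)
    if total_abs_actual == 0:
        raise ValueError("wMAPE is undefined when all reference values are zero")
    return 100.0 * total_abs_error / total_abs_actual


# Hypothetical grip-force samples (arbitrary units) and their estimates.
measured = [100.0, 200.0, 300.0]
estimated = [110.0, 190.0, 300.0]
error_percent = wmape(measured, estimated)
```

Unlike a plain MAPE, this form does not blow up on samples where the measured force is close to zero, which may be why a weighted variant is preferred for grip-force signals.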
Building Real-time Awareness of Out-of-distribution in Trajectory Prediction for Autonomous Vehicles
Trajectory prediction describes the motions of surrounding moving obstacles for an autonomous vehicle; it plays a crucial role in enabling timely decision-making, such as collision avoidance and trajectory replanning. Accurate trajectory planning is the key to reliable vehicle deployments in open-world environments, where unstructured obstacles bring in uncertainties that are impossible to fully capture by training data. For traditional machine learning tasks, such uncertainties are often addressed reasonably well via methods such as continual learning. On the one hand, naively applying those methods to trajectory prediction can result in continuous data collection and frequent model updates, which can be resource-intensive. On the other hand, the predicted trajectories can be far away from the true trajectories, leading to unsafe decision-making. In this paper, we aim to establish real-time awareness of out-of-distribution in trajectory prediction for autonomous vehicles. We focus on the challenging and practically relevant setting where the out-of-distribution is deceptive, that is, one that is not easily detectable by human intuition. Drawing on the well-established techniques of sequential analysis, we build real-time awareness of out-of-distribution by monitoring prediction errors using quickest change point detection (QCD). Our solutions are lightweight and can handle the occurrence of out-of-distribution at any time during trajectory prediction inference. Experimental results on multiple real-world datasets using a benchmark trajectory prediction model demonstrate the effectiveness of our methods.
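Monitoring prediction errors with quickest change detection can be illustrated with CUSUM, a classic QCD procedure. The sketch below is an assumption-laden illustration, not the authors' method: it assumes the errors are Gaussian with a known pre-change mean, post-change mean, and standard deviation, and the function name and all numbers are hypothetical.

```python
def cusum_alarm_time(errors, pre_mean, post_mean, sigma, threshold):
    """Return the first index at which the CUSUM statistic exceeds
    `threshold`, or None if no alarm is raised.

    Assumes Gaussian errors with known means before and after the
    change and a shared standard deviation `sigma`.
    """
    stat = 0.0
    for t, e in enumerate(errors):
        # Log-likelihood ratio of the post-change vs pre-change model.
        llr = (post_mean - pre_mean) / sigma ** 2 * (e - (pre_mean + post_mean) / 2.0)
        # Resetting at zero keeps the statistic from drifting negative
        # during the in-distribution phase.
        stat = max(0.0, stat + llr)
        if stat > threshold:
            return t
    return None


# Hypothetical error stream: in-distribution for 20 steps, then shifted.
stream = [0.5] * 20 + [2.0] * 20
alarm = cusum_alarm_time(stream, pre_mean=0.5, post_mean=2.0, sigma=0.5, threshold=10.0)
```

The threshold trades detection delay against false alarms: a higher value delays the alarm after a real shift but makes spurious alarms on in-distribution data less likely.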
CROSS-GAiT: Cross-Attention-Based Multimodal Representation Fusion for Parametric Gait Adaptation in Complex Terrains
We present CROSS-GAiT, a novel algorithm for quadruped robots that uses Cross Attention to fuse terrain representations derived from visual and time-series inputs, including linear accelerations, angular velocities, and joint efforts. These fused representations are used to adjust the robot's step height and hip splay, enabling adaptive gaits that respond dynamically to varying terrain conditions. We generate these terrain representations by processing visual inputs through a masked Vision Transformer (ViT) encoder and time-series data through a dilated causal convolutional encoder. The cross-attention mechanism then selects and integrates the most relevant features from each modality, combining terrain characteristics with robot dynamics for better-informed gait adjustments. CROSS-GAiT uses the combined representation to dynamically adjust gait parameters in response to varying and unpredictable terrains. We train CROSS-GAiT on data from diverse terrains, including asphalt, concrete, brick pavements, grass, dense vegetation, pebbles, gravel, and sand. Our algorithm generalizes well and adapts to unseen environmental conditions, enhancing real-time navigation performance. CROSS-GAiT was implemented on a Ghost Robotics Vision 60 robot and extensively tested in complex terrains with high vegetation density, uneven/unstable surfaces, sand banks, deformable substrates, etc. We observe at least a 7.04% reduction in IMU energy density and a 27.3% reduction in total joint effort, which directly correlates with increased stability and reduced energy usage when compared to state-of-the-art methods. Furthermore, CROSS-GAiT demonstrates at least a 64.5% increase in success rate and a 4.91% reduction in time to reach the goal in four complex scenarios. Additionally, the learned representations perform 4.48% better than the state-of-the-art on a terrain classification task.
2024 BRAVO Challenge Track 1 1st Place Report: Evaluating Robustness of Vision Foundation Models for Semantic Segmentation
In this report, we present our solution for Track 1 of the 2024 BRAVO Challenge, where a model is trained on Cityscapes and its robustness is evaluated on several out-of-distribution datasets. Our solution leverages the powerful representations learned by vision foundation models, by attaching a simple segmentation decoder to DINOv2 and fine-tuning the entire model. This approach outperforms more complex existing approaches, and achieves 1st place in the challenge. Our code is publicly available at https://github.com/tue-mps/benchmark-vfm-ss.
comment: arXiv admin note: substantial text overlap with arXiv:2409.15107
Efficient Motion Prediction: A Lightweight & Accurate Trajectory Prediction Model With Fast Training and Inference Speed IROS 2024
For efficient and safe autonomous driving, it is essential that autonomous vehicles can predict the motion of other traffic agents. While highly accurate, current motion prediction models often impose significant challenges in terms of training resource requirements and deployment on embedded hardware. We propose a new efficient motion prediction model, which achieves highly competitive benchmark results while training only a few hours on a single GPU. Due to our lightweight architectural choices and the focus on reducing the required training resources, our model can easily be applied to custom datasets. Furthermore, its low inference latency makes it particularly suitable for deployment in autonomous applications with limited computing resources.
comment: Accepted to IROS 2024
A Learning Framework for Diverse Legged Robot Locomotion Using Barrier-Based Style Rewards
This work introduces a model-free reinforcement learning framework that enables various modes of motion (quadruped, tripod, or biped) and diverse tasks for legged robot locomotion. We employ a motion-style reward based on a relaxed logarithmic barrier function as a soft constraint, to bias the learning process toward the desired motion style, such as gait, foot clearance, joint position, or body height. The predefined gait cycle is encoded in a flexible manner, facilitating gait adjustments throughout the learning process. Extensive experiments demonstrate that KAIST HOUND, a 45 kg robotic system, can achieve biped, tripod, and quadruped locomotion using the proposed framework; quadrupedal capabilities include traversing uneven terrain, galloping at 4.67 m/s, and overcoming obstacles up to 58 cm (67 cm for HOUND2); bipedal capabilities include running at 3.6 m/s, carrying a 7.5 kg object, and ascending stairs, all performed without exteroceptive input.
comment: 7 pages, 5 figures, Videos at https://youtu.be/fYH0Dmpyybo
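A relaxed logarithmic barrier, as mentioned in the abstract above, replaces the log barrier with a quadratic extension near and beyond the constraint boundary so that the penalty stays finite. The sketch below uses the standard quadratic relaxation from the optimization literature; how the paper turns it into a reward is not stated in the abstract, so `style_reward`, its bounds, and `delta` are hypothetical.

```python
import math


def relaxed_log_barrier(z, delta):
    """Relaxed log barrier for the constraint z > 0.

    Equals -log(z) for z > delta and a quadratic extension otherwise,
    matched in value and slope at z = delta.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if z > delta:
        return -math.log(z)
    return 0.5 * (((z - 2.0 * delta) / delta) ** 2 - 1.0) - math.log(delta)


def style_reward(value, lower, upper, delta=0.1):
    """Hypothetical soft-constraint reward keeping `value` in [lower, upper].

    Highest near the middle of the interval, finite but strongly
    negative once a bound is violated.
    """
    return -(relaxed_log_barrier(value - lower, delta)
             + relaxed_log_barrier(upper - value, delta))


inside = style_reward(0.5, 0.0, 1.0)    # well inside the bounds
violated = style_reward(-1.0, 0.0, 1.0)  # lower bound violated
```

Because the penalty is finite outside the bounds, a policy that starts far from the desired style still receives a usable learning signal, which is the usual motivation for a soft rather than hard constraint.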
Precision Aquaculture: An Integrated Computer Vision and IoT Approach for Optimized Tilapia Feeding
Traditional fish farming practices often lead to inefficient feeding, resulting in environmental issues and reduced productivity. We developed an innovative system combining computer vision and IoT technologies for precise Tilapia feeding. Our solution uses real-time IoT sensors to monitor water quality parameters and computer vision algorithms to analyze fish size and count, determining optimal feed amounts. A mobile app enables remote monitoring and control. We utilized YOLOv8 for keypoint detection to measure Tilapia weight from length, achieving 94% precision on 3,500 annotated images. Pixel-based measurements were converted to centimeters using depth estimation for accurate feeding calculations. Our method, with data collection mirroring inference conditions, significantly improved results. Preliminary estimates suggest this approach could increase production up to 58 times compared to traditional farms. Our models, code, and dataset are open-source (the code, dataset, and models are available upon reasonable request).
comment: 8 pages, 6 figures, 3 tables, 21st International Conference on Informatics in Control, Automation, and Robotics
COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models ICRA
Leveraging the powerful reasoning capabilities of large language models (LLMs), recent LLM-based robot task planning methods yield promising results. However, they mainly focus on single or multiple homogeneous robots on simple tasks. Practically, complex long-horizon tasks always require collaborations among multiple heterogeneous robots especially with more complex action spaces, which makes these tasks more challenging. To this end, we propose COHERENT, a novel LLM-based task planning framework for collaboration of heterogeneous multi-robot systems including quadrotors, robotic dogs, and robotic arms. Specifically, a Proposal-Execution-Feedback-Adjustment (PEFA) mechanism is designed to decompose and assign actions for individual robots, where a centralized task assigner makes a task planning proposal to decompose the complex task into subtasks, and then assigns subtasks to robot executors. Each robot executor selects a feasible action to implement the assigned subtask and reports self-reflection feedback to the task assigner for plan adjustment. The PEFA loops until the task is completed. Moreover, we create a challenging heterogeneous multi-robot task planning benchmark encompassing 100 complex long-horizon tasks. The experimental results show that our work surpasses the previous methods by a large margin in terms of success rate and execution efficiency. The experimental videos, code, and benchmark are released at https://github.com/MrKeee/COHERENT.
comment: 7 pages, 5 figures. Submitted to IEEE International Conference on Robotics and Automation (ICRA), 2025
LingoQA: Video Question Answering for Autonomous Driving ECCV 2024
We introduce LingoQA, a novel dataset and benchmark for visual question answering in autonomous driving. The dataset contains 28K unique short video scenarios and 419K annotations. Evaluating state-of-the-art vision-language models on our benchmark shows that their performance is below human capabilities, with GPT-4V responding truthfully to 59.6% of the questions compared to 96.6% for humans. For evaluation, we propose a truthfulness classifier, called Lingo-Judge, that achieves a 0.95 Spearman correlation coefficient to human evaluations, surpassing existing techniques like METEOR, BLEU, CIDEr, and GPT-4. We establish a baseline vision-language model and run extensive ablation studies to understand its performance. We release our dataset and benchmark at https://github.com/wayveai/LingoQA as an evaluation platform for vision-language models in autonomous driving.
comment: Accepted to ECCV 2024. Benchmark and dataset are available at https://github.com/wayveai/LingoQA/
Learning to Walk and Fly with Adversarial Motion Priors IROS
Robot multimodal locomotion encompasses the ability to transition between walking and flying, representing a significant challenge in robotics. This work presents an approach that enables automatic smooth transitions between legged and aerial locomotion. Leveraging the concept of Adversarial Motion Priors, our method allows the robot to imitate motion datasets and accomplish the desired task without the need for complex reward functions. The robot learns walking patterns from human-like gaits and aerial locomotion patterns from motions obtained using trajectory optimization. Through this process, the robot adapts the locomotion scheme based on environmental feedback using reinforcement learning, with the spontaneous emergence of mode-switching behavior. The results highlight the potential for achieving multimodal locomotion in aerial humanoid robotics through automatic control of walking and flying modes, paving the way for applications in diverse domains such as search and rescue, surveillance, and exploration missions. This research contributes to advancing the capabilities of aerial humanoid robots in terms of versatile locomotion in various environments.
comment: This paper has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Abu Dhabi, 2024
DroneWiS: Automated Simulation Testing of small Unmanned Aerial Systems in Realistic Windy Conditions
The continuous evolution of small Unmanned Aerial Systems (sUAS) demands advanced testing methodologies to ensure their safe and reliable operation in the real world. To push the boundaries of sUAS simulation testing in realistic environments, we previously developed the DroneReqValidator (DRV) platform, allowing developers to automatically conduct simulation testing in a digital twin of the earth. In this paper, we present DRV 2.0, which introduces a novel component called DroneWiS (Drone Wind Simulation). DroneWiS allows sUAS developers to automatically simulate realistic windy conditions and test the resilience of sUAS against wind. Unlike current state-of-the-art simulation tools such as Gazebo and AirSim that only simulate basic wind conditions, DroneWiS leverages Computational Fluid Dynamics (CFD) to compute the unique wind flows caused by the interaction of wind with objects in the environment, such as buildings and uneven terrain. This simulation capability provides developers with deeper insights into the navigation capability of sUAS in challenging and realistic windy conditions. DroneWiS equips sUAS developers with a powerful tool to test, debug, and improve the reliability and safety of sUAS in the real world. A working demonstration is available at https://youtu.be/khBHEBST8Wc
RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos ECCV 2024
Procedure planning in instructional videos entails generating a sequence of action steps based on visual observations of the initial and target states. Despite the rapid progress in this task, there remain several critical challenges to be solved: (1) Adaptive procedures: Prior works hold an unrealistic assumption that the number of action steps is known and fixed, leading to non-generalizable models in real-world scenarios where the sequence length varies. (2) Temporal relation: Understanding the step temporal relation knowledge is essential in producing reasonable and executable plans. (3) Annotation cost: Annotating instructional videos with step-level labels (i.e., timestamp) or sequence-level labels (i.e., action category) is demanding and labor-intensive, limiting its generalizability to large-scale datasets. In this work, we propose a new and practical setting, called adaptive procedure planning in instructional videos, where the procedure length is not fixed or pre-determined. To address these challenges, we introduce the Retrieval-Augmented Planner (RAP) model. Specifically, for adaptive procedures, RAP adaptively determines the conclusion of actions using an auto-regressive model architecture. For temporal relation, RAP establishes an external memory module to explicitly retrieve the most relevant state-action pairs from the training videos and revises the generated procedures. To tackle high annotation cost, RAP uses weakly supervised learning to expand the training dataset to other task-relevant, unannotated videos by generating pseudo labels for action steps. Experiments on the CrossTask and COIN benchmarks show the superiority of RAP over traditional fixed-length models, establishing it as a strong baseline solution for adaptive procedure planning.
comment: Accepted in ECCV 2024
Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning
Multi-UAV pursuit-evasion, where pursuers aim to capture evaders, poses a key challenge for UAV swarm intelligence. Multi-agent reinforcement learning (MARL) has demonstrated potential in modeling cooperative behaviors, but most RL-based approaches remain constrained to simplified simulations with limited dynamics or fixed scenarios. Previous attempts to deploy RL policy to real-world pursuit-evasion are largely restricted to two-dimensional scenarios, such as ground vehicles or UAVs at fixed altitudes. In this paper, we address multi-UAV pursuit-evasion by considering UAV dynamics and physical constraints. We introduce an evader prediction-enhanced network to tackle partial observability in cooperative strategy learning. Additionally, we propose an adaptive environment generator within MARL training, enabling higher exploration efficiency and better policy generalization across diverse scenarios. Simulations show our method significantly outperforms all baselines in challenging scenarios, generalizing to unseen scenarios with a 100% capture rate. Finally, we derive a feasible policy via a two-stage reward refinement and deploy the policy on real quadrotors in a zero-shot manner. To our knowledge, this is the first work to derive and deploy an RL-based policy using collective thrust and body rates control commands for multi-UAV pursuit-evasion in unknown environments. The open-source code and videos are available at https://sites.google.com/view/pursuit-evasion-rl.
Event-Free Moving Object Segmentation from Moving Ego Vehicle
Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving, especially for sequences obtained from moving ego vehicles. Most segmentation methods leverage motion cues obtained from optical flow maps. However, since these methods are often based on optical flows that are pre-computed from successive RGB frames, this neglects the temporal consideration of events occurring within the inter-frame, consequently constraining its ability to discern objects exhibiting relative staticity but genuinely in motion. To address these limitations, we propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow. To foster research in this area, we first introduce a novel large-scale dataset called DSEC-MOS for moving object segmentation from moving ego vehicles, which is the first of its kind. For benchmarking, we select various mainstream methods and rigorously evaluate them on our dataset. Subsequently, we devise EmoFormer, a novel network able to exploit the event data. For this purpose, we fuse the event temporal prior with spatial semantic maps to distinguish genuinely moving objects from the static background, adding another level of dense supervision around our object of interest. Our proposed network relies only on event data for training but does not require event input during inference, making it directly comparable to frame-only methods in terms of efficiency and more widely usable in many application cases. The exhaustive comparison highlights a significant performance improvement of our method over all other methods. The source code and dataset are publicly available at: https://github.com/ZZY-Zhou/DSEC-MOS.
Mamba as a motion encoder for robotic imitation learning
Recent advancements in imitation learning, particularly with the integration of LLM techniques, are set to significantly improve robots' dexterity and adaptability. This paper proposes using Mamba, a state-of-the-art architecture with potential applications in LLMs, for robotic imitation learning, highlighting its ability to function as an encoder that effectively captures contextual information. By reducing the dimensionality of the state space, Mamba operates similarly to an autoencoder. It effectively compresses the sequential information into state variables while preserving the essential temporal dynamics necessary for accurate motion prediction. Experimental results in tasks such as cup placing and case loading demonstrate that despite exhibiting higher estimation errors, Mamba achieves superior success rates compared to Transformers in practical task execution. This performance is attributed to Mamba's structure, which encompasses the state space model. Additionally, the study investigates Mamba's capacity to serve as a real-time motion generator with a limited amount of training data.
comment: 8 pages, 9 figures
An explicit construction of Kaleidocycles by elliptic theta functions
We consider the configuration space of points on the two-dimensional sphere that satisfy a specific system of quadratic equations. We construct periodic orbits in this configuration space using elliptic theta functions and show that they satisfy semi-discrete analogues of mKdV and sine-Gordon equations. The configuration space we investigate corresponds to the state space of a linkage mechanism known as the Kaleidocycle, and the constructed orbits describe the characteristic motion of the Kaleidocycle. Our approach is founded on the relationship between the deformation of spatial curves and integrable systems, offering an intriguing example where an integrable system generates an orbit in the space of real solutions to polynomial equations defined by geometric constraints.
Cosserat Rods for Modeling Tendon-Driven Robotic Catheter Systems
Tendon-driven robotic catheters are capable of precise execution of minimally invasive cardiac procedures including ablations and imaging. These procedures require accurate mathematical models of not only the catheter and tendons but also their interactions with surrounding tissue and vasculature in order to control the robot path and interaction. This paper presents a mechanical model of a tendon-driven robotic catheter system based on Cosserat rods and integrated with a stable, implicit Euler scheme. We implement the Cosserat rod as a model for a simple catheter centerline and validate its physical accuracy against a large deformation analytical model and experimental data. The catheter model is then supplemented by adding a second Cosserat rod to model a single tendon, using penalty forces to define the constraints of the tendon-catheter system. All the model parameters are defined by the catheter properties established by the design. The combined model is validated against experimental data to confirm its physical accuracy. This model represents a new contribution to the field of robotic catheter modeling in which both the tendons and catheter are modeled by mechanical Cosserat rods and fully-validated against experimental data in the case of the single rod system.
comment: 24 pages, 23 figures
TempFuser: Learning Agile, Tactical, and Acrobatic Flight Maneuvers Using a Long Short-Term Temporal Fusion Transformer
Dogfighting is a challenging scenario in aerial applications that requires a comprehensive understanding of both strategic maneuvers and the aerodynamics of agile aircraft. The aerial agent needs to not only understand tactically evolving maneuvers of fighter jets from a long-term perspective but also react to rapidly changing aerodynamics of aircraft from a short-term viewpoint. In this paper, we introduce TempFuser, a novel long short-term temporal fusion transformer architecture that can learn agile, tactical, and acrobatic flight maneuvers in complex dogfight problems. Our approach integrates two distinct temporal transition embeddings into a transformer-based network to comprehensively capture both the long-term tactics and short-term agility of aerial agents. By incorporating these perspectives, our policy network generates end-to-end flight commands that secure dominant positions over the long term and effectively outmaneuver agile opponents. After training in a high-fidelity flight simulator, our model successfully learns to execute strategic maneuvers, outperforming baseline policy models against various types of opponent aircraft. Notably, our model exhibits human-like acrobatic maneuvers even when facing adversaries with superior specifications, all without relying on prior knowledge. Moreover, it demonstrates robust pursuit performance in challenging supersonic and low-altitude situations. Demo videos are available at https://sites.google.com/view/tempfuser.
comment: 8 pages, 7 figures. Accepted for publication in IEEE Robotics and Automation Letters (RA-L). Copyright 2024 IEEE. Personal use is permitted. For other uses, permission from IEEE is required
ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots
To substantially enhance robot intelligence, there is a pressing need to develop a large model that enables general-purpose robots to proficiently undertake a broad spectrum of manipulation tasks, akin to the versatile task-planning ability exhibited by LLMs. The vast diversity in objects, robots, and manipulation tasks presents huge challenges. Our work introduces a comprehensive framework to develop a foundation model for general robotic manipulation that formalizes a manipulation task as contact synthesis. Specifically, our model takes as input object and robot manipulator point clouds, object physical attributes, target motions, and manipulation region masks. It outputs contact points on the object and associated contact forces or post-contact motions for robots to achieve the desired manipulation task. We perform extensive experiments both in the simulation and real-world settings, manipulating articulated rigid objects, rigid objects, and deformable objects that vary in dimensionality, ranging from one-dimensional objects like ropes to two-dimensional objects like cloth and extending to three-dimensional objects such as plasticine. Our model achieves average success rates of around 90%. Supplementary materials and videos are available on our project website at https://manifoundationmodel.github.io/.
EF-Calib: Spatiotemporal Calibration of Event- and Frame-Based Cameras Using Continuous-Time Trajectories
The event camera, a bio-inspired asynchronously triggered camera, offers promising prospects for fusion with frame-based cameras owing to its low latency and high dynamic range. However, calibrating stereo vision systems that incorporate both event and frame-based cameras remains a significant challenge. In this letter, we present EF-Calib, a spatiotemporal calibration framework for event- and frame-based cameras using continuous-time trajectories. A novel calibration pattern applicable to both camera types and the corresponding event recognition algorithm are proposed. Leveraging the asynchronous nature of events, a differentiable piece-wise B-spline to represent camera pose continuously is introduced, enabling calibration for intrinsic parameters, extrinsic parameters, and time offset, with analytical Jacobians provided. Various experiments are carried out to evaluate the calibration performance of EF-Calib, including calibration experiments for intrinsic parameters, extrinsic parameters, and time offset. Experimental results show that EF-Calib achieves the most accurate intrinsic parameters compared to current SOTA, extrinsic parameter accuracy close to the frame-based results, and accurate time offset estimation. EF-Calib provides a convenient and accurate toolbox for calibrating systems that fuse events and frames. The code of this paper will also be open-sourced at: https://github.com/wsakobe/EF-Calib.
comment: Accepted by IEEE Robotics and Automation Letters
D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation
Depth sensing is an important problem for 3D vision-based robotics. Yet, a real-world active stereo or ToF depth camera often produces noisy and incomplete depth, which bottlenecks robot performance. In this work, we propose D3RoMa, a learning-based depth estimation framework on stereo image pairs that predicts clean and accurate depth in diverse indoor scenes, even in the most challenging scenarios with translucent or specular surfaces where classical depth sensing completely fails. Key to our method is that we unify depth estimation and restoration into an image-to-image translation problem by predicting the disparity map with a denoising diffusion probabilistic model. At inference time, we further incorporate a left-right consistency constraint as classifier guidance to the diffusion process. Our framework combines recently advanced learning-based approaches and geometric constraints from traditional stereo vision. For model training, we create a large scene-level synthetic dataset with diverse transparent and specular objects to compensate for existing tabletop datasets. The trained model can be directly applied to real-world in-the-wild scenes and achieves state-of-the-art performance in multiple public depth estimation benchmarks. Further experiments in real environments show that accurate depth prediction significantly improves robotic manipulation in various scenarios.
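The two geometric ingredients named in the abstract above, disparity and left-right consistency, can be illustrated with textbook rectified-stereo relations. The sketch below is not the authors' diffusion guidance: it only shows the standard conversion depth = focal length × baseline / disparity and a per-scanline consistency check, with hypothetical function names, tolerance, and camera values.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair; None if disparity is invalid."""
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


def left_right_consistent(left_disp, right_disp, tol=1.0):
    """Per-pixel consistency flags for one rectified scanline.

    A left pixel x with disparity d should map to right pixel x - d,
    whose own disparity should agree with d within `tol` pixels.
    """
    width = len(left_disp)
    flags = []
    for x, d in enumerate(left_disp):
        xr = int(round(x - d))
        ok = 0 <= xr < width and abs(right_disp[xr] - d) <= tol
        flags.append(ok)
    return flags


# Hypothetical camera: 500 px focal length, 10 cm baseline.
depth_m = disparity_to_depth(50.0, focal_px=500.0, baseline_m=0.1)
flags = left_right_consistent([2, 2, 2, 2], [2, 2, 2, 2])
```

Pixels that fail the check typically correspond to occlusions or mismatches; a guidance term would penalize such disagreement during sampling rather than simply mask it, which this sketch does not attempt.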
MHRC: Closed-loop Decentralized Multi-Heterogeneous Robot Collaboration with Large Language Models
The integration of large language models (LLMs) with robotics has significantly advanced robots' abilities in perception, cognition, and task planning. The use of natural language interfaces offers a unified approach for expressing the capability differences of heterogeneous robots, facilitating communication between them, and enabling seamless task allocation and collaboration. Currently, the utilization of LLMs to achieve decentralized multi-heterogeneous robot collaborative tasks remains an under-explored area of research. In this paper, we introduce a novel framework that utilizes LLMs to achieve decentralized collaboration among multiple heterogeneous robots. Our framework supports three robot categories, mobile robots, manipulation robots, and mobile manipulation robots, working together to complete tasks such as exploration, transportation, and organization. We developed a rich set of textual feedback mechanisms and chain-of-thought (CoT) prompts to enhance task planning efficiency and overall system performance. The mobile manipulation robot can adjust its base position flexibly, ensuring optimal conditions for grasping tasks. The manipulation robot can comprehend task requirements, seek assistance when necessary, and handle objects appropriately. Meanwhile, the mobile robot can explore the environment extensively, map object locations, and communicate this information to the mobile manipulation robot, thus improving task execution efficiency. We evaluated the framework using PyBullet, creating scenarios with three different room layouts and three distinct operational tasks. We tested various LLM models and conducted ablation studies to assess the contributions of different modules. The experimental results confirm the effectiveness and necessity of our proposed framework.
Design, Integration, and Field Evaluation of a Robotic Blossom Thinning System for Tree Fruit Crops
The US apple industry relies heavily on a semi-skilled manual labor force for essential field operations such as training, pruning, blossom and green fruit thinning, and harvesting. Blossom thinning is one of the crucial crop load management practices for achieving the desired crop load, fruit quality, and return bloom. While several techniques, such as chemical and mechanical thinning, are available for large-scale blossom thinning, such approaches often yield unpredictable thinning results and may damage the canopy, spurs, and leaf tissue. Hence, growers still depend on labor-intensive and expensive manual blossom thinning for the desired thinning outcomes. This research presents a robotic solution for blossom thinning in apple orchards using a computer vision system with artificial intelligence, a six-degree-of-freedom robotic manipulator, and an electrically actuated miniature end-effector. The integrated robotic system was evaluated in a commercial apple orchard and showed promising results for targeted and selective blossom thinning. Two thinning approaches, center and boundary thinning, were investigated to evaluate the system's ability to remove varying proportions of flowers from apple flower clusters. During boundary thinning, the end-effector was actuated around the cluster boundary, while center thinning involved end-effector actuation only at the cluster centroid for a fixed duration of 2 seconds. The boundary thinning approach thinned 67.2% of flowers from the targeted clusters with a cycle time of 9.0 seconds per cluster, whereas the center thinning approach thinned 59.4% of flowers with a cycle time of 7.2 seconds per cluster. When commercially adopted, the proposed system could help address problems faced by apple growers with current hand, chemical, and mechanical blossom thinning approaches.
comment: Accepted for publication in the Journal of Field Robotics
Multiagent Systems
Offline and Distributional Reinforcement Learning for Radio Resource Management
Reinforcement learning (RL) has proved to have a promising role in future intelligent wireless networks. Online RL has been adopted for radio resource management (RRM), taking over traditional schemes. However, due to its reliance on online interaction with the environment, its role becomes limited in practical, real-world problems where online interaction is not feasible. In addition, traditional RL falls short in the face of the uncertainties and risks in real-world stochastic environments. To address this, we propose an offline and distributional RL scheme for the RRM problem, enabling offline training using a static dataset without any interaction with the environment and accounting for the sources of uncertainty using the distributions of the return. Simulation results demonstrate that the proposed scheme outperforms conventional resource management models. In addition, it is the only scheme that surpasses online RL, achieving a $16 \%$ gain over it.
Decentralized Nonlinear Model Predictive Control for Safe Collision Avoidance in Quadrotor Teams with Limited Detection Range ICRA
Multi-quadrotor systems face significant challenges in decentralized control, particularly with safety and coordination under sensing and communication limitations. State-of-the-art methods leverage Control Barrier Functions (CBFs) to provide safety guarantees but often neglect actuation constraints and limited detection range. To address these gaps, we propose a novel decentralized Nonlinear Model Predictive Control (NMPC) that integrates Exponential CBFs (ECBFs) to enhance safety and optimality in multi-quadrotor systems. We provide both conservative and practical minimum bounds of the range that preserve the safety guarantees of the ECBFs. We validate our approach through extensive simulations with up to 10 quadrotors and 20 obstacles, as well as real-world experiments with 3 quadrotors. Results demonstrate the effectiveness of the proposed framework in realistic settings, highlighting its potential for reliable quadrotor team operations.
comment: 7 pages, 5 figures, Submitted to the IEEE International Conference on Robotics and Automation (ICRA) 2025
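To make the ECBF idea in the abstract above concrete, here is a minimal Python sketch of an exponential control barrier function check for collision avoidance. It assumes simplified double-integrator relative dynamics and a squared-distance barrier, which are illustrative choices and not the paper's quadrotor model; the function name and the gains `k0`, `k1` are hypothetical.

```python
def ecbf_residual(p_rel, v_rel, a_rel, safe_radius, k0=1.0, k1=2.0):
    """Left-hand side of the ECBF condition h_ddot + k1*h_dot + k0*h >= 0.

    Barrier: h = ||p_rel||^2 - safe_radius^2 (positive when outside the
    safety sphere). With double-integrator dynamics the acceleration appears
    in the second derivative of h, so the barrier has relative degree two.
    The gains are assumed to place the roots of s^2 + k1*s + k0 on the
    negative real axis.
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    h = dot(p_rel, p_rel) - safe_radius ** 2
    h_dot = 2.0 * dot(p_rel, v_rel)
    h_ddot = 2.0 * dot(v_rel, v_rel) + 2.0 * dot(p_rel, a_rel)
    return h_ddot + k1 * h_dot + k0 * h
```

A non-negative residual means the candidate acceleration satisfies the safety condition. In a predictive controller, an inequality of this kind would be imposed as a constraint at each step of the horizon.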
Language Grounded Multi-agent Communication for Ad-hoc Teamwork
Multi-Agent Reinforcement Learning (MARL) methods have shown promise in enabling agents to learn a shared communication protocol from scratch and accomplish challenging team tasks. However, the learned language is usually not interpretable to humans or other agents not co-trained together, limiting its applicability in ad-hoc teamwork scenarios. In this work, we propose a novel computational pipeline that aligns the communication space between MARL agents with an embedding space of human natural language by grounding agent communications on synthetic data generated by embodied Large Language Models (LLMs) in interactive teamwork scenarios. Our results demonstrate that introducing language grounding not only maintains task performance but also accelerates the emergence of communication. Furthermore, the learned communication protocols exhibit zero-shot generalization capabilities in ad-hoc teamwork scenarios with unseen teammates and novel task states. This work presents a significant step toward enabling effective communication and collaboration between artificial agents and humans in real-world teamwork settings.
comment: Accepted to NeurIPS 2024, 16 pages, 3 figures
Grounded Predictions of Teamwork as a One-Shot Game: A Multiagent Multi-Armed Bandits Approach
Humans possess innate collaborative capacities. However, effective teamwork often remains challenging. This study delves into the feasibility of collaboration within teams of rational, self-interested agents who engage in teamwork without the obligation to contribute. Drawing from psychological and game theoretical frameworks, we formalise teamwork as a one-shot aggregative game, integrating insights from Steiner's theory of group productivity. We characterise this novel game's Nash equilibria and propose a multiagent multi-armed bandit system that learns to converge to approximations of such equilibria. Our research contributes value to the areas of game theory and multiagent systems, paving the way for a better understanding of voluntary collaborative dynamics. We examine how team heterogeneity, task typology, and assessment difficulty influence agents' strategies and resulting teamwork outcomes. Finally, we empirically study the behaviour of work teams under incentive systems that defy analytical treatment. Our agents demonstrate human-like behaviour patterns, corroborating findings from social psychology research.
Plurals: A System for Guiding LLMs Via Simulated Social Ensembles
Recent debates raised concerns that language models may favor certain viewpoints. But what if the solution is not to aim for a 'view from nowhere' but rather to leverage different viewpoints? We introduce Plurals, a system and Python library for pluralistic AI deliberation. Plurals consists of Agents (LLMs, optionally with personas) which deliberate within customizable Structures, with Moderators overseeing deliberation. Plurals is a generator of simulated social ensembles. Plurals integrates with government datasets to create nationally representative personas, includes deliberation templates inspired by democratic deliberation theory, and allows users to customize both information-sharing structures and deliberation behavior within Structures. Six case studies demonstrate fidelity to theoretical constructs and efficacy. Three randomized experiments show simulated focus groups produced output resonant with an online sample of the relevant audiences (chosen over zero-shot generation in 75% of trials). Plurals is both a paradigm and a concrete system for pluralistic AI. The Plurals library is available at https://github.com/josh-ashkinaze/plurals and will be continually updated.
Opponent Shaping for Antibody Development
Anti-viral therapies are typically designed to target the current strains of a virus. Game theoretically, this corresponds to a short-sighted, or myopic, response. However, therapy-induced selective pressures act on viral antigens to drive the emergence of mutated strains, against which initial therapies have reduced efficacy. Building on a computational model of binding between antibodies and viral antigens (the Absolut! framework), we design and implement a genetic simulation of such viral evolutionary escape. Crucially, this allows our antibody optimisation algorithm to consider and influence the entire escape curve of the virus, i.e. to guide (or ''shape'') the viral evolution. This is inspired by opponent shaping which, in general-sum learning, accounts for the adaptation of the co-player rather than playing a myopic best response. Hence we call the optimised antibodies shapers. Within our simulations, we demonstrate that our shapers target both current and simulated future viral variants, outperforming the antibodies chosen in a myopic way. Furthermore, we show that shapers exert specific evolutionary pressure on the virus compared to myopic antibodies. Altogether, shapers modify the evolutionary trajectories of viral strains and minimise the viral escape compared to their myopic counterparts. While this is a simplified model, we hope that our proposed paradigm will enable the discovery of better long-lived vaccines and antibody therapies in the future, enabled by rapid advancements in the capabilities of simulation tools. Our code is available at https://github.com/olakalisz/antibody-shapers.
comment: Preprint
Extending Stable and Popular Matching Algorithms from Bipartite to Arbitrary Instances
We consider stable and popular matching problems in arbitrary graphs, which are referred to as stable roommates instances. We extend the 3/2-approximation algorithm for the maximum size weakly stable matching problem to the roommates case, which solves an open question of Irving and Manlove, more than 20 years old, about the approximability of maximum size weakly stable matchings in roommates instances with ties [Irving and Manlove 2002] and has nice applications for the problem of matching residents to hospitals in the presence of couples. We also extend the algorithm that finds a maximum size popular matching in bipartite graphs in the case of strict preferences and the algorithm to find a popular matching among maximum weight matchings. While previous attempts to extend the idea of promoting the agents or duplicating the edges from bipartite instances to arbitrary ones failed, these results show that with the help of a simple observation, we can indeed bridge the gap and extend these algorithms.
MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale
Multi-agent pathfinding (MAPF) is a challenging computational problem that typically requires finding collision-free paths for multiple agents in a shared environment. Solving MAPF optimally is NP-hard, yet efficient solutions are critical for numerous applications, including automated warehouses and transportation systems. Recently, learning-based approaches to MAPF have gained attention, particularly those leveraging deep reinforcement learning. Following current trends in machine learning, we have created a foundation model for MAPF problems called MAPF-GPT. Using imitation learning, we have trained a policy on a set of pre-collected sub-optimal expert trajectories that can generate actions in conditions of partial observability without additional heuristics, reward functions, or communication with other agents. The resulting MAPF-GPT model demonstrates zero-shot learning abilities when solving MAPF problem instances that were not present in the training dataset. We show that MAPF-GPT notably outperforms the current best-performing learnable MAPF solvers on a diverse range of problem instances and is computationally efficient in inference mode.
Towards Autonomous Supply Chains: Definition, Characteristics, Conceptual Framework, and Autonomy Levels
Recent global disruptions, such as the pandemic and geopolitical conflicts, have profoundly exposed vulnerabilities in traditional supply chains, requiring exploration of more resilient alternatives. Autonomous supply chains (ASCs) have emerged as a potential solution, offering increased visibility, flexibility, and resilience in turbulent trade environments. Despite discussions in industry and academia over several years, ASCs lack well-established theoretical foundations. This paper addresses this research gap by presenting a formal definition of ASC along with its defining characteristics and auxiliary concepts. We propose a layered conceptual framework called the MIISI model. An illustrative case study focusing on the meat supply chain demonstrates an initial ASC implementation based on this conceptual model. Additionally, we introduce a seven-level supply chain autonomy reference model, delineating a trajectory towards achieving full supply chain autonomy. Recognising that this work represents an initial endeavour, we emphasise the need for continued exploration in this emerging domain. We anticipate that this work will stimulate further research, both theoretical and technical, and contribute to the continual evolution of ASCs.
comment: This paper includes 19 pages and 8 figures and has been accepted for publication in the Journal of Industrial Information Integration
Systems and Control (CS)
Learning with Dynamics: Autonomous Regulation of UAV Based Communication Networks with Dynamic UAV Crew
Unmanned Aerial Vehicle (UAV) based communication networks (UCNs) are a key component in future mobile networking. To handle the dynamic environments in UCNs, reinforcement learning (RL) has been a promising solution owing to its strong capability for adaptive decision-making without environment models. However, most existing RL-based research focuses on control strategy design assuming a fixed set of UAVs. Few works have investigated how UCNs should be adaptively regulated when the serving UAVs change dynamically. This article discusses RL-based strategy design for adaptive UCN regulation given a dynamic UAV set, addressing both reactive strategies in general UCNs and proactive strategies in solar-powered UCNs. An overview of the UCN and the RL framework is first provided. Potential research directions with key challenges and possible solutions are then elaborated. Some of our recent works are presented as case studies to inspire innovative ways to handle a dynamic UAV crew with different RL algorithms.
comment: 7 pages, 6 figures, magazine paper
Complex-Phase, Data-Driven Identification of Grid-Forming Inverter Dynamics
The increasing integration of renewable energy sources (RESs) into power systems requires the deployment of grid-forming inverters to ensure a stable operation. Accurate modeling of these devices is necessary. In this paper, a system identification approach to obtain low-dimensional models of grid-forming inverters is presented. The proposed approach is based on a Hammerstein-Wiener parametrization of the normal-form model. The normal-form is a gray-box model that utilizes complex frequency and phase to capture non-linear inverter dynamics. The model is validated on two well-known control strategies: droop-control and dispatchable virtual oscillators. Simulations and hardware-in-the-loop experiments demonstrate that the normal-form accurately models inverter dynamics across various operating conditions. The approach shows great potential for enhancing the modeling of RES-dominated power systems, especially when component models are unavailable or computationally expensive.
Towards human-like kinematics in industrial robotic arms: a case study on a UR3 robot
Safety in industrial robotic environments is a hot research topic in the area of human-robot interaction (HRI). Until now, robotic arms on assembly lines have interacted with other machines, away from human workers. Nowadays, robotic arm manufacturers aim for their robots to increasingly perform tasks in collaboration with humans. One way to improve this collaboration is to make the movement of robots more human-like. This way, it would be easier for a human to foresee the movement of the robot and approach it without fear of contact. The main difference between the movement of a human and that of a robotic arm is that the former has a bell-shaped speed profile while the latter has a uniform one. To generate this speed profile, the kinematic theory of rapid human movements and its Sigma-Lognormal model have been used. This model is widely used to explain most of the basic phenomena related to the control of human movements. Both human-like and robotic-like movements are transferred to the UR3 robot. In this paper, we detail how the UR3 robot was programmed to produce both kinds of movement. The dissimilarity results between the input motion and the output motion of the robot confirm that it is possible to produce human-like velocities on the UR3 robot.
comment: 6 pages, 5 figures
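The bell-shaped speed profile described in the abstract above can be sketched in Python with a single lognormal stroke, the basic building block of the Sigma-Lognormal model. This is a minimal illustration, not the authors' UR3 implementation; the parameter values (`D`, `t0`, `mu`, `sigma`) are arbitrary assumptions, and a full model would sum several such strokes vectorially.

```python
import math


def lognormal_speed(t, D=1.0, t0=0.0, mu=-1.0, sigma=0.3):
    """Speed of one lognormal stroke at time t (zero at or before onset t0).

    D scales the stroke (it equals the total distance covered), t0 is the
    onset time, and mu and sigma are the log-time delay and spread.
    """
    if t <= t0:
        return 0.0
    x = t - t0
    coeff = D / (sigma * math.sqrt(2.0 * math.pi) * x)
    return coeff * math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2))


def speed_profile(duration=1.5, steps=300, **params):
    """Sample the stroke on a uniform time grid as (time, speed) pairs."""
    dt = duration / steps
    return [(i * dt, lognormal_speed(i * dt, **params)) for i in range(steps + 1)]
```

The resulting curve rises quickly, peaks at `t0 + exp(mu - sigma**2)`, and decays with a longer tail, in contrast to the flat cruise phase of a uniform-speed trajectory.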
Generic Diagonalizability, Structural Functional Observability and Output Controllability
This paper investigates the structural functional observability (SFO) and structural output controllability (SOC) of a class of systems with generically diagonalizable state matrices and explores the associated minimal sensor and actuator placement problems. The verification of SOC and the corresponding sensor and actuator placement problems, i.e., the problems of determining the minimum number of outputs and inputs required to achieve SFO and SOC, respectively, are yet open for general systems, which motivates our focus on a class of systems enabling polynomial-time solutions. In this line, we first define and characterize generically diagonalizable systems, referring to structured systems for which almost all realizations of the state matrices are diagonalizable. We then develop computationally efficient criteria for SFO and SOC within the context of generically diagonalizable systems. Our work expands the class of systems amenable to polynomial-time SOC verification. Thanks to the simplicity of the obtained criteria, we derive closed-form solutions for determining the minimal sensor placement to achieve SFO and the minimal actuator deployment to achieve SOC in such systems, along with efficient weighted maximum matching based and weighted maximum flow based algorithms. For more general systems to achieve SFO, an upper bound is given by identifying a non-decreasing property of SFO with respect to a specific class of edge additions, which is shown to be optimal under certain circumstances.
comment: Under review in a Journal
Energy efficiency analysis as a function of the working voltages in supercapacitors
Supercapacitors are increasingly used as energy storage elements. Unlike batteries, their state of charge has a considerable influence on their voltage in normal operation, allowing them to work from zero to their maximum voltage. In this work, a theoretical and practical analysis is proposed of the energy efficiency of these devices according to their working voltages. To this end, several supercapacitors were subjected to charge and discharge cycles until the measurements of current and voltage stabilized. At this point their energy efficiency was calculated. These charge-discharge cycles were carried out: i) without rest between charging and discharging; and ii) with a rest of several minutes between the two stages. Using the information obtained from the tests, the energy efficiency is shown plotted against the minimum and maximum working voltages. By consulting the data and the graphs, the ideal working voltages to optimize the energy efficiency of these devices can be obtained.
comment: 18 pages, 10 figures
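As a rough illustration of how a cycle efficiency of the kind discussed in the abstract above could be computed from sampled voltage and current, consider the Python sketch below. The helper names and the synthetic test cycle (an ideal capacitor with a constant series resistance, cycled at constant current) are assumptions made for illustration and are not the paper's measurement procedure or device model.

```python
def trapezoid_energy(voltages, currents, dt):
    """Integrate v*i over uniformly spaced samples (trapezoidal rule), in joules."""
    power = [v * i for v, i in zip(voltages, currents)]
    return dt * sum((power[k] + power[k + 1]) / 2 for k in range(len(power) - 1))


def cycle_efficiency(v_chg, i_chg, v_dis, i_dis, dt):
    """Energy delivered during discharge divided by energy absorbed during charge."""
    e_in = trapezoid_energy(v_chg, i_chg, dt)
    e_out = trapezoid_energy(v_dis, [abs(i) for i in i_dis], dt)
    return e_out / e_in


def synthetic_cycle(c_farad, esr_ohm, current_a, v_min, v_max, dt):
    """Hypothetical constant-current cycle of an ideal capacitor with series resistance."""
    n = int(round(c_farad * (v_max - v_min) / (current_a * dt))) + 1
    ramp = [v_min + current_a * k * dt / c_farad for k in range(n)]
    v_chg = [v + current_a * esr_ohm for v in ramp]            # terminal voltage, charging
    v_dis = [v - current_a * esr_ohm for v in reversed(ramp)]  # terminal voltage, discharging
    return v_chg, [current_a] * n, v_dis, [-current_a] * n
```

In this toy model the efficiency reduces to (V_avg - I*R) / (V_avg + I*R), with V_avg the mean of the minimum and maximum capacitor voltages, which shows in the simplest possible setting why the choice of working voltages affects efficiency.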
A Novel MOSFET based Single Event Latchup Detection, Current Limiting & Self Power Cycling circuit for Spacecraft systems
Single Event Latch-up (SEL) is one of the prime concerns for CMOS ICs used in space systems. Galactic Cosmic Rays or Solar Energetic Particles (SEP) may trigger the parasitic latch-up circuit in CMOS ICs and cause an increase in current beyond the safe limits, thereby presenting a threat of permanent failure of the IC. Mitigation of SEL is always a challenging task. Conventional mitigation approaches inherently introduce some response time, which presents an uncertainty because the current may exceed the safe limits during this response time. This paper presents a novel MOSFET-based circuit that provides a complete end-to-end solution: detecting SEL, limiting the current below the set threshold, and executing power cycling to restore the normal functioning of the CMOS IC. The proposed circuit has been simulated in MULTISIM, and the simulation results match the expected behavior very well in terms of (i) current limiting and (ii) the total time taken by power cycling to bring the SEL-sensitive device back to its normal operational state. This circuit can be used by spacecraft system designers to overcome the catastrophic threat of SEL posed by the space radiation environment.
The Power-Oriented Graphs Modeling Technique: From the Fundamental Principles to the Systematic, Step-by-Step Modeling of Complex Physical Systems
Modeling physical systems is an essential skill for a control engineer, since it enables a deep understanding of their dynamic behavior and, consequently, the development of effective control strategies. The first part of this article provides a tutorial description of the fundamental principles and properties of the Power-Oriented Graphs (POG) modeling technique. Various case studies in different energetic domains are then presented to consolidate the fundamental principles, each highlighting different features of the POG modeling technique. The latter is then compared with the other two main graphical modeling techniques available in the literature, namely Bond Graph (BG) and Energetic Macroscopic Representation (EMR). The second part of this article is again tutorial in nature and introduces the new Fast Modeling POG (FMPOG) procedure. The FMPOG, which operates in the POG framework, is a methodical step-by-step procedure that enables readers to quickly derive the power-oriented graphical model of physical systems starting from their schematics. From the power-oriented graphical model, the state-space model can then be directly determined. To ensure the FMPOG procedure is easily usable by the entire community, we apply it to three examples in different energetic domains in this article, guiding the reader step-by-step through the derivation of the physical system models.
Feedforward Controllers from Learned Dynamic Local Model Networks with Application to Excavator Assistance Functions
Complicated first principles modelling and controller synthesis can be prohibitively slow and expensive for high-mix, low-volume products such as hydraulic excavators. Instead, in a data-driven approach, recorded trajectories from the real system can be used to train local model networks (LMNs), for which feedforward controllers are derived via feedback linearization. However, previous works required LMNs without zero dynamics for feedback linearization, which restricts the model structure and thus modelling capacity of LMNs. In this paper, we overcome this restriction by providing a criterion for when feedback linearization of LMNs with zero dynamics yields a valid controller. As a criterion we propose the bounded-input bounded-output stability of the resulting controller. In two additional contributions, we extend this approach to consider measured disturbance signals and multiple inputs and outputs. We illustrate the effectiveness of our contributions in a hydraulic excavator control application with hardware experiments. To this end, we train LMNs from recorded, noisy data and derive feedforward controllers used as part of a leveling assistance system on the excavator. In our experiments, incorporating disturbance signals and multiple inputs and outputs enhances tracking performance of the learned controller. A video of our experiments is available at https://youtu.be/lrrWBx2ASaE.
Measurements and System Identification for the Characterization of Smooth Muscle Cell Dynamics
Biological tissue integrity is actively maintained by cells. It is essential to comprehend how cells accomplish this in order to stage tissue diseases. However, addressing the complexity of a cell's system of interrelated mechanisms poses a challenge. This necessitates a well-structured identification framework and an effective integration of measurements. Here we introduce the use of state-of-the-art frequency-domain system identification techniques combined with an indentation measurement platform to analyze the underlying mechanisms from the perspective of control system theory. The ultimate goal is to explore how mechanical and biological factors are related in induced Pluripotent Stem Cell-derived vascular smooth muscle cells. We use frequency-domain analysis to investigate and characterize the cellular dynamics of smooth muscle cells from the measured data. The measurement model in this study exploits the availability of human tissue and samples, enabling fundamental investigations of vascular tissue disease. This approach using human cell lines holds significant potential to decrease the necessity for animal-based safety and efficacy studies. The focus of this review is to investigate the cellular dynamics underlying the myogenic response and to demonstrate the practicability of employing a nano-indentation measurement setup for the broadband frequency-domain characterization of induced Pluripotent Stem Cell-derived vascular smooth muscle cells.
comment: 6 pages, 9 figures, presented in the Medical Measurements and Applications - MeMeA2024 conference
Performance Boundary Analyses for Statistical Multi-QoS Framework Over 6G SAGINs
To enable cost-effective universal access and the enhancement of current communication services, space-air-ground integrated networks (SAGINs) have recently been developed due to their exceptional 3D coverage and their ability to guarantee rigorous and multidimensional demands for quality-of-service (QoS) provisioning, including delay and reliability across vast distances. In response to the complex, heterogeneous, and dynamic serving scenarios and stringent performance expectations for 6G SAGINs, it is crucial to undertake modeling, assurance, and analysis of the key technologies, aligned with the diverse demands for QoS provisioning in the non-asymptotic regime, i.e., when implementing finite blocklength coding (FBC) as a new dimension for the error-rate bounded QoS metric. However, how to design new statistical QoS-driven performance modeling approaches that accurately delineate the complex and dynamic behaviors of networks, particularly in terms of constraining both delay and error rate, remains a significant challenge for implementing mURLLC within 6G SAGINs in the finite blocklength regime. To overcome these difficulties, in this paper we propose a set of analytical modeling frameworks for 6G SAGINs that support statistical delay and error-rate bounded QoS in the finite blocklength regime. First, we establish the SAGIN system architecture model. Second, the aggregate interference and decoding error probability functions are modeled and examined using the Laplace transform. Third, we introduce modeling techniques aimed at defining the $\epsilon$-effective capacity function as a crucial metric for facilitating statistical QoS standards with respect to delay and error rate. To validate the effectiveness of the developed performance modeling schemes, we have executed a series of simulations over SAGINs.
Inline Photometrically Calibrated Hybrid Visual SLAM
This paper presents an integrated approach to Visual SLAM, merging online sequential photometric calibration within a Hybrid direct-indirect visual SLAM (H-SLAM). Photometric calibration helps normalize pixel intensity values under different lighting conditions, and thereby improves the direct component of our H-SLAM. The indirect component of H-SLAM also benefits, given that the detected features are more stable across variable lighting conditions. Our proposed photometrically calibrated H-SLAM is tested on several datasets, including the TUM monoVO as well as a dataset we created. Calibrated H-SLAM outperforms other state-of-the-art direct, indirect, and hybrid Visual SLAM systems in all the experiments. Furthermore, in online SLAM tested at our site, it also significantly outperformed the other SLAM systems.
Distributed Robust Optimization Method for AC/MTDC Hybrid Power Systems with DC Network Cognizance
AC/multi-terminal DC (MTDC) hybrid power systems have emerged as a solution for the large-scale and long-distance accommodation of power produced by renewable energy systems (RESs). To ensure the optimal operation of such hybrid power systems, this paper addresses three key issues: system operational flexibility, centralized communication limitations, and RES uncertainties. Accordingly, a specific AC/DC optimal power flow (OPF) model and a distributed robust optimization method are proposed. Firstly, we apply a set of linear approximation and convex relaxation techniques to formulate the mixed-integer convex AC/DC OPF model. This model incorporates the DC network-cognizant constraint and enables DC topology reconfiguration. Next, generalized Benders decomposition (GBD) is employed to provide distributed optimization. Enhanced approaches are incorporated into GBD to achieve parallel computation and asynchronous updating. Additionally, the extreme scenario method (ESM) is embedded into the AC/DC OPF model to provide robust decisions to hedge against RES uncertainties. ESM is further extended to align with the GBD procedure. Numerical results are finally presented to validate the effectiveness of our proposed method.
Adaptive Single-Terminal Fault Location for DC Microgrids
Identifying faulty lines and their accurate location is key for rapidly restoring distribution systems. This will become a greater challenge as the penetration of power electronics increases, and contingencies are seen across larger areas. This paper proposes a single terminal methodology (i.e., no communication involved) that is robust to variations of key parameters (e.g., sampling frequency, system parameters, etc.) and performs particularly well for low resistance faults that constitute the majority of faults in low voltage DC systems. The proposed method uses local measurements to estimate the current caused by the other terminals affected by the contingency. This mimics the strategy followed by double terminal methods that require communications and decouples the accuracy of the methodology from the fault resistance. The algorithm takes consecutive voltage and current samples, including the estimated current of the other terminal, into the analysis. This mathematical methodology results in a better accuracy than other single-terminal approaches found in the literature. The robustness of the proposed strategy against different fault resistances and locations is demonstrated using MATLAB simulations.
comment: SEST 2024
Event-Triggered Non-Linear Control of Offshore MMC Grids for Asymmetrical AC Faults
Fault ride-through capability studies of MMC-HVDC connected wind power plants have focused primarily on the DC link and onshore AC grid faults. Offshore AC faults, mainly asymmetrical faults, have not gained much attention in the literature despite being included in the future development at national levels in the ENTSO-E HVDC code. The proposed work gives an event-triggered control to stabilize the system once the offshore AC fault has occurred and has been identified and isolated. Different types of control actions such as proportional-integral (PI) controller and super-twisted sliding mode control (STSMC) are used to smoothly transition the post-fault system to a new steady state operating point by suppressing the negative sequence control. Initially, the effect of a negative sequence current control scheme on the transient behavior of the power system with a PI controller is discussed in this paper. Further, a non-linear control strategy (STSMC) is proposed which gives quicker convergence of the system post-fault in comparison to PI control action. These post-fault control operations are only triggered in the presence of a fault in the system, i.e., they are event-triggered. The validity of the proposed strategy is demonstrated by simulation on a $\pm$525 kV, three-terminal meshed MMC-HVDC system model in Real Time Digital Simulator (RTDS).
The Bayesian Separation Principle for Data-driven Control
This paper investigates the existence of a separation principle between model identification and control design in the context of model predictive control. First, we elucidate that the separation principle holds asymptotically in the number of data in a Fisherian setting, and universally in a Bayesian setting. Then, by formulating model predictive control within a Gaussian regression framework, we describe how the Bayesian separation principle can be used to derive explicit, uncertainty-aware expressions for the control cost and optimal input sequence, thereby bridging direct and indirect data-driven approaches.
comment: 13 pages, 1 figure
Stochastic Shortest Path Problem with Failure Probability
We solve a sequential decision-making problem under uncertainty that takes into account the failure probability of a task. This problem cannot be handled by the stochastic shortest path problem, which is the standard model for sequential decision-making. This problem is addressed by introducing dead-ends. Conventionally, we only consider policies that minimize the probability of task failure, so the optimal policy constructed could be overly conservative. In this paper, we address this issue by expanding the search range to a class of policies whose failure probability is less than a desired threshold. This problem can be solved by treating it as a framework of a Bayesian Markov decision process and a two-person zero-sum game. Also, it can be seen that the optimal policy is expressed in the form of a probability distribution on a set of deterministic policies. We also demonstrate the effectiveness of the proposed methods by applying them to a motion planning problem with obstacle avoidance for a moving robot.
comment: 22 pages, 5 figures
Multirotor Nonlinear Model Predictive Control based on Visual Servoing of Evolving Features
This article presents a Visual Servoing Nonlinear Model Predictive Control (NMPC) scheme for autonomously tracking a moving target using multirotor Unmanned Aerial Vehicles (UAVs). The scheme is developed for surveillance and tracking of contour-based areas with evolving features. NMPC is used to manage input and state constraints, while additional barrier functions are incorporated in order to ensure system safety and optimal performance. The proposed control scheme is designed based on the extraction and implementation of the full dynamic model of the features describing the target and the state variables. Real-time simulations and experiments using a quadrotor UAV equipped with a camera demonstrate the effectiveness of the proposed strategy.
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models ICRA 2025
We propose the use of latent space generative world models to address the covariate shift problem in autonomous driving. A world model is a neural network capable of predicting an agent's next state given past states and actions. By leveraging a world model during training, the driving policy effectively mitigates covariate shift without requiring an excessive amount of training data. During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations, so that at runtime it can recover from perturbations outside the training distribution. Additionally, we introduce a novel transformer-based perception encoder that employs multi-view cross-attention and a learned scene query. We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing in the CARLA simulator, as well as showing the ability to handle perturbations in both CARLA and NVIDIA's DRIVE Sim.
comment: 7 pages, 6 figures, for ICRA 2025 conference, for associated video file, see https://youtu.be/9FpDFD9aiFU
A Fast Dynamic Internal Predictive Power Scheduling Approach for Power Management in Microgrids
This paper presents a Dynamic Internal Predictive Power Scheduling (DIPPS) approach for optimizing power management in microgrids, particularly focusing on external power exchanges among diverse prosumers. DIPPS utilizes a dynamic objective function with a time-varying binary parameter to control the timing of power transfers to the external grid, facilitated by efficient usage of energy storage for surplus renewable power. The microgrid power scheduling problem is modeled as a mixed-integer nonlinear programming (MINLP-PS) problem and subsequently transformed into a mixed-integer linear programming (MILP-PS) optimization through McCormick's relaxation to reduce the computational complexity. A predictive window with 6 data points is solved in an average of 0.92s, a 97.6% improvement over the 38.27s required for the MINLP-PS formulation, implying the numerical feasibility of the DIPPS approach for real-time implementation. Finally, the approach is validated against a static objective using real-world load data across three case studies with different time-varying parameters, demonstrating the ability of DIPPS to optimize power exchanges and efficiently utilize distributed resources while shifting the external power transfers to specified time durations.
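The abstract does not say which products in MINLP-PS are relaxed, so the following is only a generic sketch of McCormick's relaxation for a single bilinear term w = x*y with box bounds on x and y. The function names and the numbers in the usage lines are illustrative assumptions, not taken from the paper; the last helper simply reproduces the reported solve-time improvement from the two timings quoted above.

```python
def mccormick_bounds(x, y, x_lo, x_hi, y_lo, y_hi):
    """Return (lower, upper) McCormick envelope values for w = x*y.

    The four linear inequalities below replace the nonlinear equality
    w = x*y, which is what turns a bilinear MINLP term into MILP constraints.
    """
    lower = max(x_lo * y + x * y_lo - x_lo * y_lo,
                x_hi * y + x * y_hi - x_hi * y_hi)
    upper = min(x_hi * y + x * y_lo - x_hi * y_lo,
                x_lo * y + x * y_hi - x_lo * y_hi)
    return lower, upper


def improvement_percent(old_seconds, new_seconds):
    """Relative reduction in solve time, in percent."""
    return 100.0 * (old_seconds - new_seconds) / old_seconds


# Hypothetical values: x is a continuous power in [0, 10], y is a binary switch.
# With y fixed to 0 or 1 the envelope collapses onto the exact product.
on_bounds = mccormick_bounds(3.5, 1, 0.0, 10.0, 0, 1)
off_bounds = mccormick_bounds(3.5, 0, 0.0, 10.0, 0, 1)

# For a fractional y the envelope is a true relaxation that contains x*y.
relaxed_bounds = mccormick_bounds(3.5, 0.5, 0.0, 10.0, 0, 1)

# Timings quoted in the abstract: 38.27 s (MINLP-PS) versus 0.92 s (MILP-PS).
reported_gain = round(improvement_percent(38.27, 0.92), 1)
```

When one factor is binary, as in the usage lines, the envelope is exact at integer points, which is why this relaxation is a common way to linearize switch-times-power products.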
Robo-Platform: A Robotic System for Recording Sensors and Controlling Robots
Mobile smartphones compactly provide sensors such as cameras, IMUs, GNSS measurement units, and wireless and wired communication channels required for robotics projects. They are affordable, portable, and programmable, which makes them ideal for testing, data acquisition, controlling mobile robots, and many other robotic applications. A robotic system is proposed in this paper, consisting of an Android phone, a microcontroller board attached to the phone via USB, and a remote wireless controller station. In the data acquisition mode, the Android device can record a dataset of a diverse configuration of multiple cameras, IMUs, GNSS units, and external USB ADC channels in the rawest format used for, but not limited to, pose estimation and scene reconstruction applications. In robot control mode, the Android phone, a microcontroller board, and other peripherals constitute the mobile or stationary robotic system. This system is controlled using a remote server connected over Wi-Fi or Bluetooth. Experiments show that although the SLAM and AR applications can utilize the acquired data, the proposed system can pave the way for more advanced algorithms for processing these noisy and sporadic measurements. Moreover, the characteristics of the communication media are studied, and two example robotic projects, which involve controlling a toy car and a quadcopter, are included.
comment: Project repository: https://github.com/m-dayani/robo-platform Youtube Video: https://youtu.be/BTQ4yLB1bak Dataset: https://drive.google.com/drive/folders/1OZqdA1xa-SyJ64qL_TibqhtwhR1fWWrx?usp=sharing
$\mathcal{L}_{1}$ Adaptive Optimizer for Uncertain Time-Varying Convex Optimization
We propose an adaptive method for uncertain time-varying (TV) convex optimization, termed as $\mathcal{L}_{1}$ adaptive optimization ($\mathcal{L}_{1}$-AO). The proposed method uses a baseline TV optimizer with a prediction model, designed for the gradient dynamics to exploit the underlying structure of the temporal correlation. Inspired by $\mathcal{L}_{1}$ adaptive control, the proposed method augments an adaptive update law to estimate and compensate for the uncertainty from the inaccurate prediction in the online implementation. The proposed method provides the performance bounds of the error in the optimization variables and cost function, allowing efficient and reliable optimization for uncertain TV problems.
comment: 8 pages, 3 figures
Device for detection of activity-dependent changes in neural spheroids at MHz and GHz frequencies
Intracellular processes triggered by neural activity include changes in ionic concentrations, protein release, and synaptic vesicle cycling. These processes play significant roles in neurological disorders. The beneficial effects of brain stimulation may also be mediated through intracellular changes. There is a lack of label-free techniques for monitoring activity-dependent intracellular changes. Electromagnetic (EM) waves at frequencies larger than 1x10^6 Hz (1 MHz) were previously used to probe intracellular contents of cells, as the cell membrane becomes transparent at this frequency range. EM waves interact with membranes of intracellular organelles, proteins, and water in the MHz-GHz range. In this work, we developed a device for probing the interaction between intracellular contents of active neurons and EM waves. The device used an array of grounded coplanar waveguides (GCPWs) to deliver EM waves to a three-dimensional (3D) spheroid of rat cortical neurons. Neural activity was evoked using optogenetics, with synchronous detection of propagation of EM waves. Broadband measurements were conducted in the MHz-GHz range to track changes in transmission coefficients. Neuronal activity was found to reversibly alter EM wave transmission. Pharmacological suppression of neuronal activity abolished changes in transmission. Time constants of changes in transmission were in the range of seconds to tens of seconds, suggesting the presence of relatively slow, activity-dependent intracellular processes. This study provides the first evidence that EM transmission through neuronal tissue is activity-dependent in the MHz-GHz range. The device developed in this work may find future applications in studies of the mechanisms of neurological disorders and the development of new therapies.
On the Interplay of Clustering and Evolution in the Emergence of Epidemic Outbreaks
In an increasingly interconnected world, a key scientific challenge is to examine mechanisms that lead to the widespread propagation of contagions, such as misinformation and pathogens, and identify risk factors that can trigger large-scale outbreaks. Underlying both the spread of disease and misinformation epidemics is the evolution of the contagion as it propagates, leading to the emergence of different strains, e.g., through genetic mutations in pathogens and alterations in the information content. Recent studies have revealed that models that do not account for heterogeneity in transmission risks associated with different strains of the circulating contagion can lead to inaccurate predictions. However, existing results on multi-strain spreading assume that the network has a vanishingly small clustering coefficient, whereas clustering is widely known to be a fundamental property of real-world social networks. In this work, we investigate spreading processes that entail evolutionary adaptations on random graphs with tunable clustering and arbitrary degree distributions. We derive a mathematical framework to quantify the epidemic characteristics of a contagion that evolves as it spreads, with the structure of the underlying network given via arbitrary joint degree distributions of single-edges and triangles. To the best of our knowledge, our work is the first to jointly analyze the impact of clustering and evolution on the emergence of epidemic outbreaks. We supplement our theoretical findings with numerical simulations and case studies, shedding light on the impact of clustering on contagion spread.
Precision Aquaculture: An Integrated Computer Vision and IoT Approach for Optimized Tilapia Feeding
Traditional fish farming practices often lead to inefficient feeding, resulting in environmental issues and reduced productivity. We developed an innovative system combining computer vision and IoT technologies for precise Tilapia feeding. Our solution uses real-time IoT sensors to monitor water quality parameters and computer vision algorithms to analyze fish size and count, determining optimal feed amounts. A mobile app enables remote monitoring and control. We utilized YOLOv8 for keypoint detection to measure Tilapia weight from length, achieving 94% precision on 3,500 annotated images. Pixel-based measurements were converted to centimeters using depth estimation for accurate feeding calculations. Our method, with data collection mirroring inference conditions, significantly improved results. Preliminary estimates suggest this approach could increase production up to 58 times compared to traditional farms. Our models, code, and dataset are open-source (the code, dataset, and models are available upon reasonable request).
comment: 8 pages, 6 figures, 3 tables, 21st International Conference on Informatics in Control, Automation, and Robotics
Sampling-based Stochastic Data-driven Predictive Control under Data Uncertainty
We present a stochastic constrained output-feedback data-driven predictive control scheme for linear time-invariant systems subject to bounded additive disturbances. The approach uses data-driven predictors based on an extension of Willems' fundamental lemma and requires only a single persistently exciting input-output data trajectory. Compared to current state-of-the-art approaches, we do not rely on availability of exact disturbance data. Instead, we leverage a novel parameterization of the unknown disturbance data considering consistency with the measured data and the system class. This allows for deterministic approximation of the chance constraints in a sampling-based fashion. A robust constraint on the first predicted step enables recursive feasibility, closed-loop constraint satisfaction, and robust asymptotic stability in expectation under standard assumptions. A numerical example demonstrates the efficiency of the proposed control scheme.
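As a rough illustration of the data-driven predictor idea, the sketch below covers only the nominal, disturbance-free form of Willems' fundamental lemma, not the stochastic extension proposed in the paper. The first-order system, its coefficients, the horizon, and all function names are assumptions chosen for the example. It builds Hankel matrices from one input-output trajectory, checks persistency of excitation by a rank test, and verifies that a fresh trajectory lies in the column span of the data.

```python
import random


def hankel(signal, depth):
    """Hankel matrix with `depth` rows built from a scalar signal."""
    cols = len(signal) - depth + 1
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]


def rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        pivot = max(range(r, n_rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            for j in range(c, n_cols):
                m[i][j] -= factor * m[r][j]
        r += 1
    return r


def simulate(u, a=0.8, b=0.5, y0=0.0):
    """Assumed toy system y[k+1] = a*y[k] + b*u[k] (order n = 1)."""
    y, out = y0, []
    for uk in u:
        out.append(y)
        y = a * y + b * uk
    return out


rng = random.Random(0)
n, L = 1, 4                                   # system order, trajectory length
u_data = [rng.uniform(-1.0, 1.0) for _ in range(30)]
y_data = simulate(u_data)

# Persistency of excitation of order L + n: full row rank of the input Hankel.
pe_rank = rank(hankel(u_data, L + n))

# Stacked input-output Hankel matrix; its column span is the set of
# length-L trajectories of the system, of dimension L + n.
H = hankel(u_data, L) + hankel(y_data, L)
data_rank = rank(H)

# A new trajectory from a different initial condition stays in that span ...
u_new = [rng.uniform(-1.0, 1.0) for _ in range(L)]
w = u_new + simulate(u_new, y0=1.0)
consistent_rank = rank([row + [wi] for row, wi in zip(H, w)])

# ... while a sequence that violates the dynamics does not.
w_bad = w[:-1] + [w[-1] + 1.0]
inconsistent_rank = rank([row + [wi] for row, wi in zip(H, w_bad)])
```

Appending a column and comparing ranks is a simple membership test; a predictive controller would instead solve for the combination coefficients as decision variables.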
GPU-Accelerated DCOPF using Gradient-Based Optimization
DC Optimal Power Flow (DCOPF) is a key operational tool for power system operators, and it is embedded as a subproblem in many challenging optimization problems (e.g., line switching). However, traditional CPU-based solve routines (e.g., simplex) have saturated in speed and are hard to parallelize. This paper focuses on solving DCOPF problems using gradient-based routines on Graphics Processing Units (GPUs), which have massive parallelization capability. To formulate these problems, we pose a Lagrange dual associated with DCOPF (linear and quadratic cost curves), and then we explicitly solve the inner (primal) minimization problem with a dual norm. The resulting dual problem can be efficiently iterated using projected gradient ascent. After solving the dual problem on both CPUs and GPUs to find tight lower bounds, we benchmark against Gurobi and MOSEK, comparing convergence speed and tightness on the IEEE 2000, 4601, and 10000 bus systems. We provide reliable and tight lower bounds for these problems with, at best, 5.4x speedup over a conventional solver.
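The dual approach can be illustrated on a much smaller problem than the one in the paper. The sketch below is a toy single-bus dispatch with quadratic costs and generator limits, with no network constraints, no dual-norm reformulation, and no GPU; the cost coefficients, limits, demand, and function names are all assumptions. It shows the two ingredients named in the abstract: a closed-form inner (primal) minimization and projected gradient ascent on the multiplier, where every dual value is a lower bound on the optimal cost.

```python
def dispatch(lam, b, c, lo, hi):
    """Closed-form inner minimizer: per-unit quadratic clipped to its limits."""
    return [min(max((lam - bi) / ci, l), h)
            for bi, ci, l, h in zip(b, c, lo, hi)]


def dual_value(lam, demand, b, c, lo, hi):
    """Lagrange dual g(lam) for: min sum(0.5*c*p^2 + b*p) s.t. sum(p) >= demand."""
    p = dispatch(lam, b, c, lo, hi)
    cost = sum(0.5 * ci * pi * pi + bi * pi for ci, bi, pi in zip(c, b, p))
    return cost + lam * (demand - sum(p))


def projected_gradient_ascent(demand, b, c, lo, hi, iters=200):
    """Maximize the dual over lam >= 0 with a fixed step size."""
    step = 1.0 / sum(1.0 / ci for ci in c)   # 1 / Lipschitz bound of the gradient
    lam = 0.0
    for _ in range(iters):
        p = dispatch(lam, b, c, lo, hi)
        gradient = demand - sum(p)            # dual gradient: constraint residual
        lam = max(0.0, lam + step * gradient)  # projection onto lam >= 0
    return lam, dispatch(lam, b, c, lo, hi)


# Hypothetical two-generator system serving a demand of 100.
b = [10.0, 8.0]        # linear cost coefficients
c = [0.1, 0.2]         # quadratic cost coefficients
lo = [0.0, 0.0]        # lower generation limits
hi = [50.0, 100.0]     # upper generation limits
demand = 100.0

lam_star, p_star = projected_gradient_ascent(demand, b, c, lo, hi)
bound_at_optimum = dual_value(lam_star, demand, b, c, lo, hi)
bound_early = dual_value(10.0, demand, b, c, lo, hi)  # any lam >= 0 gives a lower bound
```

Each iteration is a clip and a sum over generators, which is the kind of elementwise work that maps well onto parallel hardware.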
Identification of Additive Continuous-time Systems in Open and Closed loop
When identifying electrical, mechanical, or biological systems, parametric continuous-time identification methods can lead to interpretable and parsimonious models when the model structure aligns with the physical properties of the system. Traditional linear system identification may not consider the most parsimonious model when relying solely on unfactored transfer functions, which typically result from standard direct approaches. This paper presents a novel identification method that delivers additive models for both open and closed-loop setups. The estimators that are derived are shown to be generically consistent, and can admit the identification of marginally stable additive systems. Numerical simulations show the efficacy of the proposed approach, and its performance in identifying a modal representation of a flexible beam is verified using experimental data.
comment: 15 pages, 6 figures
Instantaneous Frequency Estimation in Unbalanced Systems Using Affine Differential Geometry
The paper discusses the relationships between electrical and affine differential geometry quantities, establishing a link between frequency and time derivatives of voltage, through the utilization of affine geometric invariants. Based on this link, a new instantaneous frequency estimation formula is proposed, which is particularly suited for unbalanced and single-phase systems. Several examples as well as measurements based on two real-world events illustrate the findings of the paper.
Proactive Emergency Collision Avoidance for Automated Driving in Highway Scenarios
Uncertainty in the behavior of other traffic participants is a crucial factor in collision avoidance for automated driving; here, stochastic metrics could avoid overly conservative decisions. This paper introduces a Stochastic Model Predictive Control (SMPC) planner for emergency collision avoidance in highway scenarios to proactively minimize collision risk while ensuring safety through chance constraints. To guarantee that the emergency trajectory can be attained, we incorporate nonlinear tire dynamics in the prediction model of the ego vehicle. Further, we exploit Max-Min-Plus-Scaling (MMPS) approximations of the nonlinearities to avoid conservatism, enforce proactive collision avoidance, and improve computational efficiency in terms of performance and speed. Consequently, our contributions include integrating a dynamic ego vehicle model into the SMPC planner, introducing the MMPS approximation for real-time implementation in emergency scenarios, and integrating SMPC with hybridized chance constraints and risk minimization. We evaluate our SMPC formulation in terms of proactivity and efficiency in various hazardous scenarios. Moreover, we demonstrate the effectiveness of our proposed approach by comparing it with a state-of-the-art SMPC planner and we validate that the generated trajectories can be attained using a high-fidelity vehicle model in IPG CarMaker.
comment: 14 pages, 11 figures, submitted to IEEE Transactions on Control Systems Technology
Mamba as a motion encoder for robotic imitation learning
Recent advancements in imitation learning, particularly with the integration of LLM techniques, are set to significantly improve robots' dexterity and adaptability. This paper proposes using Mamba, a state-of-the-art architecture with potential applications in LLMs, for robotic imitation learning, highlighting its ability to function as an encoder that effectively captures contextual information. By reducing the dimensionality of the state space, Mamba operates similarly to an autoencoder. It effectively compresses the sequential information into state variables while preserving the essential temporal dynamics necessary for accurate motion prediction. Experimental results in tasks such as cup placing and case loading demonstrate that despite exhibiting higher estimation errors, Mamba achieves superior success rates compared to Transformers in practical task execution. This performance is attributed to Mamba's structure, which encompasses the state space model. Additionally, the study investigates Mamba's capacity to serve as a real-time motion generator with a limited amount of training data.
comment: 8 pages, 9 figures
Model-Free Generic Robust Control for Servo-Driven Actuation Mechanisms with Layered Insight into Energy Conversions
To advance theoretical solutions and address limitations in modeling complex servo-driven actuation systems experiencing high non-linearity and load disturbances, this paper aims to design a practical model-free generic robust control (GRC) framework for these mechanisms. This framework is intended to be applicable across all actuator systems encompassing electrical, hydraulic, or pneumatic servomechanisms, while also functioning within complex interactions among dynamic components and adhering to control input constraints. In this respect, the state-space model of actuator systems is decomposed into smaller subsystems that incorporate the first principle equation of actuator motion dynamics and interactive energy conversion equations. This decomposition operates under the assumption that the comprehensive model of the servo-driven actuator system and energy conversion, uncertainties, load disturbances, and their bounds are unknown. Then, the GRC employs subsystem-based adaptive control strategies for each state-variant subsystem separately. Despite control input constraints and the unknown interactive system model, the GRC-applied actuator mechanism ensures uniform exponential stability and robustness in tracking desired motions. It features straightforward implementation, experimentally evaluated by applying it to two industrial applications.
comment: This work has been submitted for possible publication in the IEEE
SIMBa: System Identification Methods leveraging Backpropagation
This manuscript details and extends the SIMBa toolbox (System Identification Methods leveraging Backpropagation) presented in previous work, which uses well-established Machine Learning tools for discrete-time linear multi-step-ahead state-space System Identification (SI). SIMBa leverages linear-matrix-inequality-based free parametrizations of Schur matrices to guarantee the stability of the identified model by design. In this paper, backed up by novel free parametrizations of Schur matrices, we extend the toolbox to show how SIMBa can incorporate known sparsity patterns or true values of the state-space matrices to identify without jeopardizing stability. We extensively investigate SIMBa's behavior when identifying diverse systems with various properties from both simulated and real-world data. Overall, we find it consistently outperforms traditional stable subspace identification methods, and sometimes significantly, especially when enforcing desired model properties. These results hint at the potential of SIMBa to pave the way for generic structured nonlinear SI. The toolbox is open-sourced on https://github.com/Cemempamoi/simba.
comment: First two authors contributed equally. Submitted to IEEE TCST
An Alternative to Multi-Factor Authentication with a Triple-Identity Authentication Scheme
The existing authentication system has two entry points (i.e., username and password fields) to interact with the outside, but neither of them has a gatekeeper, making the system vulnerable to cyberattacks. In order to ensure the authentication security, the system sets a third entry point and uses an external MFA service to guard it. The crux of the problem is that the system has no internal mechanism to guard its own entry points as no identifiers can be defined for the username and password without using any personal information. To solve this problem, we open the hash algorithm of a dual-password login-authentication system to three login credentials. Therefore, the intermediate elements of the algorithm can be used to define an identifier to verify the user identity at each entry point of the system. As a result of the above setup, a triple-identity authentication is established, the key of which is that the readily available user's login name and password are randomly converted into a matrix of meaningless hash elements which are concealed, incommunicable, inaccessible, and independent of personal information. So the identifiers defined using such elements can be used by the system to verify the identities of the user at all the entry points of the system, thereby ensuring the authentication security without relying on MFA services.
comment: 5 pages, 2 figures, 11 conferences
Towards Autonomous Supply Chains: Definition, Characteristics, Conceptual Framework, and Autonomy Levels
Recent global disruptions, such as the pandemic and geopolitical conflicts, have profoundly exposed vulnerabilities in traditional supply chains, requiring exploration of more resilient alternatives. Autonomous supply chains (ASCs) have emerged as a potential solution, offering increased visibility, flexibility, and resilience in turbulent trade environments. Despite discussions in industry and academia over several years, ASCs lack well-established theoretical foundations. This paper addresses this research gap by presenting a formal definition of ASC along with its defining characteristics and auxiliary concepts. We propose a layered conceptual framework called the MIISI model. An illustrative case study focusing on the meat supply chain demonstrates an initial ASC implementation based on this conceptual model. Additionally, we introduce a seven-level supply chain autonomy reference model, delineating a trajectory towards achieving a full supply chain autonomy. Recognising that this work represents an initial endeavour, we emphasise the need for continued exploration in this emerging domain. We anticipate that this work will stimulate further research, both theoretical and technical, and contribute to the continual evolution of ASCs.
comment: This paper includes 19 pages and 8 figures and has been accepted for publication in the Journal of Industrial Information Integration
Stochastic Data-Driven Predictive Control with Equivalence to Stochastic MPC
We propose a data-driven receding-horizon control method dealing with the chance-constrained output-tracking problem of unknown stochastic linear time-invariant (LTI) systems with partial state observation. The proposed method takes into account the statistics of the process noise, the measurement noise and the uncertain initial condition, following an analogous framework to Stochastic Model Predictive Control (SMPC), but does not rely on the use of a parametric system model. As such, our receding-horizon algorithm produces a sequence of closed-loop control policies for predicted time steps, as opposed to a sequence of open-loop control actions. Under certain conditions, we establish that our proposed data-driven control method produces the same control inputs as those produced by the associated model-based SMPC. Simulation results on a grid-connected power converter are provided to illustrate the performance benefits of our methodology.
comment: 20 pages, 4 figures. The extended version of a submission to IEEE Transactions on Automatic Control
Robust Adaptive MPC Using Uncertainty Compensation
This paper presents an uncertainty compensation-based robust adaptive model predictive control (MPC) framework for linear systems with both matched and unmatched nonlinear uncertainties subject to both state and input constraints. In particular, the proposed control framework leverages an L1 adaptive controller (L1AC) to compensate for the matched uncertainties and to provide guaranteed uniform bounds on the error between the states and control inputs of the actual system and those of a nominal, i.e., uncertainty-free, system. The performance bounds provided by the L1AC are then used to tighten the state and control constraints of the actual system, and a model predictive controller is designed for the nominal system with the tightened constraints. The proposed control framework, which we denote as uncertainty compensation-based MPC (UC-MPC), guarantees constraint satisfaction and achieves improved performance compared with existing methods. Simulation results on a flight control example demonstrate the benefits of the proposed framework.
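The constraint-tightening step can be pictured with a minimal sketch, assuming box constraints and a per-coordinate uniform error bound. The abstract does not give the form of the constraint sets or of the L1AC bounds, so the function name and all numbers below are hypothetical.

```python
def tighten_box(lower, upper, error_bound):
    """Shrink the box [lower, upper] by a uniform error bound per coordinate.

    If the nominal trajectory stays inside the tightened box and the actual
    system stays within `error_bound` of the nominal one, the actual system
    satisfies the original box constraints.
    """
    tightened = []
    for lo, hi, rho in zip(lower, upper, error_bound):
        if lo + rho > hi - rho:
            raise ValueError("error bound too large: tightened set is empty")
        tightened.append((lo + rho, hi - rho))
    return tightened


# Hypothetical two-state example: original limits and assumed uniform bounds.
state_box = tighten_box([-10.0, -2.0], [10.0, 2.0], [0.5, 0.25])
```

The nominal MPC would then be posed over `state_box` instead of the original limits, with the same treatment applied to the input constraints.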
comment: arXiv admin note: text overlap with arXiv:2208.02985
Applications of Lifted Nonlinear Cuts to Convex Relaxations of the AC Power Flow Equations
We demonstrate that valid inequalities, or lifted nonlinear cuts (LNC), can be projected to tighten the Second Order Cone (SOC), Convex DistFlow (CDF), and Network Flow (NF) relaxations of the AC Optimal Power Flow (AC-OPF) problem. We conduct experiments on 36 cases from the PGLib-OPF library for two objective functions, (1) power generation maximization and (2) generation cost minimization. Significant optimality gap improvements are shown for the maximization problem, where the LNC strengthen the SOC and CDF relaxations in 100% of the test cases, with average and maximum differences in the optimality gaps of 23.1% and 93.5% respectively. The NF relaxation is strengthened in 79.2% of test cases, with average and maximum differences in the optimality gaps of 3.45% and 21.2% respectively. We also study the trade-off between relaxation quality and solve time, demonstrating that the strengthened CDF relaxation outperforms the strengthened SOC formulation in terms of runtime and number of iterations needed, while the strengthened NF formulation is the most scalable with the lowest relaxation quality provided by these LNC.
Probabilistic Metaplasticity for Continual Learning with Memristors
Edge devices operating in dynamic environments critically need the ability to continually learn without catastrophic forgetting. The strict resource constraints in these devices pose a major challenge to achieving this, as continual learning entails memory and computational overhead. Crossbar architectures using memristor devices offer energy efficiency through compute-in-memory and hold promise to address this issue. However, memristors often exhibit low precision and high variability in conductance modulation, rendering them unsuitable for continual learning solutions that require precise modulation of weight magnitude for consolidation. Current approaches fall short of addressing this challenge directly and rely on auxiliary high-precision memory, leading to frequent memory access, high memory overhead, and energy dissipation. In this research, we propose probabilistic metaplasticity, which consolidates weights by modulating their update probability rather than magnitude. The proposed mechanism eliminates high-precision modification to weight magnitudes and, consequently, the need for auxiliary high-precision memory. We demonstrate the efficacy of the proposed mechanism by integrating probabilistic metaplasticity into a spiking network trained on an error threshold with low-precision memristor weights. Evaluations of continual learning benchmarks show that probabilistic metaplasticity achieves performance equivalent to state-of-the-art continual learning models with high-precision weights while consuming ~67% lower memory for additional parameters and up to ~60x lower energy during parameter updates compared to an auxiliary memory-based solution. The proposed model shows potential for energy-efficient continual learning with low-precision emerging devices.
Systems and Control (EESS)
Learning with Dynamics: Autonomous Regulation of UAV Based Communication Networks with Dynamic UAV Crew
Unmanned Aerial Vehicle (UAV) based communication networks (UCNs) are a key component in future mobile networking. To handle the dynamic environments in UCNs, reinforcement learning (RL) has been a promising solution attributed to its strong capability of adaptive decision-making free of the environment models. However, most existing RL-based research focuses on control strategy design assuming a fixed set of UAVs. Few works have investigated how UCNs should be adaptively regulated when the serving UAVs change dynamically. This article discusses RL-based strategy design for adaptive UCN regulation given a dynamic UAV set, addressing both reactive strategies in general UCNs and proactive strategies in solar-powered UCNs. An overview of the UCN and the RL framework is first provided. Potential research directions with key challenges and possible solutions are then elaborated. Some of our recent works are presented as case studies to inspire innovative ways to handle dynamic UAV crew with different RL algorithms.
comment: 7 pages, 6 figures, magazine paper
Complex-Phase, Data-Driven Identification of Grid-Forming Inverter Dynamics
The increasing integration of renewable energy sources (RESs) into power systems requires the deployment of grid-forming inverters to ensure a stable operation. Accurate modeling of these devices is necessary. In this paper, a system identification approach to obtain low-dimensional models of grid-forming inverters is presented. The proposed approach is based on a Hammerstein-Wiener parametrization of the normal-form model. The normal-form is a gray-box model that utilizes complex frequency and phase to capture non-linear inverter dynamics. The model is validated on two well-known control strategies: droop-control and dispatchable virtual oscillators. Simulations and hardware-in-the-loop experiments demonstrate that the normal-form accurately models inverter dynamics across various operating conditions. The approach shows great potential for enhancing the modeling of RES-dominated power systems, especially when component models are unavailable or computationally expensive.
Towards human-like kinematics in industrial robotic arms: a case study on a UR3 robot
Safety in industrial robotic environments is a hot research topic in the area of human-robot interaction (HRI). Until now, robotic arms on assembly lines have interacted with other machines away from human workers. Nowadays, robotic arm manufacturers aim for their robots to increasingly perform tasks in collaboration with humans. One of the ways to improve this collaboration is by making the movement of robots more human-like. This way, it would be easier for a human to foresee the movement of the robot and approach it without fear of contact. The main difference between the movement of a human and of a robotic arm is that the former has a bell-shaped speed profile while the latter has a uniform speed one. To generate this speed profile, the kinematic theory of rapid human movements and its Sigma-Lognormal model has been used. This model is widely used to explain most of the basic phenomena related to the control of human movements. Both human-like and robotic-like movements are transferred to the UR3 robot. In this paper we detail how the UR3 robot was programmed to produce both kinds of movement. The resulting dissimilarities between the input motion and the output motion of the robot confirm the possibility of developing human-like velocities in the UR3 robot.
comment: 6 pages, 5 figures
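The contrast drawn in the abstract above between a bell-shaped and a uniform speed profile can be sketched numerically. The snippet below evaluates a single lognormal stroke, the basic building block of the Sigma-Lognormal model, and compares it with a constant speed covering the same distance. The parameter values (`D`, `t0`, `mu`, `sigma`) and the 2 s movement duration are arbitrary assumptions chosen for illustration, not values from the paper.

```python
import numpy as np

def lognormal_speed(t, D=1.0, t0=0.0, mu=-1.0, sigma=0.3):
    """Speed of one lognormal stroke: D scales the distance covered,
    t0 is the onset time, mu and sigma shape the bell."""
    t = np.asarray(t, dtype=float)
    v = np.zeros_like(t)
    valid = t > t0                      # the stroke is zero before its onset
    dt = t[valid] - t0
    v[valid] = D / (sigma * np.sqrt(2 * np.pi) * dt) * np.exp(
        -(np.log(dt) - mu) ** 2 / (2 * sigma ** 2)
    )
    return v

t = np.linspace(0.0, 2.0, 2001)
bell = lognormal_speed(t)               # human-like, bell-shaped profile
uniform = np.full_like(t, 0.5)          # robot-like, constant speed over the same 2 s

# Trapezoidal integration of the speed gives the distance covered.
distance_bell = np.sum(0.5 * (bell[1:] + bell[:-1]) * np.diff(t))
t_peak = t[np.argmax(bell)]             # analytically exp(mu - sigma**2)
```

Both profiles cover roughly the same distance, but the lognormal one accelerates smoothly to a single peak and then decays, which is the behavior the abstract describes as easier for a human to anticipate.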
Generic Diagonalizability, Structural Functional Observability and Output Controllability
This paper investigates the structural functional observability (SFO) and structural output controllability (SOC) of a class of systems with generically diagonalizable state matrices and explores the associated minimal sensor and actuator placement problems. The verification of SOC and the corresponding sensor and actuator placement problems, i.e., the problems of determining the minimum number of outputs and inputs required to achieve SFO and SOC, respectively, are yet open for general systems, which motivates our focus on a class of systems enabling polynomial-time solutions. In this line, we first define and characterize generically diagonalizable systems, referring to structured systems for which almost all realizations of the state matrices are diagonalizable. We then develop computationally efficient criteria for SFO and SOC within the context of generically diagonalizable systems. Our work expands the class of systems amenable to polynomial-time SOC verification. Thanks to the simplicity of the obtained criteria, we derive closed-form solutions for determining the minimal sensor placement to achieve SFO and the minimal actuator deployment to achieve SOC in such systems, along with efficient weighted maximum matching based and weighted maximum flow based algorithms. For more general systems to achieve SFO, an upper bound is given by identifying a non-decreasing property of SFO with respect to a specific class of edge additions, which is shown to be optimal under certain circumstances.
comment: Under review in a Journal
Energy efficiency analysis as a function of the working voltages in supercapacitors
Supercapacitors are increasingly used as energy storage elements. Unlike batteries, their state of charge has a considerable influence on their voltage in normal operation, allowing them to work from zero to their maximum voltage. In this work, a theoretical and practical analysis is proposed of the energy efficiency of these devices according to their working voltages. To this end, several supercapacitors were subjected to charge and discharge cycles until the measurements of current and voltage stabilized. At this point their energy efficiency was calculated. These charge-discharge cycles were carried out: i) without rest between charging and discharging; and ii) with a rest of several minutes between the two stages. Using the information obtained from the tests, the energy efficiency is shown plotted against the minimum and maximum working voltages. By consulting the data and the graphs, the ideal working voltages to optimize the energy efficiency of these devices can be obtained.
comment: 18 pages, 10 figures
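As a rough illustration of how energy efficiency over a charge-discharge cycle can be computed from sampled voltage and current, the sketch below integrates instantaneous power separately for the charging and discharging phases. The test signal is an assumption: an ideal capacitor `C` with a series resistance `R` cycled at constant current between two internal voltages. This simple model and all numeric values are hypothetical and are not the devices or procedure used in the paper.

```python
import numpy as np

def cycle_efficiency(t, v, i):
    """Round-trip energy efficiency from sampled data.
    Convention (assumed): i > 0 while charging, i < 0 while discharging."""
    p = v * i
    p_in = np.clip(p, 0.0, None)        # power absorbed while charging
    p_out = np.clip(-p, 0.0, None)      # power delivered while discharging
    dt = np.diff(t)
    e_in = np.sum(0.5 * (p_in[1:] + p_in[:-1]) * dt)
    e_out = np.sum(0.5 * (p_out[1:] + p_out[:-1]) * dt)
    return e_out / e_in

# Hypothetical device and working voltages.
C, R, I = 10.0, 0.05, 2.0               # farads, ohms, amperes
v_min, v_max = 1.0, 2.5                 # internal capacitor voltage limits
T = C * (v_max - v_min) / I             # duration of each phase in seconds

t_c = np.linspace(0.0, T, 751)
t_d = np.linspace(T, 2 * T, 751)
v_c = v_min + I / C * t_c + I * R       # terminal voltage while charging
v_d = v_max - I / C * (t_d - T) - I * R # terminal voltage while discharging

t = np.concatenate([t_c, t_d])
v = np.concatenate([v_c, v_d])
i = np.concatenate([np.full_like(t_c, I), np.full_like(t_d, -I)])
efficiency = cycle_efficiency(t, v, i)
```

Repeating the calculation for different `v_min` and `v_max` pairs gives the kind of efficiency-versus-working-voltage map the abstract refers to, under the stated model assumptions.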
A Novel MOSFET based Single Event Latchup Detection, Current Limiting & Self Power Cycling circuit for Spacecraft systems
Single Event Latch-up (SEL) is one of the prime concerns for CMOS ICs used in space systems. Galactic Cosmic Rays or Solar Energetic Particles (SEP) may trigger the parasitic latch-up circuit in CMOS ICs and cause an increase in current beyond the safe limits, thereby presenting a threat of permanent failure of the IC. Mitigation of the SEL is always a challenging task. The conventional mitigation approaches inherently introduce some response time, which presents an uncertainty because during this response time the current may exceed the safe current limits. This paper presents a novel circuit based on MOSFETs which provides an end-to-end complete solution of detecting SEL, limiting the current below the set threshold, and executing power cycling to restore the normal functioning of the CMOS IC. The proposed circuit has been simulated in MULTISIM and the simulation results match very well with the expected behavior of (i) current limiting and (ii) the total time duration taken in power cycling to bring the SEL sensitive device back to its normal operational state. This circuit can be harnessed by spacecraft system designers to overcome the catastrophic threat of SEL posed by the space radiation environment.
The Power-Oriented Graphs Modeling Technique: From the Fundamental Principles to the Systematic, Step-by-Step Modeling of Complex Physical Systems
Modeling physical systems is an essential skill for a control engineer, since it enables a deep understanding of their dynamic behavior and, consequently, the development of effective control strategies. The first part of this article provides a tutorial description of the fundamental principles and properties of the Power-Oriented Graphs (POG) modeling technique. Various case studies in different energetic domains are then presented to consolidate the fundamental principles, each highlighting different features of the POG modeling technique. The latter is then compared with the other two main graphical modeling techniques available in the literature, namely Bond Graph (BG) and Energetic Macroscopic Representation (EMR). The second part of this article is again tutorial in nature and introduces the new Fast Modeling POG (FMPOG) procedure. The FMPOG, which operates in the POG framework, is a methodical step-by-step procedure that enables the readers to quickly derive the power-oriented graphical model of physical systems starting from their schematics. From the power-oriented graphical model, the state-space model can then be directly determined. To ensure the FMPOG procedure is easily usable by the entire community, we apply it to three examples in different energetic domains in this article, guiding the reader step-by-step through the derivation of the physical systems models.
Feedforward Controllers from Learned Dynamic Local Model Networks with Application to Excavator Assistance Functions
Complicated first principles modelling and controller synthesis can be prohibitively slow and expensive for high-mix, low-volume products such as hydraulic excavators. Instead, in a data-driven approach, recorded trajectories from the real system can be used to train local model networks (LMNs), for which feedforward controllers are derived via feedback linearization. However, previous works required LMNs without zero dynamics for feedback linearization, which restricts the model structure and thus modelling capacity of LMNs. In this paper, we overcome this restriction by providing a criterion for when feedback linearization of LMNs with zero dynamics yields a valid controller. As a criterion we propose the bounded-input bounded-output stability of the resulting controller. In two additional contributions, we extend this approach to consider measured disturbance signals and multiple inputs and outputs. We illustrate the effectiveness of our contributions in a hydraulic excavator control application with hardware experiments. To this end, we train LMNs from recorded, noisy data and derive feedforward controllers used as part of a leveling assistance system on the excavator. In our experiments, incorporating disturbance signals and multiple inputs and outputs enhances tracking performance of the learned controller. A video of our experiments is available at https://youtu.be/lrrWBx2ASaE.
Measurements and System Identification for the Characterization of Smooth Muscle Cell Dynamics
Biological tissue integrity is actively maintained by cells. It is essential to comprehend how cells accomplish this in order to stage tissue diseases. However, addressing the complexity of a cell's system of interrelated mechanisms poses a challenge. This necessitates a well-structured identification framework and an effective integration of measurements. Here we introduce the use of state-of-the-art frequency-domain system identification techniques combined with an indentation measurement platform to analyze the underlying mechanisms from the perspective of control system theory. The ultimate goal is to explore how mechanical and biological factors are related in induced Pluripotent Stem Cell-derived vascular smooth muscle cells. We study frequency-domain analysis for the investigation and characterization of the cellular dynamics of smooth muscle cells from the measured data. The measurement model in this study exploits the availability of human tissue and samples, enabling fundamental investigations of vascular tissue disease. This approach using human cell lines holds significant potential to decrease the necessity for animal-based safety and efficacy studies. The focus of this review is to investigate the cellular dynamics underlying the myogenic response and to demonstrate the practicability of employing a nano-indentation measurement setup for the broadband frequency-domain characterization of induced Pluripotent Stem Cell-derived vascular smooth muscle cells.
comment: 6 pages, 9 figures, presented in the Medical Measurements and Applications - MeMeA2024 conference
Performance Boundary Analyses for Statistical Multi-QoS Framework Over 6G SAGINs
To enable cost-effective universal access and the enhancement of current communication services, the space-air-ground integrated networks (SAGINs) have recently been developed due to their exceptional 3D coverage and the ability to guarantee rigorous and multidimensional demands for quality-of-service (QoS) provisioning, including delay and reliability across vast distances. In response to the complex, heterogeneous, and dynamic serving scenarios and stringent performance expectations for 6G SAGINs, it is crucial to undertake modeling, assurance, and analysis of the key technologies, aligned with the diverse demands for QoS provisioning in the non-asymptotic regime, i.e., when implementing finite blocklength coding (FBC) as a new dimension for error-rate bounded QoS metric. However, how to design new statistical QoS-driven performance modeling approaches that accurately delineate the complex and dynamic behaviors of networks, particularly in terms of constraining both delay and error rate, persists as a significant challenge for implementing mURLLC within 6G SAGINs in the finite blocklength regime. To overcome these difficulties, in this paper we propose to develop a set of analytical modeling frameworks for 6G SAGIN in supporting statistical delay and error-rate bounded QoS in the finite blocklength regime. First, we establish the SAGIN system architecture model. Second, the aggregate interference and decoding error probability functions are modeled and examined using the Laplace transform. Third, we introduce modeling techniques aimed at defining the $\epsilon$-effective capacity function as a crucial metric for facilitating statistical QoS standards with respect to delay and error-rate. To validate the effectiveness of the developed performance modeling schemes, we have executed a series of simulations over SAGINs.
Inline Photometrically Calibrated Hybrid Visual SLAM
This paper presents an integrated approach to Visual SLAM, merging online sequential photometric calibration within a Hybrid direct-indirect visual SLAM (H-SLAM). Photometric calibration helps normalize pixel intensity values under different lighting conditions, and thereby improves the direct component of our H-SLAM. A tangential benefit also accrues to the indirect component of H-SLAM, given that the detected features are more stable across variable lighting conditions. Our proposed photometrically calibrated H-SLAM is tested on several datasets, including the TUM monoVO as well as on a dataset we created. Calibrated H-SLAM outperforms other state-of-the-art direct, indirect, and hybrid Visual SLAM systems in all the experiments. Furthermore, in online SLAM tested at our site, it also significantly outperformed the other SLAM systems.
Distributed Robust Optimization Method for AC/MTDC Hybrid Power Systems with DC Network Cognizance
AC/multi-terminal DC (MTDC) hybrid power systems have emerged as a solution for the large-scale and long-distance accommodation of power produced by renewable energy systems (RESs). To ensure the optimal operation of such hybrid power systems, this paper addresses three key issues: system operational flexibility, centralized communication limitations, and RES uncertainties. Accordingly, a specific AC/DC optimal power flow (OPF) model and a distributed robust optimization method are proposed. Firstly, we apply a set of linear approximation and convex relaxation techniques to formulate the mixed-integer convex AC/DC OPF model. This model incorporates the DC network-cognizant constraint and enables DC topology reconfiguration. Next, generalized Benders decomposition (GBD) is employed to provide distributed optimization. Enhanced approaches are incorporated into GBD to achieve parallel computation and asynchronous updating. Additionally, the extreme scenario method (ESM) is embedded into the AC/DC OPF model to provide robust decisions to hedge against RES uncertainties. ESM is further extended to align with the GBD procedure. Numerical results are finally presented to validate the effectiveness of our proposed method.
Adaptive Single-Terminal Fault Location for DC Microgrids
Identifying faulty lines and their accurate location is key for rapidly restoring distribution systems. This will become a greater challenge as the penetration of power electronics increases, and contingencies are seen across larger areas. This paper proposes a single terminal methodology (i.e., no communication involved) that is robust to variations of key parameters (e.g., sampling frequency, system parameters, etc.) and performs particularly well for low resistance faults that constitute the majority of faults in low voltage DC systems. The proposed method uses local measurements to estimate the current caused by the other terminals affected by the contingency. This mimics the strategy followed by double terminal methods that require communications and decouples the accuracy of the methodology from the fault resistance. The algorithm takes consecutive voltage and current samples, including the estimated current of the other terminal, into the analysis. This mathematical methodology results in a better accuracy than other single-terminal approaches found in the literature. The robustness of the proposed strategy against different fault resistances and locations is demonstrated using MATLAB simulations.
comment: SEST 2024
Event-Triggered Non-Linear Control of Offshore MMC Grids for Asymmetrical AC Faults
Fault ride-through capability studies of MMC-HVDC connected wind power plants have focused primarily on the DC link and onshore AC grid faults. Offshore AC faults, mainly asymmetrical faults, have not gained much attention in the literature despite being included in the future development at national levels in the ENTSO-E HVDC code. The proposed work gives an event-triggered control to stabilize the system once the offshore AC fault has occurred and has been identified and isolated. Different types of control actions such as proportional-integral (PI) controller and super-twisted sliding mode control (STSMC) are used to smoothly transition the post-fault system to a new steady state operating point by suppressing the negative sequence control. Initially, the effect of a negative sequence current control scheme on the transient behavior of the power system with a PI controller is discussed in this paper. Further, a non-linear control strategy (STSMC) is proposed which gives quicker convergence of the system post-fault in comparison to PI control action. These post-fault control operations are only triggered in the presence of a fault in the system, i.e., they are event-triggered. The validity of the proposed strategy is demonstrated by simulation on a $\pm$525 kV, three-terminal meshed MMC-HVDC system model in Real Time Digital Simulator (RTDS).
The Bayesian Separation Principle for Data-driven Control
This paper investigates the existence of a separation principle between model identification and control design in the context of model predictive control. First, we elucidate that the separation principle holds asymptotically in the number of data in a Fisherian setting, and universally in a Bayesian setting. Then, by formulating model predictive control within a Gaussian regression framework, we describe how the Bayesian separation principle can be used to derive explicit, uncertainty-aware expressions for the control cost and optimal input sequence, thereby bridging direct and indirect data-driven approaches.
comment: 13 pages, 1 figure
Stochastic Shortest Path Problem with Failure Probability
We solve a sequential decision-making problem under uncertainty that takes into account the failure probability of a task. This problem cannot be handled by the stochastic shortest path problem, which is the standard model for sequential decision-making. This problem is addressed by introducing dead-ends. Conventionally, we only consider policies that minimize the probability of task failure, so the optimal policy constructed could be overly conservative. In this paper, we address this issue by expanding the search range to a class of policies whose failure probability is less than a desired threshold. This problem can be solved by treating it as a framework of a Bayesian Markov decision process and a two-person zero-sum game. Also, it can be seen that the optimal policy is expressed in the form of a probability distribution on a set of deterministic policies. We also demonstrate the effectiveness of the proposed methods by applying them to a motion planning problem with obstacle avoidance for a moving robot.
comment: 22 pages, 5 figures
Multirotor Nonlinear Model Predictive Control based on Visual Servoing of Evolving Features
This article presents a Visual Servoing Nonlinear Model Predictive Control (NMPC) scheme for autonomously tracking a moving target using multirotor Unmanned Aerial Vehicles (UAVs). The scheme is developed for surveillance and tracking of contour-based areas with evolving features. NMPC is used to manage input and state constraints, while additional barrier functions are incorporated in order to ensure system safety and optimal performance. The proposed control scheme is designed based on the extraction and implementation of the full dynamic model of the features describing the target and the state variables. Real-time simulations and experiments using a quadrotor UAV equipped with a camera demonstrate the effectiveness of the proposed strategy.
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models ICRA 2025
We propose the use of latent space generative world models to address the covariate shift problem in autonomous driving. A world model is a neural network capable of predicting an agent's next state given past states and actions. By leveraging a world model during training, the driving policy effectively mitigates covariate shift without requiring an excessive amount of training data. During end-to-end training, our policy learns how to recover from errors by aligning with states observed in human demonstrations, so that at runtime it can recover from perturbations outside the training distribution. Additionally, we introduce a novel transformer-based perception encoder that employs multi-view cross-attention and a learned scene query. We present qualitative and quantitative results, demonstrating significant improvements upon prior state of the art in closed-loop testing in the CARLA simulator, as well as showing the ability to handle perturbations in both CARLA and NVIDIA's DRIVE Sim.
comment: 7 pages, 6 figures, for ICRA 2025 conference, for associated video file, see https://youtu.be/9FpDFD9aiFU
A Fast Dynamic Internal Predictive Power Scheduling Approach for Power Management in Microgrids
This paper presents a Dynamic Internal Predictive Power Scheduling (DIPPS) approach for optimizing power management in microgrids, particularly focusing on external power exchanges among diverse prosumers. DIPPS utilizes a dynamic objective function with a time-varying binary parameter to control the timing of power transfers to the external grid, facilitated by efficient usage of energy storage for surplus renewable power. The microgrid power scheduling problem is modeled as a mixed-integer nonlinear programming (MINLP-PS) problem and subsequently transformed into a mixed-integer linear programming (MILP-PS) optimization through McCormick's relaxation to reduce the computational complexity. A predictive window with 6 data points is solved in an average of 0.92s, a 97.6% improvement over the 38.27s required for the MINLP-PS formulation, implying the numerical feasibility of the DIPPS approach for real-time implementation. Finally, the approach is validated against a static objective using real-world load data across three case studies with different time-varying parameters, demonstrating the ability of DIPPS to optimize power exchanges and efficiently utilize distributed resources while shifting the external power transfers to specified time durations.
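To make the McCormick step in the abstract above more concrete, the sketch below shows the standard McCormick envelope for a product z = b * x of a binary variable b and a continuous variable x bounded by `x_lo` and `x_hi`. Which bilinear terms the paper actually relaxes is not stated in the abstract, so this pairing is an assumption chosen for illustration. The last lines simply reproduce the reported 97.6% solve-time improvement from the two quoted timings.

```python
def mccormick_bounds(b, x, x_lo, x_hi):
    """Interval allowed for z by the four McCormick inequalities
    that replace the bilinear constraint z = b * x, with 0 <= b <= 1."""
    lower = max(x_lo * b, x - x_hi * (1 - b))
    upper = min(x_hi * b, x - x_lo * (1 - b))
    return lower, upper

# For binary b the envelope is exact: z is forced to equal b * x.
off = mccormick_bounds(0, 4.0, 0.0, 10.0)    # b = 0 pins z to 0
on = mccormick_bounds(1, 4.0, 0.0, 10.0)     # b = 1 pins z to x
# For fractional b (as in an LP relaxation) the envelope is only an outer bound.
relaxed = mccormick_bounds(0.5, 4.0, 0.0, 10.0)

# Solve-time improvement reported in the abstract.
improvement_pct = (38.27 - 0.92) / 38.27 * 100
```

The four inequalities are linear in b, x, and z, which is what lets a mixed-integer nonlinear model of this kind be handed to a mixed-integer linear solver.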
Robo-Platform: A Robotic System for Recording Sensors and Controlling Robots
Mobile smartphones compactly provide sensors such as cameras, IMUs, GNSS measurement units, and wireless and wired communication channels required for robotics projects. They are affordable, portable, and programmable, which makes them ideal for testing, data acquisition, controlling mobile robots, and many other robotic applications. A robotic system is proposed in this paper, consisting of an Android phone, a microcontroller board attached to the phone via USB, and a remote wireless controller station. In the data acquisition mode, the Android device can record a dataset of a diverse configuration of multiple cameras, IMUs, GNSS units, and external USB ADC channels in the rawest format used for, but not limited to, pose estimation and scene reconstruction applications. In robot control mode, the Android phone, a microcontroller board, and other peripherals constitute the mobile or stationary robotic system. This system is controlled using a remote server connected over Wi-Fi or Bluetooth. Experiments show that although the SLAM and AR applications can utilize the acquired data, the proposed system can pave the way for more advanced algorithms for processing these noisy and sporadic measurements. Moreover, the characteristics of the communication media are studied, and two example robotic projects, which involve controlling a toy car and a quadcopter, are included.
comment: Project repository: https://github.com/m-dayani/robo-platform Youtube Video: https://youtu.be/BTQ4yLB1bak Dataset: https://drive.google.com/drive/folders/1OZqdA1xa-SyJ64qL_TibqhtwhR1fWWrx?usp=sharing
$\mathcal{L}_{1}$ Adaptive Optimizer for Uncertain Time-Varying Convex Optimization
We propose an adaptive method for uncertain time-varying (TV) convex optimization, termed as $\mathcal{L}_{1}$ adaptive optimization ($\mathcal{L}_{1}$-AO). The proposed method uses a baseline TV optimizer with a prediction model, designed for the gradient dynamics to exploit the underlying structure of the temporal correlation. Inspired by $\mathcal{L}_{1}$ adaptive control, the proposed method augments an adaptive update law to estimate and compensate for the uncertainty from the inaccurate prediction in the online implementation. The proposed method provides the performance bounds of the error in the optimization variables and cost function, allowing efficient and reliable optimization for uncertain TV problems.
comment: 8 pages, 3 figures
Device for detection of activity-dependent changes in neural spheroids at MHz and GHz frequencies
Intracellular processes triggered by neural activity include changes in ionic concentrations, protein release, and synaptic vesicle cycling. These processes play significant roles in neurological disorders. The beneficial effects of brain stimulation may also be mediated through intracellular changes. There is a lack of label-free techniques for monitoring activity-dependent intracellular changes. Electromagnetic (EM) waves at frequencies larger than 1x10^6 Hz (1 MHz) were previously used to probe intracellular contents of cells, as the cell membrane becomes transparent in this frequency range. EM waves interact with membranes of intracellular organelles, proteins, and water in the MHz-GHz range. In this work, we developed a device for probing the interaction between intracellular contents of active neurons and EM waves. The device used an array of grounded coplanar waveguides (GCPWs) to deliver EM waves to a three-dimensional (3D) spheroid of rat cortical neurons. Neural activity was evoked using optogenetics, with synchronous detection of propagation of EM waves. Broadband measurements were conducted in the MHz-GHz range to track changes in transmission coefficients. Neuronal activity was found to reversibly alter EM wave transmission. Pharmacological suppression of neuronal activity abolished changes in transmission. Time constants of changes in transmission were in the range of seconds to tens of seconds, suggesting the presence of relatively slow, activity-dependent intracellular processes. This study provides the first evidence that EM transmission through neuronal tissue is activity-dependent in the MHz-GHz range. The device developed in this work may find future applications in studies of the mechanisms of neurological disorders and the development of new therapies.
On the Interplay of Clustering and Evolution in the Emergence of Epidemic Outbreaks
In an increasingly interconnected world, a key scientific challenge is to examine mechanisms that lead to the widespread propagation of contagions, such as misinformation and pathogens, and identify risk factors that can trigger large-scale outbreaks. Underlying both the spread of disease and misinformation epidemics is the evolution of the contagion as it propagates, leading to the emergence of different strains, e.g., through genetic mutations in pathogens and alterations in the information content. Recent studies have revealed that models that do not account for heterogeneity in transmission risks associated with different strains of the circulating contagion can lead to inaccurate predictions. However, existing results on multi-strain spreading assume that the network has a vanishingly small clustering coefficient, whereas clustering is widely known to be a fundamental property of real-world social networks. In this work, we investigate spreading processes that entail evolutionary adaptations on random graphs with tunable clustering and arbitrary degree distributions. We derive a mathematical framework to quantify the epidemic characteristics of a contagion that evolves as it spreads, with the structure of the underlying network as given via arbitrary joint degree distributions of single-edges and triangles. To the best of our knowledge, our work is the first to jointly analyze the impact of clustering and evolution on the emergence of epidemic outbreaks. We supplement our theoretical finding with numerical simulations and case studies, shedding light on the impact of clustering on contagion spread.
Precision Aquaculture: An Integrated Computer Vision and IoT Approach for Optimized Tilapia Feeding
Traditional fish farming practices often lead to inefficient feeding, resulting in environmental issues and reduced productivity. We developed an innovative system combining computer vision and IoT technologies for precise Tilapia feeding. Our solution uses real-time IoT sensors to monitor water quality parameters and computer vision algorithms to analyze fish size and count, determining optimal feed amounts. A mobile app enables remote monitoring and control. We utilized YOLOv8 for keypoint detection to measure Tilapia weight from length, achieving 94% precision on 3,500 annotated images. Pixel-based measurements were converted to centimeters using depth estimation for accurate feeding calculations. Our method, with data collection mirroring inference conditions, significantly improved results. Preliminary estimates suggest this approach could increase production up to 58 times compared to traditional farms. Our models, code, and dataset are open-source (the code, dataset, and models are available upon reasonable request).
comment: 8 pages, 6 figures, 3 tables, 21st International Conference on Informatics in Control, Automation, and Robotics
Sampling-based Stochastic Data-driven Predictive Control under Data Uncertainty
We present a stochastic constrained output-feedback data-driven predictive control scheme for linear time-invariant systems subject to bounded additive disturbances. The approach uses data-driven predictors based on an extension of Willems' fundamental lemma and requires only a single persistently exciting input-output data trajectory. Compared to current state-of-the-art approaches, we do not rely on availability of exact disturbance data. Instead, we leverage a novel parameterization of the unknown disturbance data considering consistency with the measured data and the system class. This allows for deterministic approximation of the chance constraints in a sampling-based fashion. A robust constraint on the first predicted step enables recursive feasibility, closed-loop constraint satisfaction, and robust asymptotic stability in expectation under standard assumptions. A numerical example demonstrates the efficiency of the proposed control scheme.
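The persistency-of-excitation requirement mentioned in the abstract above can be made concrete with a short sketch. For a scalar input sequence, the standard definition says the input is persistently exciting of order L when its depth-L Hankel matrix has full row rank. The code below checks only this textbook condition; it is not the paper's extended lemma, and the helper names are illustrative.

```python
def hankel_matrix(signal, depth):
    """Depth-`depth` Hankel matrix of a scalar sequence (list of rows)."""
    cols = len(signal) - depth + 1
    if cols < 1:
        raise ValueError("signal too short for the requested depth")
    return [[signal[r + c] for c in range(cols)] for r in range(depth)]


def matrix_rank(rows, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting (small matrices only)."""
    a = [[float(x) for x in row] for row in rows]
    n_rows, n_cols = len(a), len(a[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n_rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n_cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


def is_persistently_exciting(signal, order):
    """True if the scalar sequence is persistently exciting of the given order."""
    return matrix_rank(hankel_matrix(signal, order)) == order
```

A constant input fails the check for any order above one, which is why the data trajectory used for prediction has to be sufficiently rich.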
GPU-Accelerated DCOPF using Gradient-Based Optimization
DC Optimal Power Flow (DCOPF) is a key operational tool for power system operators, and it is embedded as a subproblem in many challenging optimization problems (e.g., line switching). However, traditional CPU-based solve routines (e.g., simplex) have saturated in speed and are hard to parallelize. This paper focuses on solving DCOPF problems using gradient-based routines on Graphics Processing Units (GPUs), which have massive parallelization capability. To formulate these problems, we pose a Lagrange dual associated with DCOPF (linear and quadratic cost curves), and then we explicitly solve the inner (primal) minimization problem with a dual norm. The resulting dual problem can be efficiently iterated using projected gradient ascent. After solving the dual problem on both CPUs and GPUs to find tight lower bounds, we benchmark against Gurobi and MOSEK, comparing convergence speed and tightness on the IEEE 2000, 4601, and 10000 bus systems. We provide reliable and tight lower bounds for these problems with, at best, 5.4x speedup over a conventional solver.
Identification of Additive Continuous-time Systems in Open and Closed loop
When identifying electrical, mechanical, or biological systems, parametric continuous-time identification methods can lead to interpretable and parsimonious models when the model structure aligns with the physical properties of the system. Traditional linear system identification may not consider the most parsimonious model when relying solely on unfactored transfer functions, which typically result from standard direct approaches. This paper presents a novel identification method that delivers additive models for both open and closed-loop setups. The estimators that are derived are shown to be generically consistent, and can admit the identification of marginally stable additive systems. Numerical simulations show the efficacy of the proposed approach, and its performance in identifying a modal representation of a flexible beam is verified using experimental data.
comment: 15 pages, 6 figures
Instantaneous Frequency Estimation in Unbalanced Systems Using Affine Differential Geometry
The paper discusses the relationships between electrical and affine differential geometry quantities, establishing a link between frequency and time derivatives of voltage, through the utilization of affine geometric invariants. Based on this link, a new instantaneous frequency estimation formula is proposed, which is particularly suited for unbalanced and single-phase systems. Several examples as well as measurements based on two real-world events illustrate the findings of the paper.
Proactive Emergency Collision Avoidance for Automated Driving in Highway Scenarios
Uncertainty in the behavior of other traffic participants is a crucial factor in collision avoidance for automated driving; here, stochastic metrics could avoid overly conservative decisions. This paper introduces a Stochastic Model Predictive Control (SMPC) planner for emergency collision avoidance in highway scenarios to proactively minimize collision risk while ensuring safety through chance constraints. To guarantee that the emergency trajectory can be attained, we incorporate nonlinear tire dynamics in the prediction model of the ego vehicle. Further, we exploit Max-Min-Plus-Scaling (MMPS) approximations of the nonlinearities to avoid conservatism, enforce proactive collision avoidance, and improve computational efficiency in terms of performance and speed. Consequently, our contributions include integrating a dynamic ego vehicle model into the SMPC planner, introducing the MMPS approximation for real-time implementation in emergency scenarios, and integrating SMPC with hybridized chance constraints and risk minimization. We evaluate our SMPC formulation in terms of proactivity and efficiency in various hazardous scenarios. Moreover, we demonstrate the effectiveness of our proposed approach by comparing it with a state-of-the-art SMPC planner and we validate that the generated trajectories can be attained using a high-fidelity vehicle model in IPG CarMaker.
comment: 14 pages, 11 figures, submitted to IEEE Transactions on Control Systems Technology
Mamba as a motion encoder for robotic imitation learning
Recent advancements in imitation learning, particularly with the integration of LLM techniques, are set to significantly improve robots' dexterity and adaptability. This paper proposes using Mamba, a state-of-the-art architecture with potential applications in LLMs, for robotic imitation learning, highlighting its ability to function as an encoder that effectively captures contextual information. By reducing the dimensionality of the state space, Mamba operates similarly to an autoencoder. It effectively compresses the sequential information into state variables while preserving the essential temporal dynamics necessary for accurate motion prediction. Experimental results in tasks such as cup placing and case loading demonstrate that despite exhibiting higher estimation errors, Mamba achieves superior success rates compared to Transformers in practical task execution. This performance is attributed to Mamba's structure, which encompasses the state space model. Additionally, the study investigates Mamba's capacity to serve as a real-time motion generator with a limited amount of training data.
comment: 8 pages, 9 figures
Model-Free Generic Robust Control for Servo-Driven Actuation Mechanisms with Layered Insight into Energy Conversions
To advance theoretical solutions and address limitations in modeling complex servo-driven actuation systems experiencing high non-linearity and load disturbances, this paper aims to design a practical model-free generic robust control (GRC) framework for these mechanisms. This framework is intended to be applicable across all actuator systems encompassing electrical, hydraulic, or pneumatic servomechanisms, while also functioning within complex interactions among dynamic components and adhering to control input constraints. In this respect, the state-space model of actuator systems is decomposed into smaller subsystems that incorporate the first principle equation of actuator motion dynamics and interactive energy conversion equations. This decomposition operates under the assumption that the comprehensive model of the servo-driven actuator system and energy conversion, uncertainties, load disturbances, and their bounds are unknown. Then, the GRC employs subsystem-based adaptive control strategies for each state-variant subsystem separately. Despite control input constraints and the unknown interactive system model, the GRC-applied actuator mechanism ensures uniform exponential stability and robustness in tracking desired motions. It features straightforward implementation, experimentally evaluated by applying it to two industrial applications.
comment: This work has been submitted for possible publication in the IEEE
SIMBa: System Identification Methods leveraging Backpropagation
This manuscript details and extends the SIMBa toolbox (System Identification Methods leveraging Backpropagation) presented in previous work, which uses well-established Machine Learning tools for discrete-time linear multi-step-ahead state-space System Identification (SI). SIMBa leverages linear-matrix-inequality-based free parametrizations of Schur matrices to guarantee the stability of the identified model by design. In this paper, backed up by novel free parametrizations of Schur matrices, we extend the toolbox to show how SIMBa can incorporate known sparsity patterns or true values of the state-space matrices to identify without jeopardizing stability. We extensively investigate SIMBa's behavior when identifying diverse systems with various properties from both simulated and real-world data. Overall, we find it consistently outperforms traditional stable subspace identification methods, and sometimes significantly, especially when enforcing desired model properties. These results hint at the potential of SIMBa to pave the way for generic structured nonlinear SI. The toolbox is open-sourced on https://github.com/Cemempamoi/simba.
comment: First two authors contributed equally. Submitted to IEEE TCST
An Alternative to Multi-Factor Authentication with a Triple-Identity Authentication Scheme
The existing authentication system has two entry points (i.e., username and password fields) to interact with the outside, but neither of them has a gatekeeper, making the system vulnerable to cyberattacks. In order to ensure the authentication security, the system sets a third entry point and uses an external MFA service to guard it. The crux of the problem is that the system has no internal mechanism to guard its own entry points, as no identifiers can be defined for the username and password without using any personal information. To solve this problem, we open the hash algorithm of a dual-password login-authentication system to three login credentials. Therefore, the intermediate elements of the algorithm can be used to define an identifier to verify the user identity at each entry point of the system. As a result of the above setup, a triple-identity authentication is established, the key of which is that the user's readily available login name and password are randomly converted into a matrix of meaningless hash elements which are concealed, incommunicable, inaccessible, and independent of personal information. The identifiers defined using such elements can therefore be used by the system to verify the identities of the user at all the entry points of the system, thereby ensuring the authentication security without relying on MFA services.
comment: 5 pages, 2 figures, 11 conferences
Towards Autonomous Supply Chains: Definition, Characteristics, Conceptual Framework, and Autonomy Levels
Recent global disruptions, such as the pandemic and geopolitical conflicts, have profoundly exposed vulnerabilities in traditional supply chains, requiring exploration of more resilient alternatives. Autonomous supply chains (ASCs) have emerged as a potential solution, offering increased visibility, flexibility, and resilience in turbulent trade environments. Despite discussions in industry and academia over several years, ASCs lack well-established theoretical foundations. This paper addresses this research gap by presenting a formal definition of ASC along with its defining characteristics and auxiliary concepts. We propose a layered conceptual framework called the MIISI model. An illustrative case study focusing on the meat supply chain demonstrates an initial ASC implementation based on this conceptual model. Additionally, we introduce a seven-level supply chain autonomy reference model, delineating a trajectory towards achieving full supply chain autonomy. Recognising that this work represents an initial endeavour, we emphasise the need for continued exploration in this emerging domain. We anticipate that this work will stimulate further research, both theoretical and technical, and contribute to the continual evolution of ASCs.
comment: This paper includes 19 pages and 8 figures and has been accepted for publication in the Journal of Industrial Information Integration
Stochastic Data-Driven Predictive Control with Equivalence to Stochastic MPC
We propose a data-driven receding-horizon control method dealing with the chance-constrained output-tracking problem of unknown stochastic linear time-invariant (LTI) systems with partial state observation. The proposed method takes into account the statistics of the process noise, the measurement noise and the uncertain initial condition, following an analogous framework to Stochastic Model Predictive Control (SMPC), but does not rely on the use of a parametric system model. As such, our receding-horizon algorithm produces a sequence of closed-loop control policies for predicted time steps, as opposed to a sequence of open-loop control actions. Under certain conditions, we establish that our proposed data-driven control method produces control inputs identical to those produced by the associated model-based SMPC. Simulation results on a grid-connected power converter are provided to illustrate the performance benefits of our methodology.
comment: 20 pages, 4 figures. The extended version of a submission to IEEE Transactions on Automatic Control
Robust Adaptive MPC Using Uncertainty Compensation
This paper presents an uncertainty compensation-based robust adaptive model predictive control (MPC) framework for linear systems with both matched and unmatched nonlinear uncertainties subject to both state and input constraints. In particular, the proposed control framework leverages an L1 adaptive controller (L1AC) to compensate for the matched uncertainties and to provide guaranteed uniform bounds on the error between the states and control inputs of the actual system and those of a nominal, i.e., uncertainty-free, system. The performance bounds provided by the L1AC are then used to tighten the state and control constraints of the actual system, and a model predictive controller is designed for the nominal system with the tightened constraints. The proposed control framework, which we denote as uncertainty compensation-based MPC (UC-MPC), guarantees constraint satisfaction and achieves improved performance compared with existing methods. Simulation results on a flight control example demonstrate the benefits of the proposed framework.
comment: arXiv admin note: text overlap with arXiv:2208.02985
Applications of Lifted Nonlinear Cuts to Convex Relaxations of the AC Power Flow Equations
We demonstrate that valid inequalities, or lifted nonlinear cuts (LNC), can be projected to tighten the Second Order Cone (SOC), Convex DistFlow (CDF), and Network Flow (NF) relaxations of the AC Optimal Power Flow (AC-OPF) problem. We conduct experiments on 36 cases from the PGLib-OPF library for two objective functions, (1) power generation maximization and (2) generation cost minimization. Significant optimality gap improvements are shown for the maximization problem, where the LNC strengthen the SOC and CDF relaxations in 100% of the test cases, with average and maximum differences in the optimality gaps of 23.1% and 93.5% respectively. The NF relaxation is strengthened in 79.2% of test cases, with average and maximum differences in the optimality gaps of 3.45% and 21.2% respectively. We also study the trade-off between relaxation quality and solve time, demonstrating that the strengthened CDF relaxation outperforms the strengthened SOC formulation in terms of runtime and number of iterations needed, while the strengthened NF formulation is the most scalable with the lowest relaxation quality provided by these LNC.
Probabilistic Metaplasticity for Continual Learning with Memristors
Edge devices operating in dynamic environments critically need the ability to continually learn without catastrophic forgetting. The strict resource constraints in these devices pose a major challenge to achieving this, as continual learning entails memory and computational overhead. Crossbar architectures using memristor devices offer energy efficiency through compute-in-memory and hold promise to address this issue. However, memristors often exhibit low precision and high variability in conductance modulation, rendering them unsuitable for continual learning solutions that require precise modulation of weight magnitude for consolidation. Current approaches fall short of addressing this challenge directly and rely on auxiliary high-precision memory, leading to frequent memory access, high memory overhead, and energy dissipation. In this research, we propose probabilistic metaplasticity, which consolidates weights by modulating their update probability rather than magnitude. The proposed mechanism eliminates high-precision modification to weight magnitudes and, consequently, the need for auxiliary high-precision memory. We demonstrate the efficacy of the proposed mechanism by integrating probabilistic metaplasticity into a spiking network trained on an error threshold with low-precision memristor weights. Evaluations of continual learning benchmarks show that probabilistic metaplasticity achieves performance equivalent to state-of-the-art continual learning models with high-precision weights while consuming ~67% lower memory for additional parameters and up to ~60x lower energy during parameter updates compared to an auxiliary memory-based solution. The proposed model shows potential for energy-efficient continual learning with low-precision emerging devices.
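The core idea in the abstract above, consolidating a weight by lowering the probability that it is updated instead of shrinking the update, can be sketched in a few lines. The exponential mapping from a per-weight metaplastic state to an update probability is an assumption made here for illustration; the abstract does not give the actual function, and all names are hypothetical.

```python
import math
import random


def metaplastic_update(weight, direction, meta_state, step, rng):
    """Apply a fixed-size weight step with probability exp(-meta_state).

    `meta_state` >= 0 is a per-weight consolidation level (assumed form):
    0 means the weight always updates, large values make updates rare.
    The step size never changes, so no high-precision magnitude is needed.
    """
    p_update = math.exp(-meta_state)
    if rng.random() < p_update:
        return weight + step * direction, True
    return weight, False


rng = random.Random(0)
plastic_weight, plastic_changed = metaplastic_update(0.5, +1, 0.0, 0.1, rng)
frozen_weight, frozen_changed = metaplastic_update(0.5, +1, 1000.0, 0.1, rng)
```

Because each update is an all-or-nothing step of fixed size, the scheme suits devices whose conductance can only be changed in coarse increments.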
Multiagent Systems
Introducing Anisotropic Fields for Enhanced Diversity in Crowd Simulation
Large crowds exhibit intricate behaviors and significant emergent properties, yet existing crowd simulation systems often lack behavioral diversity, resulting in homogeneous simulation outcomes. To address this limitation, we propose incorporating anisotropic fields (AFs) as a fundamental structure for depicting the uncertainty in crowd movement. By leveraging AFs, our method can rapidly generate crowd simulations with intricate behavioral patterns that better reflect the inherent complexity of real crowds. The AFs are generated either through intuitive sketching or extracted from real crowd videos, enabling flexible and efficient crowd simulation systems. We demonstrate the effectiveness of our approach through several representative scenarios, showcasing a significant improvement in behavioral diversity compared to classical methods. Our findings indicate that by incorporating AFs, crowd simulation systems can achieve a much higher similarity to real-world crowd systems. Our code is publicly available at https://github.com/tomblack2014/AF_Generation.
comment: 25 pages, 12 figures
Opponent Shaping for Antibody Development
Anti-viral therapies are typically designed to target the current strains of a virus. Game theoretically, this corresponds to a short-sighted, or myopic, response. However, therapy-induced selective pressures act on viral antigens to drive the emergence of mutated strains, against which initial therapies have reduced efficacy. Building on a computational model of binding between antibodies and viral antigens (the Absolut! framework), we design and implement a genetic simulation of such viral evolutionary escape. Crucially, this allows our antibody optimisation algorithm to consider and influence the entire escape curve of the virus, i.e. to guide (or "shape") the viral evolution. This is inspired by opponent shaping which, in general-sum learning, accounts for the adaptation of the co-player rather than playing a myopic best response. Hence we call the optimised antibodies shapers. Within our simulations, we demonstrate that our shapers target both current and simulated future viral variants, outperforming the antibodies chosen in a myopic way. Furthermore, we show that shapers exert specific evolutionary pressure on the virus compared to myopic antibodies. Altogether, shapers modify the evolutionary trajectories of viral strains and minimise the viral escape compared to their myopic counterparts. While this is a simplified model, we hope that our proposed paradigm will enable the discovery of better long-lived vaccines and antibody therapies in the future, enabled by rapid advancements in the capabilities of simulation tools. Our code is available at https://github.com/olakalisz/antibody-shapers.
comment: Preprint
Cooperative Resilience in Artificial Intelligence Multiagent Systems
Resilience refers to the ability of systems to withstand, adapt to, and recover from disruptive events. While studies on resilience have attracted significant attention across various research domains, the precise definition of this concept within the field of cooperative artificial intelligence remains unclear. This paper addresses this gap by proposing a clear definition of "cooperative resilience" and outlining a methodology for its quantitative measurement. The methodology is validated in an environment with RL-based and LLM-augmented autonomous agents, subjected to environmental changes and the introduction of agents with unsustainable behaviors. These events are parameterized to create various scenarios for measuring cooperative resilience. The results highlight the crucial role of resilience metrics in analyzing how the collective system prepares for, resists, recovers from, sustains well-being, and transforms in the face of disruptions. These findings provide foundational insights into the definition, measurement, and preliminary analysis of cooperative resilience, offering significant implications for the broader field of AI. Moreover, the methodology and metrics developed here can be adapted to a wide range of AI applications, enhancing the reliability and effectiveness of AI in dynamic and unpredictable environments.
comment: Supplementary material in https://github.com/mavivi95/resilience/blob/main/Supplementary_File.pdf
On Collaboration in Distributed Parameter Estimation with Resource Constraints
Effective resource allocation in sensor networks, IoT systems, and distributed computing is essential for applications such as environmental monitoring, surveillance, and smart infrastructure. Sensors or agents must optimize their resource allocation to maximize the accuracy of parameter estimation. In this work, we consider a group of sensors or agents, each sampling from a different variable of a multivariate Gaussian distribution and having a different estimation objective. We formulate a sensor or agent's data collection and collaboration policy design problem as a Fisher information maximization (or Cramer-Rao bound minimization) problem. This formulation captures a novel trade-off in energy use, between locally collecting univariate samples and collaborating to produce multivariate samples. When knowledge of the correlation between variables is available, we analytically identify two cases: (1) where the optimal data collection policy entails investing resources to transfer information for collaborative sampling, and (2) where knowledge of the correlation between samples cannot enhance estimation efficiency. When knowledge of certain correlations is unavailable, but collaboration remains potentially beneficial, we propose novel approaches that apply multi-armed bandit algorithms to learn the optimal data collection and collaboration policy in our sequential distributed parameter estimation problem. We illustrate the effectiveness of the proposed algorithms, DOUBLE-F, DOUBLE-Z, UCB-F, UCB-Z, through simulation.
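The abstract above applies multi-armed bandit algorithms to learn a data collection and collaboration policy. As a rough illustration of the bandit component only, the sketch below implements plain UCB1 over a small set of candidate policies. It is not DOUBLE-F, DOUBLE-Z, UCB-F, or UCB-Z, whose details the abstract does not give. The reward callback is hypothetical and stands in for whatever estimation-quality signal a sensor observes after following a policy.

```python
import math


def ucb1(pull, n_arms, rounds):
    """Generic UCB1: each arm is one candidate collection/collaboration policy.

    `pull(arm)` returns a reward in [0, 1] (hypothetical estimation-quality signal).
    Returns the pull counts and empirical mean reward per arm.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once to initialize its estimate
        else:
            arm = max(
                range(n_arms),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means


# Two made-up policies: arm 0 = sample locally, arm 1 = collaborate.
counts, means = ucb1(lambda arm: [0.2, 0.8][arm], n_arms=2, rounds=500)
```

The exploration bonus shrinks as an arm is pulled more often, so the learner keeps testing the weaker policy occasionally while spending most rounds on the better one.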
On the Principles behind Opinion Dynamics in Multi-Agent Systems of Large Language Models
We study the evolution of opinions inside a population of interacting large language models (LLMs). Every LLM needs to decide how much funding to allocate to an item with three initial possibilities: full, partial, or no funding. We identify biases that drive the exchange of opinions based on the LLM's tendency to find consensus with the other LLM's opinion, display caution when specifying funding, and consider ethical concerns in its opinion. We find these biases are affected by the perceived absence of compelling reasons for opinion change, the perceived willingness to engage in discussion, and the distribution of allocation values. Moreover, tensions among biases can lead to the survival of funding for items with negative connotations. We also find that the final distribution of full, partial, and no funding opinions is more diverse when an LLM freely forms its opinion after an interaction than when its opinion is a multiple-choice selection among the three allocation options. In the latter case, consensus is mostly attained. When agents are aware of past opinions, they seek to maintain consistency with them, changing the opinion dynamics. Our study is performed using Llama 3 and Mistral LLMs.
Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming
Preference-based Reinforcement Learning (PbRL) has made significant strides in single-agent settings, but has not been studied for multi-agent frameworks. On the other hand, modeling cooperation between multiple agents, specifically in Human-AI Teaming settings, while ensuring successful task completion is a challenging problem. To this end, we perform the first investigation of multi-agent PbRL by extending single-agent PbRL to the two-agent teaming setting and formulating it as a Human-AI PbRL Cooperation Game, where the RL agent queries the human-in-the-loop to elicit the task objective and the human's preferences on the joint team behavior. Under this game formulation, we first introduce the notion of Human Flexibility to evaluate team performance based on whether humans prefer to follow a fixed policy or adapt to the RL agent on the fly. Secondly, we study the RL agent's varying access to the human policy. We highlight a special case along these two dimensions, which we call Specified Orchestration, where the human is least flexible and the agent has complete access to the human policy. We motivate the need for taking Human Flexibility into account and the usefulness of Specified Orchestration through a gamified user study. We evaluate state-of-the-art PbRL algorithms for Human-AI cooperative setups through robot locomotion based domains that explicitly require forced cooperation. Our findings highlight the challenges associated with PbRL by varying Human Flexibility and the agent's access to the human policy. Finally, we draw insights from our user study and empirical results, and conclude that Specified Orchestration can be seen as an upper bound on PbRL performance for future research in Human-AI teaming scenarios.
Robotics
Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered. Previous research employed interactive perception for manipulating articulated objects, but such approaches are typically open-loop and often overlook the interaction dynamics. To address this limitation, we present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds. Our method can build on any interactive perception technique, inducing slight object movement to generate point cloud frames of the evolving dynamic scene. These point clouds are then segmented using Segment Anything Model 2 (SAM2), after which the moving part of the object is masked for accurate online axis estimation, guiding subsequent robotic actions. Our approach significantly enhances the precision and efficiency of manipulation tasks involving articulated objects. Experiments in simulated environments demonstrate that our method outperforms baseline approaches, especially in tasks that demand precise axis-based control. Project Page: https://hytidel.github.io/video-tracking-for-axis-estimation/.
comment: Project Page: https://hytidel.github.io/video-tracking-for-axis-estimation/
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
How can robot manipulation policies generalize to novel tasks involving unseen object types and new motions? In this paper, we provide a solution in terms of predicting motion information from web data through human video generation and conditioning a robot policy on the generated video. Instead of attempting to scale robot data collection which is expensive, we show how we can leverage video generation models trained on easily available web data, for enabling generalization. Our approach Gen2Act casts language-conditioned manipulation as zero-shot human video generation followed by execution with a single policy conditioned on the generated video. To train the policy, we use an order of magnitude less robot interaction data compared to what the video prediction model was trained on. Gen2Act doesn't require fine-tuning the video model at all and we directly use a pre-trained model for generating human videos. Our results on diverse real-world scenarios show how Gen2Act enables manipulating unseen object types and performing novel motions for tasks not present in the robot data. Videos are at https://homangab.github.io/gen2act/
comment: Preprint. Under Review
Generative Factor Chaining: Coordinated Manipulation with Diffusion-based Factor Graph
Learning to plan for multi-step, multi-manipulator tasks is notoriously difficult because of the large search space and the complex constraint satisfaction problems. We present Generative Factor Chaining (GFC), a composable generative model for planning. GFC represents a planning problem as a spatial-temporal factor graph, where nodes represent objects and robots in the scene, spatial factors capture the distributions of valid relationships among nodes, and temporal factors represent the distributions of skill transitions. Each factor is implemented as a modular diffusion model, and the factors are composed during inference to generate feasible long-horizon plans through bi-directional message passing. We show that GFC can solve complex bimanual manipulation tasks and exhibits strong generalization to unseen planning tasks with novel combinations of objects and constraints. More details can be found at: https://generative-fc.github.io/
comment: 28 pages, 17 figures, 2024 Conference on Robot Learning
REBEL: Rule-based and Experience-enhanced Learning with LLMs for Initial Task Allocation in Multi-Human Multi-Robot Teams
Multi-human multi-robot teams combine the complementary strengths of humans and robots to tackle complex tasks across diverse applications. However, the inherent heterogeneity of these teams presents significant challenges in initial task allocation (ITA), which involves assigning the most suitable tasks to each team member based on their individual capabilities before task execution. While current learning-based methods have shown promising results, they are often computationally expensive to train, and lack the flexibility to incorporate user preferences in multi-objective optimization and adapt to last-minute changes in real-world dynamic environments. To address these issues, we propose REBEL, an LLM-based ITA framework that integrates rule-based and experience-enhanced learning. By leveraging Retrieval-Augmented Generation, REBEL dynamically retrieves relevant rules and past experiences, enhancing reasoning efficiency. Additionally, REBEL can complement pre-trained RL-based ITA policies, improving situational awareness and overall team performance. Extensive experiments validate the effectiveness of our approach across various settings. More details are available at https://sites.google.com/view/ita-rebel .
Fast Extrinsic Calibration for Multiple Inertial Measurement Units in Visual-Inertial System
In this paper, we propose a fast extrinsic calibration method for fusing multiple inertial measurement units (MIMU) to improve visual-inertial odometry (VIO) localization accuracy. Currently, data fusion algorithms for MIMU highly depend on the number of inertial sensors. Based on the assumption that extrinsic parameters between inertial sensors are perfectly calibrated, the fusion algorithm provides better localization accuracy with more IMUs, while neglecting the effect of extrinsic calibration error. Our method builds two non-linear least-squares problems to estimate the MIMU relative position and orientation separately, without relying on external sensors or online estimation of inertial noise. Then we give the general form of the virtual IMU (VIMU) method and propose its propagation on manifold. We evaluate our method on datasets, our self-made sensor board, and a board with different IMUs, validating the superiority of our method over competing methods concerning speed, accuracy, and robustness. In the simulation experiment, we show that fusing only two IMUs with our calibration method to predict motion can rival nine IMUs. Real-world experiments demonstrate better localization accuracy of the VIO integrated with our calibration method and VIMU propagation on manifold.
Tiny Robotics Dataset and Benchmark for Continual Object Detection
Detecting objects in mobile robotics is crucial for numerous applications, from autonomous navigation to inspection. However, robots are often required to perform tasks in domains that differ from the training one and need to adapt to these changes. Tiny mobile robots, subject to size, power, and computational constraints, encounter even more difficulties in running and adapting these algorithms. Such adaptability, though, is crucial for real-world deployment, where robots must operate effectively in dynamic and unpredictable settings. In this work, we introduce a novel benchmark to evaluate the continual learning capabilities of object detection systems in tiny robotic platforms. Our contributions include: (i) Tiny Robotics Object Detection (TiROD), a comprehensive dataset collected using a small mobile robot, designed to test the adaptability of object detectors across various domains and classes; (ii) an evaluation of state-of-the-art real-time object detectors combined with different continual learning strategies on this dataset, providing detailed insights into their performance and limitations; and (iii) the release of the data and the code to replicate the results, to foster continuous advancements in this field. Our benchmark results indicate key challenges that must be addressed to advance the development of robust and efficient object detection systems for tiny robotics.
comment: Paper under review
TE-PINN: Quaternion-Based Orientation Estimation using Transformer-Enhanced Physics-Informed Neural Networks
This paper introduces a Transformer-Enhanced Physics-Informed Neural Network (TE-PINN) designed for accurate quaternion-based orientation estimation in high-dynamic environments, particularly within the field of robotics. By integrating transformer networks with physics-informed learning, our approach innovatively captures temporal dependencies in sensor data while enforcing the fundamental physical laws governing rotational motion. TE-PINN leverages a multi-head attention mechanism to handle sequential data from inertial sensors, such as accelerometers and gyroscopes, ensuring temporal consistency. Simultaneously, the model embeds quaternion kinematics and rigid body dynamics into the learning process, aligning the network's predictions with mechanical principles like Euler's laws of motion. The physics-informed loss function incorporates the dynamics of angular velocity and external forces, enhancing the network's ability to generalize in complex scenarios. Our experimental evaluation demonstrates that TE-PINN consistently outperforms traditional methods such as Extended Kalman Filters (EKF) and LSTM-based estimators, particularly in scenarios characterized by high angular velocities and noisy sensor data. The results show a significant reduction in mean quaternion error and improved gyroscope bias estimation compared to the state-of-the-art. An ablation study further isolates the contributions of both the transformer architecture and the physics-informed constraints, highlighting the synergistic effect of both components in improving model performance. The proposed model achieves real-time performance on embedded systems typical of mobile robots, offering a scalable and efficient solution for orientation estimation in autonomous systems.
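The quaternion kinematics mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard relation q_dot = 0.5 * q ⊗ (0, ω) with body-frame angular rates and a simple Euler step; the function names (`integrate_gyro`, `kinematics_residual`) are hypothetical and are not taken from the paper, whose actual loss terms are not given here.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def normalize(q):
    """Rescale a quaternion to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)


def integrate_gyro(q, omega, dt):
    """One Euler step of q_dot = 0.5 * q ⊗ (0, omega), omega in rad/s (body frame)."""
    q_dot = quat_multiply(q, (0.0, omega[0], omega[1], omega[2]))
    q_new = tuple(qi + 0.5 * dt * di for qi, di in zip(q, q_dot))
    return normalize(q_new)


def kinematics_residual(q_prev, q_next, omega, dt):
    """Squared mismatch between q_next and the quaternion propagated from q_prev.

    Illustrative physics-style penalty only (an assumption, not the paper's loss).
    It ignores the q / -q sign ambiguity for brevity.
    """
    q_prop = integrate_gyro(q_prev, omega, dt)
    return sum((a - b) ** 2 for a, b in zip(q_next, q_prop))
```

A residual of this kind is zero when a predicted orientation sequence is consistent with the measured angular rates, which is one simple way a kinematic constraint could enter a training objective.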
Context-Based Meta Reinforcement Learning for Robust and Adaptable Peg-in-Hole Assembly Tasks ICRA 2025
Peg-in-hole assembly in unknown environments is a challenging task due to onboard sensor errors, which result in uncertainty and variations in task parameters such as the hole position and orientation. Meta Reinforcement Learning (Meta RL) has been proposed to mitigate this problem as it learns how to quickly adapt to new tasks with different parameters. However, previous approaches either depend on a sample-inefficient procedure or human demonstrations to perform the task in the real world. Our work modifies the data used by the Meta RL agent and uses simple features that can be easily measured in the real world even with an uncalibrated camera. We further adapt the Meta RL agent to use data from a force/torque sensor, instead of the camera, to perform the assembly, using a small amount of training data. Finally, we propose a fine-tuning method that consistently and safely adapts to out-of-distribution tasks with parameters that differ by a factor of 10 from the training tasks. Our results demonstrate that the proposed data modification significantly enhances the training and adaptation efficiency and enables the agent to achieve 100% success in tasks with different hole positions and orientations. Experiments on a real robot confirm that both camera- and force/torque sensor-equipped agents achieve 100% success in tasks with unknown hole positions, matching their simulation performance and validating the approach's robustness and applicability. Compared to the previous work with sample-inefficient adaptation, our proposed methods are 10 times more sample-efficient in the real-world tasks.
comment: 8 pages, 9 figures, submitted to ICRA 2025
A Universal Multi-Vehicle Cooperative Decision-Making Approach in Structured Roads by Mixed-Integer Potential Game
Due to the intricacy of real-world road topologies and the inherent complexity of autonomous vehicles, cooperative decision-making for multiple connected autonomous vehicles (CAVs) remains a significant challenge. Currently, most methods are tailored to specific scenarios, and the efficiency of existing optimization and learning methods applicable to diverse scenarios is hindered by the complexity of modeling and data dependency, which limit their real-world applicability. To address these issues, this paper proposes a universal multi-vehicle cooperative decision-making method in structured roads with game theory. We transform the decision-making problem into a graph path searching problem within a way-point graph framework. The problem is first formulated as a mixed-integer linear programming (MILP) problem and then transformed into a mixed-integer potential game (MIPG), which reduces the scope of the problem and ensures that no player needs to sacrifice for the overall cost. Two Gauss-Seidel algorithms for cooperative decision-making are presented to solve the MIPG problem and obtain the Nash equilibrium solutions. Specifically, the sequential Gauss-Seidel algorithm for cooperative decision-making considers the varying degrees of CAV interactions and flexibility in adjustment strategies to determine optimization priorities, which reduces the frequency of ineffective optimizations. Experimental evaluations across various urban traffic scenarios with different topological structures demonstrate the effectiveness and efficiency of the proposed method compared with MILP, and comparisons of different optimization sequences validate the efficiency of the sequential Gauss-Seidel algorithm for cooperative decision-making.
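The Gauss-Seidel idea in the abstract above, where players update one at a time until no one can improve, can be sketched on a toy potential game. Everything below is an illustrative assumption rather than the paper's formulation: the lane-choice congestion cost, the `prefs` table, and the function names are hypothetical, and each per-player step here is a brute-force search instead of a mixed-integer program.

```python
def player_cost(i, actions, prefs):
    """Cost of player i: number of players in its lane plus its own lane preference."""
    lane = actions[i]
    load = sum(1 for a in actions if a == lane)
    return load + prefs[i][lane]


def gauss_seidel_best_response(prefs, n_lanes, order=None, max_sweeps=100):
    """Sequential best-response sweeps; stops when a full sweep changes nothing."""
    n = len(prefs)
    actions = [0] * n
    order = list(order) if order is not None else list(range(n))
    for _ in range(max_sweeps):
        changed = False
        for i in order:
            best, best_cost = actions[i], player_cost(i, actions, prefs)
            for lane in range(n_lanes):
                trial = actions[:i] + [lane] + actions[i + 1:]
                cost = player_cost(i, trial, prefs)
                if cost < best_cost:  # strict improvement only, so sweeps terminate
                    best, best_cost = lane, cost
            if best != actions[i]:
                actions[i] = best  # later players see this update immediately
                changed = True
        if not changed:
            break
    return actions


def is_nash(actions, prefs, n_lanes):
    """True if no player can lower its own cost by changing lanes unilaterally."""
    for i in range(len(actions)):
        current = player_cost(i, actions, prefs)
        for lane in range(n_lanes):
            trial = actions[:i] + [lane] + actions[i + 1:]
            if player_cost(i, trial, prefs) < current:
                return False
    return True
```

Because each strict improvement by one player also lowers a shared potential in this toy game, the sweeps terminate at a pure Nash equilibrium. The `order` argument stands in for the optimization sequence, which the abstract reports as affecting efficiency.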
SPIBOT: A Drone-Tethered Mobile Gripper for Robust Aerial Object Retrieval in Dynamic Environments
In real-world field operations, aerial grasping systems face significant challenges in dynamic environments due to strong winds, shifting surfaces, and the need to handle heavy loads. Particularly when dealing with heavy objects, the powerful propellers of the drone can inadvertently blow the target object away as it approaches, making the task even more difficult. To address these challenges, we introduce SPIBOT, a novel drone-tethered mobile gripper system designed for robust and stable autonomous target retrieval. SPIBOT operates via a tether, much like a spider, allowing the drone to maintain a safe distance from the target. To ensure both stable mobility and secure grasping capabilities, SPIBOT is equipped with six legs and sensors to estimate the robot's and mission's states. It is designed with a reduced volume and weight compared to other hexapod robots, allowing it to be easily stowed under the drone and reeled in as needed. Designed for the 2024 MBZIRC Maritime Grand Challenge, SPIBOT is built to retrieve a 1kg target object in the highly dynamic conditions of the moving deck of a ship. This system integrates a real-time action selection algorithm that dynamically adjusts the robot's actions based on proximity to the mission goal and environmental conditions, enabling rapid and robust mission execution. Experimental results across various terrains, including a pontoon on a lake, a grass field, and rubber mats on coastal sand, demonstrate SPIBOT's ability to efficiently and reliably retrieve targets. SPIBOT swiftly converges on the target and completes its mission, even when dealing with irregular initial states and noisy information introduced by the drone.
CloudTrack: Scalable UAV Tracking with Cloud Semantics
Nowadays, unmanned aerial vehicles (UAVs) are commonly used in search and rescue scenarios to gather information in the search area. The automatic identification of the person searched for in aerial footage could increase the autonomy of such systems, reduce the search time, and thus increase the missing person's chances of survival. In this paper, we present a novel approach to perform semantically conditioned open vocabulary object tracking that is specifically designed to cope with the limitations of UAV hardware. Our approach has several advantages. It can run with verbal descriptions of the missing person (e.g., the color of the shirt), does not require dedicated training to execute the mission, and can efficiently track a potentially moving person. Our experimental results demonstrate the versatility and efficacy of our approach.
comment: 7 pages, 3 figures
Real-time Planning of Minimum-time Trajectories for Agile UAV Flight
We address the challenge of real-time planning of minimum-time trajectories over multiple waypoints, onboard multirotor UAVs. Previous works demonstrated that achieving a truly time-optimal trajectory is computationally too demanding to enable frequent replanning during agile flight, especially on less powerful flight computers. Our approach overcomes this stumbling block by utilizing a point-mass model with a novel iterative thrust decomposition algorithm, enabling the UAV to use all of its collective thrust, something previous point-mass approaches could not achieve. The approach enables gravity and drag modeling integration, significantly reducing tracking errors in high-speed trajectories, as shown through an ablation study. When combined with a new multi-waypoint optimization algorithm, which uses a gradient-based method to converge to optimal velocities in waypoints, the proposed method generates minimum-time multi-waypoint trajectories within milliseconds. The proposed approach, which we provide as an open-source package, is validated both in simulation and in the real world, using Nonlinear Model Predictive Control. With accelerations of up to 3.5g and speeds over 100 km/h, trajectories generated by the proposed method yield similar or even smaller tracking errors than the trajectories generated for a full multirotor model.
Open-World Object Detection with Instance Representation Learning
While humans naturally identify novel objects and understand their relationships, deep learning-based object detectors struggle to detect and relate objects that are not observed during training. To overcome this issue, Open World Object Detection (OWOD) has been introduced to enable models to detect unknown objects in open-world scenarios. However, OWOD methods fail to capture the fine-grained relationships between detected objects, which are crucial for comprehensive scene understanding and applications such as class discovery and tracking. In this paper, we propose a method to train an object detector that can both detect novel objects and extract semantically rich features in open-world conditions by leveraging the knowledge of Vision Foundation Models (VFM). We first utilize the semantic masks from the Segment Anything Model to supervise the box regression of unknown objects, ensuring accurate localization. By transferring the instance-wise similarities obtained from the VFM features to the detector's instance embeddings, our method then learns a semantically rich feature space of these embeddings. Extensive experiments show that our method learns a robust and generalizable feature space, outperforming other OWOD-based feature extraction methods. Additionally, we demonstrate that the enhanced feature from our model increases the detector's applicability to tasks such as open-world tracking.
comment: Our project website can be found at https://sunohlee.github.io/OWODRep/
Whole-body end-effector pose tracking
Combining manipulation with the mobility of legged robots is essential for a wide range of robotic applications. However, integrating an arm with a mobile base significantly increases the system's complexity, making precise end-effector control challenging. Existing model-based approaches are often constrained by their modeling assumptions, leading to limited robustness. Meanwhile, recent Reinforcement Learning (RL) implementations restrict the arm's workspace to be in front of the robot or track only the position to obtain decent tracking accuracy. In this work, we address these limitations by introducing a whole-body RL formulation for end-effector pose tracking in a large workspace on rough, unstructured terrains. Our proposed method involves a terrain-aware sampling strategy for the robot's initial configuration and end-effector pose commands, as well as a game-based curriculum to extend the robot's operating range. We validate our approach on the ANYmal quadrupedal robot with a six DoF robotic arm. Through our experiments, we show that the learned controller achieves precise command tracking over a large workspace and adapts across varying terrains such as stairs and slopes. On deployment, it achieves a pose-tracking error of 2.64 cm and 3.64 degrees, outperforming existing competitive baselines.
RTAGrasp: Learning Task-Oriented Grasping from Human Videos via Retrieval, Transfer, and Alignment
Task-oriented grasping (TOG) is crucial for robots to accomplish manipulation tasks, requiring the determination of TOG positions and directions. Existing methods either rely on costly manual TOG annotations or only extract coarse grasping positions or regions from human demonstrations, limiting their practicality in real-world applications. To address these limitations, we introduce RTAGrasp, a Retrieval, Transfer, and Alignment framework inspired by human grasping strategies. Specifically, our approach first effortlessly constructs a robot memory from human grasping demonstration videos, extracting both TOG position and direction constraints. Then, given a task instruction and a visual observation of the target object, RTAGrasp retrieves the most similar human grasping experience from its memory and leverages semantic matching capabilities of vision foundation models to transfer the TOG constraints to the target object in a training-free manner. Finally, RTAGrasp aligns the transferred TOG constraints with the robot's action for execution. Evaluations on the public TOG benchmark, TaskGrasp dataset, show the competitive performance of RTAGrasp on both seen and unseen object categories compared to existing baseline methods. Real-world experiments further validate its effectiveness on a robotic arm. Our code, appendix, and video are available at https://sites.google.com/view/rtagrasp/home.
MHRC: Closed-loop Decentralized Multi-Heterogeneous Robot Collaboration with Large Language Models
The integration of large language models (LLMs) with robotics has significantly advanced robots' abilities in perception, cognition, and task planning. The use of natural language interfaces offers a unified approach for expressing the capability differences of heterogeneous robots, facilitating communication between them, and enabling seamless task allocation and collaboration. Currently, the utilization of LLMs to achieve decentralized multi-heterogeneous robot collaborative tasks remains an under-explored area of research. In this paper, we introduce a novel framework that utilizes LLMs to achieve decentralized collaboration among multiple heterogeneous robots. Our framework supports three robot categories, mobile robots, manipulation robots, and mobile manipulation robots, working together to complete tasks such as exploration, transportation, and organization. We developed a rich set of textual feedback mechanisms and chain-of-thought (CoT) prompts to enhance task planning efficiency and overall system performance. The mobile manipulation robot can adjust its base position flexibly, ensuring optimal conditions for grasping tasks. The manipulation robot can comprehend task requirements, seek assistance when necessary, and handle objects appropriately. Meanwhile, the mobile robot can explore the environment extensively, map object locations, and communicate this information to the mobile manipulation robot, thus improving task execution efficiency. We evaluated the framework using PyBullet, creating scenarios with three different room layouts and three distinct operational tasks. We tested various LLM models and conducted ablation studies to assess the contributions of different modules. The experimental results confirm the effectiveness and necessity of our proposed framework.
AIR-Embodied: An Efficient Active 3DGS-based Interaction and Reconstruction Framework with Embodied Large Language Model
Recent advancements in 3D reconstruction and neural rendering have enhanced the creation of high-quality digital assets, yet existing methods struggle to generalize across varying object shapes, textures, and occlusions. While Next Best View (NBV) planning and Learning-based approaches offer solutions, they are often limited by predefined criteria and fail to manage occlusions with human-like common sense. To address these problems, we present AIR-Embodied, a novel framework that integrates embodied AI agents with large-scale pretrained multi-modal language models to improve active 3DGS reconstruction. AIR-Embodied utilizes a three-stage process: understanding the current reconstruction state via multi-modal prompts, planning tasks with viewpoint selection and interactive actions, and employing closed-loop reasoning to ensure accurate execution. The agent dynamically refines its actions based on discrepancies between the planned and actual outcomes. Experimental evaluations across virtual and real-world environments demonstrate that AIR-Embodied significantly enhances reconstruction efficiency and quality, providing a robust solution to challenges in active 3D reconstruction.
PRESTO: Fast motion planning using diffusion models based on key-configuration environment representation ICRA 2025
We introduce a learning-guided motion planning framework that provides initial seed trajectories using a diffusion model for trajectory optimization. Given a workspace, our method approximates the configuration space (C-space) obstacles through a key-configuration representation that consists of a sparse set of task-related key configurations, and uses this as an input to the diffusion model. The diffusion model integrates regularization terms that encourage collision avoidance and smooth trajectories during training, and trajectory optimization refines the generated seed trajectories to further correct any colliding segments. Our experimental results demonstrate that using high-quality trajectory priors, learned through our C-space-grounded diffusion model, enables efficient generation of collision-free trajectories in narrow-passage environments, outperforming prior learning- and planning-based baselines. Videos and additional materials can be found on the project page: https://kiwi-sherbet.github.io/PRESTO.
comment: Submitted to ICRA 2025
CrowdSurfer: Sampling Optimization Augmented with Vector-Quantized Variational AutoEncoder for Dense Crowd Navigation
Navigation amongst densely packed crowds remains a challenge for mobile robots. The complexity increases further if the environment layout changes, making the prior computed global plan infeasible. In this paper, we show that it is possible to dramatically enhance crowd navigation by just improving the local planner. Our approach combines generative modelling with inference time optimization to generate sophisticated long-horizon local plans at interactive rates. More specifically, we train a Vector Quantized Variational AutoEncoder to learn a prior over the expert trajectory distribution conditioned on the perception input. At run-time, this is used as an initialization for a sampling-based optimizer for further refinement. Our approach does not require any sophisticated prediction of dynamic obstacles and yet provides state-of-the-art performance. In particular, we compare against the recent DRL-VO approach and show a 40% improvement in success rate and a 6% improvement in travel time.
Investigating the Impact of Trust in Multi-Human Multi-Robot Task Allocation
Trust is essential in human-robot collaboration, even more so in multi-human multi-robot teams, where trust is vital to ensure teaming cohesion in complex operational environments. Yet, at the moment, trust is rarely considered a factor during task allocation and reallocation in algorithms used in multi-human, multi-robot collaboration contexts. Prior work on trust in single-human-robot interaction has identified that including trust as a parameter in human-robot interaction significantly improves both performance outcomes and human experience with robotic systems. However, very little research has explored the impact of trust in multi-human multi-robot collaboration, specifically in the context of task allocation. In this paper, we introduce a new trust model, the Expectation Comparison Trust (ECT) model, and employ it with three trust models from prior work and a baseline no-trust model to investigate the impact of trust on task allocation outcomes in multi-human multi-robot collaboration. Our experiment involved different team configurations: 2 humans and 2 robots, 5 humans and 5 robots, and 10 humans and 10 robots. Results showed that using trust-based models generally led to better task allocation outcomes in larger teams (10 humans and 10 robots) than in smaller teams. We discuss the implications of our findings and provide recommendations for future work on integrating trust as a variable for task allocation in multi-human, multi-robot collaboration.
Overcoming Reward Model Noise in Instruction-Guided Reinforcement Learning
Vision-language models (VLMs) have gained traction as auxiliary reward models to provide more informative reward signals in sparse reward environments. However, our work reveals a critical vulnerability of this method: a small amount of noise in the reward signal can severely degrade agent performance. In challenging environments with sparse rewards, we show that reinforcement learning agents using VLM-based reward models without proper noise handling perform worse than agents relying solely on exploration-driven methods. We hypothesize that false positive rewards -- where the reward model incorrectly assigns rewards to trajectories that do not fulfill the given instruction -- are more detrimental to learning than false negatives. Our analysis confirms this hypothesis, revealing that the widely used cosine similarity metric, when applied to comparing agent trajectories and language instructions, is prone to generating false positive reward signals. To address this, we introduce BiMI (Binary Mutual Information), a novel noise-resilient reward function. Our experiments demonstrate that BiMI significantly boosts agent performance, with an average improvement ratio of 44.5% across diverse environments with learned, non-oracle VLMs, thereby making VLM-based reward models practical for real-world applications.
comment: 9 main body pages, 7 appendix pages
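The false-positive concern raised in the abstract above can be illustrated with a small sketch of a cosine-similarity reward. The embedding vectors below are made up for illustration and do not come from any VLM, and the function names are hypothetical; this sketch shows only the similarity metric the abstract criticizes and is not an implementation of BiMI.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_reward(trajectory_embedding, instruction_embedding):
    """Dense reward equal to the cosine similarity of the two embeddings."""
    return cosine_similarity(trajectory_embedding, instruction_embedding)


# Hypothetical embeddings (assumed values, for illustration only).
instruction = [1.0, 1.0, 0.0]   # e.g. an instruction with two required sub-goals
completed = [1.0, 1.0, 0.0]     # trajectory that fulfils both sub-goals
partial = [1.0, 0.6, 0.1]       # trajectory that only partly fulfils the second

reward_completed = similarity_reward(completed, instruction)
reward_partial = similarity_reward(partial, instruction)
```

In this toy setup the partial trajectory still scores above 0.9 even though it does not fulfill the instruction, which is the kind of false-positive reward signal the abstract describes.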
Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning
Multi-UAV pursuit-evasion, where pursuers aim to capture evaders, poses a key challenge for UAV swarm intelligence. Multi-agent reinforcement learning (MARL) has demonstrated potential in modeling cooperative behaviors, but most RL-based approaches remain constrained to simplified simulations with limited dynamics or fixed scenarios. Previous attempts to deploy RL policies to real-world pursuit-evasion are largely restricted to two-dimensional scenarios, such as ground vehicles or UAVs at fixed altitudes. In this paper, we address multi-UAV pursuit-evasion by considering UAV dynamics and physical constraints. We introduce an evader prediction-enhanced network to tackle partial observability in cooperative strategy learning. Additionally, we propose an adaptive environment generator within MARL training, enabling higher exploration efficiency and better policy generalization across diverse scenarios. Simulations show our method significantly outperforms all baselines in challenging scenarios, generalizing to unseen scenarios with a 100% capture rate. Finally, we derive a feasible policy via a two-stage reward refinement and deploy the policy on real quadrotors in a zero-shot manner. To our knowledge, this is the first work to derive and deploy an RL-based policy using collective thrust and body rates control commands for multi-UAV pursuit-evasion in unknown environments. The open-source code and videos are available at https://sites.google.com/view/pursuit-evasion-rl.
BeSimulator: A Large Language Model Powered Text-based Behavior Simulator
Traditional robot simulators focus on physical process modeling and realistic rendering, often suffering from high computational costs, inefficiencies, and limited adaptability. To handle this issue, we propose Behavior Simulation in robotics to emphasize checking the behavior logic of robots and achieving sufficient alignment between the outcome of robot actions and real scenarios. In this paper, we introduce BeSimulator, a modular and novel LLM-powered framework, as an attempt towards behavior simulation in the context of text-based environments. By constructing text-based virtual environments and performing semantic-level simulation, BeSimulator can generalize across scenarios and achieve long-horizon complex simulation. Inspired by human cognition processes, it employs a "consider-decide-capture-transfer" methodology, termed Chain of Behavior Simulation, which excels at analyzing action feasibility and state transitions. Additionally, BeSimulator incorporates code-driven reasoning to enable arithmetic operations and enhance reliability, and integrates reflective feedback to refine simulation. Based on our manually constructed behavior-tree-based simulation benchmark BTSIMBENCH, our experiments show a significant performance improvement in behavior simulation compared to baselines, ranging from 14.7% to 26.6%.
comment: 7 pages, 3 figures, 2 tables
Distance-based Multiple Non-cooperative Ground Target Encirclement for Complex Environments
This paper proposes a comprehensive strategy for complex multi-target-multi-drone encirclement in an obstacle-rich and GPS-denied environment, motivated by practical scenarios such as pursuing vehicles or humans in urban canyons. The drones have omnidirectional range sensors that can robustly detect ground targets and obtain noisy relative distances. After each drone task is assigned, a novel distance-based target state estimator (DTSE) is proposed by estimating the measurement output noise variance and utilizing the Kalman filter. By integrating anti-synchronization techniques and pseudo-force functions, an acceleration controller enables two tasking drones to cooperatively encircle a target from opposing positions while navigating obstacles. The algorithm's effectiveness for the discrete-time double-integrator system is established theoretically, particularly regarding observability. Moreover, the versatility of the algorithm is showcased in aerial-to-ground scenarios, supported by compelling simulation results. Experimental validation demonstrates the effectiveness of the proposed approach.
TiltXter: CNN-based Electro-tactile Rendering of Tilt Angle for Telemanipulation of Pasteur Pipettes
The shape of deformable objects can change drastically during grasping by robotic grippers, causing an ambiguous perception of their alignment and hence resulting in errors in robot positioning and telemanipulation. Rendering clear tactile patterns is fundamental to increasing users' precision and dexterity through tactile haptic feedback during telemanipulation. Therefore, different methods have to be studied to decode the sensors' data into haptic stimuli. This work presents a telemanipulation system for plastic pipettes that consists of a Force Dimension Omega.7 haptic interface endowed with two electro-stimulation arrays and two tactile sensor arrays embedded in the 2-finger Robotiq gripper. We propose a novel approach based on convolutional neural networks (CNN) to detect the tilt of deformable objects. The CNN generates a tactile pattern based on recognized tilt data to render further electro-tactile stimuli provided to the user during the telemanipulation. The study has shown that using the CNN algorithm, tilt recognition by users increased from 23.13% with the downsized data to 57.9%, and the success rate during teleoperation increased from 53.12% using the downsized data to 92.18% using the tactile patterns generated by the CNN.
comment: Manuscript accepted to IEEE Telepresence 2024. arXiv admin note: text overlap with arXiv:2204.03521 by other authors
A Ducted Fan UAV for Safe Aerial Grabbing and Transfer of Multiple Loads Using Electromagnets IROS 2024
In recent years, research on aerial grasping, manipulation, and transportation of objects has garnered significant attention. These tasks often require UAVs to operate safely close to environments or objects and to efficiently grasp payloads. However, current widely adopted flying platforms pose safety hazards: unprotected high-speed rotating propellers can cause harm to the surroundings. Additionally, the space for carrying payloads on the fuselage is limited, and the restricted position of the payload also hinders efficient grasping. To address these issues, this paper presents a coaxial ducted fan UAV which is equipped with electromagnets mounted externally on the fuselage, enabling safe grasping and transfer of multiple loads in midair without complex additional actuators. It also has the capability to achieve direct human-UAV cargo transfer in the air. The forces acting on the loads during magnetic attachment and their influencing factors were analyzed. An ADRC controller is utilized to counteract disturbances during grasping and achieve attitude control. Finally, flight tests are conducted to verify the UAV's ability to directly grasp multiple loads from human hands in flight while maintaining attitude tracking.
comment: 8 pages, 13 figures, accepted by IROS 2024. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Intention-based and Risk-Aware Trajectory Prediction for Autonomous Driving in Complex Traffic Scenarios
Accurately predicting the trajectory of surrounding vehicles is a critical challenge for autonomous vehicles. In complex traffic scenarios, there are two significant issues with the current autonomous driving system: the cognitive uncertainty of prediction and the lack of risk awareness, which limit the further development of autonomous driving. To address this challenge, we introduce a novel trajectory prediction model that incorporates insights and principles from driving behavior, ethical decision-making, and risk assessment. Based on joint prediction, our model consists of interaction, intention, and risk assessment modules. The dynamic variation of interaction between vehicles can be comprehensively captured at each timestamp in the interaction module. Based on interaction information, our model considers primary intentions for vehicles to enhance the diversity of trajectory generation. The optimization of predicted trajectories follows the advanced risk-aware decision-making principles. Experimental results are evaluated on the DeepAccident dataset; our approach shows its remarkable prediction performance on normal and accident scenarios and outperforms the state-of-the-art algorithms by at least 28.9% and 26.5%, respectively. The proposed model improves the proficiency and adaptability of trajectory prediction in complex traffic scenarios. The code for the proposed model is available at https://sites.google.com/view/ir-prediction.
A Computer Vision Approach for Autonomous Cars to Drive Safe at Construction Zone
To build a smarter and safer city, a secure, efficient, and sustainable transportation system is a key requirement. The autonomous driving system (ADS) plays an important role in the development of smart transportation and is considered one of the major challenges facing the automotive sector in recent decades. A car equipped with an ADS comes with various cutting-edge functionalities such as adaptive cruise control, collision alerts, automated parking, and more. A primary area of research within ADAS involves identifying road obstacles in construction zones regardless of the driving environment. This paper presents an innovative and highly accurate road obstacle detection model utilizing computer vision technology that can be activated in construction zones and functions under diverse drift conditions, ultimately contributing to building a safer road transportation system. The model developed with the YOLO framework achieved a mean average precision exceeding 94% and demonstrated an inference time of 1.6 milliseconds on the validation dataset, underscoring the robustness of the methodology applied to mitigate hazards and risks for autonomous vehicles.
comment: 6 Pages, Double columns
Development of Bidirectional Series Elastic Actuator with Torsion Coil Spring and Implementation to the Legged Robot
Many studies have been conducted on Series Elastic Actuators (SEA) for robot joints because they are effective in terms of flexibility, safety, and energy efficiency. The ability of SEA to robustly handle unexpected disturbances has raised expectations for practical applications in environments where robots interact with humans. On the other hand, the development and commercialization of small robots for indoor entertainment applications is also actively underway, and it is thought that by using SEA in these robots, dynamic movements such as jumping and running can be realized. In this work, we developed a small and lightweight SEA using coil springs as elastic elements. By devising a method for fixing the coil spring, it is possible to absorb shock and perform highly accurate force measurement in both rotational directions with a simple structure. In addition, to verify the effectiveness of the developed SEA, we created a small single-legged robot with SEA implemented in the three joints of the hip, knee, and ankle, and we conducted a drop test. By adjusting the initial posture and control gain of each joint, we confirmed that flexible landing and continuous hopping are possible with simple PD position control. The measurement results showed that SEA is effective in terms of shock absorption and energy reuse. This work was performed for research purposes only.
comment: 6 pages
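As a rough illustration of the two ingredients named in the abstract above (force measurement through the elastic element and simple PD position control), the following is a minimal sketch. It assumes an ideal linear torsion spring and a textbook PD law; the function names, gains, and stiffness value are hypothetical and are not taken from the paper.

```python
def spring_torque(k_spring, theta_motor, theta_joint):
    """Estimate joint torque from spring deflection (ideal linear spring).

    The sign of the result follows the sign of the deflection, so the same
    expression covers both rotational directions.
    """
    return k_spring * (theta_motor - theta_joint)


def pd_position_command(kp, kd, q_des, q, dq_des, dq):
    """Textbook PD position law: proportional on position error,
    derivative on velocity error."""
    return kp * (q_des - q) + kd * (dq_des - dq)


# Hypothetical values, for illustration only.
tau_cw = spring_torque(2.0, 0.5, 0.25)    # positive deflection
tau_ccw = spring_torque(2.0, 0.25, 0.5)   # negative deflection
u = pd_position_command(10.0, 1.0, 1.0, 0.5, 0.0, 0.5)
```

Lowering the hypothetical `kp` makes the joint softer on impact, which is one plausible reading of "adjusting the control gain of each joint" for flexible landing.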
Improving behavior profile discovery for vehicles
Multiple approaches have already been proposed to mimic real driver behaviors in simulation. This article proposes a new one, based solely on the exploration of undisturbed observation of intersections. From them, the behavior profiles for each macro-maneuver will be discovered. Using the macro-maneuvers already identified in previous works, a comparison method between trajectories with different lengths using an Extended Kalman Filter (EKF) is proposed, which, combined with an Expectation-Maximization (EM) inspired method, defines the different clusters that represent the behaviors observed. This is also paired with a Kullback-Leibler (KL) divergence criterion to define when the clusters need to be split or merged. Finally, the behaviors for each macro-maneuver are determined by each cluster discovered, without using any map information about the environment and being dynamically consistent with vehicle motion. By observation it becomes clear that the two main factors for drivers' behavior are their assertiveness and interaction with other road users.
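To make the idea of a KL-based merge criterion concrete, here is a minimal sketch. It assumes, purely for illustration, that each cluster is summarized by a one-dimensional Gaussian (mean, standard deviation) and that two clusters are merged when their symmetrized KL divergence falls below a hypothetical threshold. The article's actual cluster model and criterion may differ.

```python
import math


def kl_gaussian_1d(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(p || q) for two 1D Gaussians."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)


def should_merge(cluster_a, cluster_b, threshold):
    """Merge when the symmetrized KL divergence is below the threshold.

    Each cluster is a (mean, std) tuple; the threshold is a tuning
    parameter assumed here, not a value from the article.
    """
    sym_kl = (kl_gaussian_1d(*cluster_a, *cluster_b)
              + kl_gaussian_1d(*cluster_b, *cluster_a))
    return sym_kl < threshold


near = should_merge((0.0, 1.0), (0.1, 1.0), threshold=0.5)  # close clusters
far = should_merge((0.0, 1.0), (5.0, 1.0), threshold=0.5)   # distant clusters
```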
AnyCar to Anywhere: Learning Universal Dynamics Model for Agile and Adaptive Mobility ICRA 2025
Recent works in the robot learning community have successfully introduced generalist models capable of controlling various robot embodiments across a wide range of tasks, such as navigation and locomotion. However, achieving agile control, which pushes the limits of robotic performance, still relies on specialist models that require extensive parameter tuning. To leverage generalist-model adaptability and flexibility while achieving specialist-level agility, we propose AnyCar, a transformer-based generalist dynamics model designed for agile control of various wheeled robots. To collect training data, we unify multiple simulators and leverage different physics backends to simulate vehicles with diverse sizes, scales, and physical properties across various terrains. With robust training and real-world fine-tuning, our model enables precise adaptation to different vehicles, even in the wild and under large state estimation errors. In real-world experiments, AnyCar shows both few-shot and zero-shot generalization across a wide range of vehicles and environments, where our model, combined with a sampling-based MPC, outperforms specialist models by up to 54%. These results represent a key step toward building a foundation model for agile wheeled robot control. We will also open-source our framework to support further research.
comment: Paper website: https://lecar-lab.github.io/anycar/. The first two authors have equal contribution. Under review at ICRA 2025
A Robust, Task-Agnostic and Fully-Scalable Voxel Mapping System for Large Scale Environments
Perception remains a challenging problem for autonomous navigation in unknown environments, especially for aerial vehicles. Most mapping algorithms for autonomous navigation are designed specifically for their intended task, which hinders extended usage or cooperative tasks. In this paper, we propose a voxel mapping system that can build an adaptable map for multiple tasks. The system employs a hash table-based map structure and manages each voxel with spatial and temporal priorities without an explicit map boundary. We also introduce an efficient map-sharing feature with minimal bandwidth to enable multi-agent applications. We tested the system in real-world and simulation environments by applying it to various tasks including local mapping, global mapping, cooperative multi-agent navigation, and high-speed navigation. Our system proved its capability to build a customizable map with high resolution, wide coverage, and real-time performance regardless of sensor and environment. The system can build a full-resolution map using the map-sharing feature, with over 95% bandwidth reduction from raw sensor data.
comment: 8 pages, 6 figures, 3 tables
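A hash table-based voxel map without an explicit boundary can be sketched as follows. This is a simplified illustration, not the paper's data structure: it keys a Python dict by integer voxel indices and models only a temporal priority (evicting the least recently updated voxel when a hypothetical capacity is exceeded). The class name, `max_voxels`, and the eviction rule are assumptions.

```python
import math


class HashVoxelMap:
    """Unbounded voxel map keyed by integer voxel indices."""

    def __init__(self, resolution, max_voxels):
        self.resolution = resolution
        self.max_voxels = max_voxels
        self.voxels = {}  # (i, j, k) -> timestamp of last update

    def key(self, x, y, z):
        # floor() keeps negative coordinates in the correct voxel.
        r = self.resolution
        return (math.floor(x / r), math.floor(y / r), math.floor(z / r))

    def insert(self, point, stamp):
        self.voxels[self.key(*point)] = stamp
        if len(self.voxels) > self.max_voxels:
            # Temporal priority only: drop the stalest voxel.
            oldest = min(self.voxels, key=self.voxels.get)
            del self.voxels[oldest]

    def occupied(self, point):
        return self.key(*point) in self.voxels


vmap = HashVoxelMap(resolution=0.5, max_voxels=2)
vmap.insert((0.1, 0.1, 0.1), stamp=1.0)
vmap.insert((-0.1, 0.0, 0.0), stamp=2.0)
vmap.insert((10.0, 0.0, 0.0), stamp=3.0)  # exceeds capacity, evicts stamp 1.0
```

Because keys are computed on demand, the map can grow in any direction without preallocating a bounding volume.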
Bi-Level Belief Space Search for Compliant Part Mating Under Uncertainty
The problem of mating two parts with low clearance remains difficult for autonomous robots. We present bi-level belief assembly (BILBA), a model-based planner that computes a sequence of compliant motions which can leverage contact with the environment to reduce uncertainty and perform challenging assembly tasks with low clearance. Our approach is based on first deriving candidate contact schedules from the structure of the configuration space obstacle of the parts and then finding compliant motions that achieve the desired contacts. We demonstrate that BILBA can efficiently compute robust plans on multiple simulated tasks as well as a real robot rectangular peg-in-hole insertion task.
Stage-Wise Reward Shaping for Acrobatic Robots: A Constrained Multi-Objective Reinforcement Learning Approach
As the complexity of tasks addressed through reinforcement learning (RL) increases, the definition of reward functions has also become highly complicated. We introduce an RL method aimed at simplifying the reward-shaping process through intuitive strategies. Initially, instead of a single reward function composed of various terms, we define multiple reward and cost functions within a constrained multi-objective RL (CMORL) framework. For tasks involving sequential complex movements, we segment the task into distinct stages and define multiple rewards and costs for each stage. Finally, we introduce a practical CMORL algorithm that maximizes objectives based on these rewards while satisfying constraints defined by the costs. The proposed method has been successfully demonstrated across a variety of acrobatic tasks in both simulation and real-world environments. Additionally, it has been shown to perform tasks more successfully than existing RL and constrained RL algorithms. Our code is available at https://github.com/rllab-snu/Stage-Wise-CMORL.
comment: 7 pages
SoMaSLAM: 2D Graph SLAM for Sparse Range Sensing with Soft Manhattan World Constraints
We propose a graph SLAM algorithm for sparse range sensing that incorporates a soft Manhattan world utilizing landmark-landmark constraints. Sparse range sensing is necessary for tiny robots that do not have the luxury of using heavy and expensive sensors. Existing SLAM methods dealing with sparse range sensing lack accuracy and accumulate drift error over time due to limited access to data points. Algorithms that cover this flaw using structural regularities, such as the Manhattan world (MW), have shortcomings when mapping real-world environments that do not coincide with the rules. We propose SoMaSLAM, a 2D graph SLAM designed for tiny robots with sparse range sensing. Our approach effectively maps sparse range data without enforcing strict structural regularities and maintains an adaptive graph. We implement the MW assumption as soft constraints, which we refer to as a soft Manhattan world. We propose novel soft landmark-landmark constraints to incorporate the soft MW into graph SLAM. Through extensive evaluation, we demonstrate that our proposed SoMaSLAM method improves localization accuracy on diverse datasets and is flexible enough to be used in the real world. We release our source code and sparse range datasets at https://SoMaSLAM.github.io/.
comment: 7 pages including references, 11 figures
Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving
The autoregressive world model exhibits robust generalization capabilities in vectorized scene understanding but encounters difficulties in deriving actions due to insufficient uncertainty modeling and self-delusion. In this paper, we explore the feasibility of deriving decisions from an autoregressive world model by addressing these challenges through the formulation of multiple probabilistic hypotheses. We propose LatentDriver, a framework that models the environment's next states and the ego vehicle's possible actions as a mixture distribution, from which a deterministic control signal is then derived. By incorporating mixture modeling, the stochastic nature of decision-making is captured. Additionally, the self-delusion problem is mitigated by providing intermediate actions sampled from a distribution to the world model. Experimental results on the recently released closed-loop benchmark Waymax demonstrate that LatentDriver surpasses state-of-the-art reinforcement learning and imitation learning methods, achieving expert-level performance. The code and models will be made available at https://github.com/Sephirex-X/LatentDriver.
Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC ICRA
This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior works that combine high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by integrating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we train an RL agent to solve the pose reaching task in simulation, then incorporate the trained neural network critic as both the stage and terminal cost of an MPC. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real wheel loader, where we successfully navigate to various goal poses. In contrast, the RL actor risked damaging the machine and was unsuitable for real-world use.
comment: Submitted to International Conference on Robotics and Automation (ICRA) 2025
Autotuning Bipedal Locomotion MPC with GRFM-Net for Efficient Sim-to-Real Transfer
Bipedal locomotion control is essential for humanoid robots to navigate complex, human-centric environments. While optimization-based control designs are popular for integrating sophisticated models of humanoid robots, they often require labor-intensive manual tuning. In this work, we address the challenges of parameter selection in bipedal locomotion control using DiffTune, a model-based autotuning method that leverages differential programming for efficient parameter learning. A major difficulty lies in balancing model fidelity with differentiability. We address this difficulty using a low-fidelity model for differentiability, enhanced by a Ground Reaction Force-and-Moment Network (GRFM-Net) to capture discrepancies between MPC commands and actual control effects. We validate the parameters learned by DiffTune with GRFM-Net in hardware experiments, which demonstrate the parameters' optimality in a multi-objective setting compared with baseline parameters, reducing the total loss by up to 40.5% compared with the expert-tuned parameters. The results confirm the GRFM-Net's effectiveness in mitigating the sim-to-real gap, improving the transferability of simulation-learned parameters to real hardware.
Walking with Terrain Reconstruction: Learning to Traverse Risky Sparse Footholds
Traversing risky terrains with sparse footholds presents significant challenges for legged robots, requiring precise foot placement in safe areas. Current learning-based methods often rely on implicit feature representations without supervising physically significant estimation targets. This limits the policy's ability to fully understand complex terrain structures, which is critical for generating accurate actions. In this paper, we utilize end-to-end reinforcement learning to traverse risky terrains with high sparsity and randomness. Our approach integrates proprioception with single-view depth images to reconstruct the robot's local terrain, enabling a more comprehensive representation of terrain information. Meanwhile, by incorporating implicit and explicit estimations of the robot's state and its surroundings, we improve the policy's environmental understanding, leading to more precise actions. We deploy the proposed framework on a low-cost quadrupedal robot, achieving agile and adaptive locomotion across various challenging terrains and demonstrating outstanding performance in real-world scenarios. Video at: http://youtu.be/ReQAR4D6tuc.
Safe Navigation for Robotic Digestive Endoscopy via Human Intervention-based Reinforcement Learning
With the increasing application of automated robotic digestive endoscopy (RDE), ensuring safe and efficient navigation in the unstructured and narrow digestive tract has become a critical challenge. Existing automated reinforcement learning navigation algorithms often result in potentially risky collisions due to the absence of essential human intervention, which significantly limits the safety and effectiveness of RDE in actual clinical practice. To address this limitation, we propose a Human Intervention (HI)-based Proximal Policy Optimization (PPO) framework, dubbed HI-PPO, which incorporates expert knowledge to enhance RDE's safety. Specifically, we introduce an Enhanced Exploration Mechanism (EEM) to address the low exploration efficiency of the standard PPO. Additionally, a reward-penalty adjustment (RPA) is implemented to penalize unsafe actions during initial interventions. Furthermore, Behavior Cloning Similarity (BCS) is included as an auxiliary objective to ensure the agent emulates expert actions. Comparative experiments conducted in a simulated platform across various anatomical colon segments demonstrate that our model effectively and safely guides RDE.
SYNERGAI: Perception Alignment for Human-Robot Collaboration
Recently, large language models (LLMs) have shown strong potential in facilitating human-robot interaction and collaboration. However, existing LLM-based systems often overlook the misalignment between human and robot perceptions, which hinders their effective communication and real-world robot deployment. To address this issue, we introduce SYNERGAI, a unified system designed to achieve both perceptual alignment and human-robot collaboration. At its core, SYNERGAI employs a 3D Scene Graph (3DSG) as its explicit and innate representation. This enables the system to leverage an LLM to break down complex tasks and allocate appropriate tools in intermediate steps to extract relevant information from the 3DSG, modify its structure, or generate responses. Importantly, SYNERGAI incorporates an automatic mechanism that enables perceptual misalignment correction with users by updating its 3DSG through online interaction. SYNERGAI achieves comparable performance with the data-driven models in ScanQA in a zero-shot manner. Through comprehensive experiments across 10 real-world scenes, SYNERGAI demonstrates its effectiveness in establishing common ground with humans, realizing a success rate of 61.9% in alignment tasks. It also significantly improves the success rate from 3.7% to 45.68% on novel tasks by transferring the knowledge acquired during alignment.
comment: Project page: https://synerg-ai.github.io
Autonomous Hiking Trail Navigation via Semantic Segmentation and Geometric Analysis
Natural environments pose significant challenges for autonomous robot navigation, particularly due to their unstructured and ever-changing nature. Hiking trails, with their dynamic conditions influenced by weather, vegetation, and human traffic, represent one such challenge. This work introduces a novel approach to autonomous hiking trail navigation that balances trail adherence with the flexibility to adapt to off-trail routes when necessary. The solution is a Traversability Analysis module that integrates semantic data from camera images with geometric information from LiDAR to create a comprehensive understanding of the surrounding terrain. A planner uses this traversability map to navigate safely, adhering to trails while allowing off-trail movement when necessary to avoid on-trail hazards or for safe off-trail shortcuts. The method is evaluated through simulation to determine the balance between semantic and geometric information in traversability estimation. These simulations tested various weights to assess their impact on navigation performance across different trail scenarios. Weights were then validated through field tests at the West Virginia University Core Arboretum, demonstrating the method's effectiveness in a real-world environment.
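The weighting between semantic and geometric information described above can be pictured as a per-cell convex combination of two cost grids. The sketch below is only an assumed form of that fusion: the function name, the single scalar weight, and the convention that both grids hold values in [0, 1] are illustrative choices, not details from the paper.

```python
def fuse_traversability(semantic, geometric, w_semantic):
    """Blend two equally sized 2D grids cell by cell.

    semantic, geometric: lists of rows with scores in [0, 1]
    w_semantic: weight on the semantic grid; the geometric grid
                receives (1 - w_semantic).
    """
    if not 0.0 <= w_semantic <= 1.0:
        raise ValueError("w_semantic must lie in [0, 1]")
    w_geometric = 1.0 - w_semantic
    return [
        [w_semantic * s + w_geometric * g for s, g in zip(s_row, g_row)]
        for s_row, g_row in zip(semantic, geometric)
    ]


semantic_grid = [[1.0, 0.0]]   # e.g. camera says: trail, not trail
geometric_grid = [[0.0, 1.0]]  # e.g. LiDAR says: rough, flat
balanced = fuse_traversability(semantic_grid, geometric_grid, 0.5)
semantic_only = fuse_traversability(semantic_grid, geometric_grid, 1.0)
```

Sweeping `w_semantic` over a range of values mirrors, in miniature, the kind of weight study the abstract describes.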
ReLEP: A Novel Framework for Real-world Long-horizon Embodied Planning
Real-world long-horizon embodied planning underpins embodied AI. To accomplish long-horizon tasks, agents need to decompose abstract instructions into detailed steps. Prior works mostly rely on GPT-4V for task decomposition into predefined actions, which limits task diversity due to GPT-4V's finite understanding of larger skillsets. Therefore, we present ReLEP, a groundbreaking framework for Real world Long-horizon Embodied Planning, which can accomplish a wide range of daily tasks. At its core lies a fine-tuned large vision language model that formulates plans as sequences of skill functions according to input instruction and scene image. These functions are selected from a carefully designed skill library. ReLEP is also equipped with a Memory module for plan and status recall, and a Robot Configuration module for versatility across robot types. In addition, we propose a semi-automatic data generation pipeline to tackle dataset scarcity. Real-world off-line experiments across eight daily embodied tasks demonstrate that ReLEP is able to accomplish long-horizon embodied tasks and outperforms other state-of-the-art baseline methods.
SurgIRL: Towards Life-Long Learning for Surgical Automation by Incremental Reinforcement Learning
Surgical automation holds immense potential to improve the outcome and accessibility of surgery. Recent studies use reinforcement learning to learn policies that automate different surgical tasks. However, these policies are developed independently and are limited in their reusability when the task changes, making it more time-consuming when robots learn to solve multiple tasks. Inspired by how human surgeons build their expertise, we train surgical automation policies through Surgical Incremental Reinforcement Learning (SurgIRL). SurgIRL aims to (1) acquire new skills by referring to external policies (knowledge) and (2) accumulate and reuse these skills to solve multiple unseen tasks incrementally (incremental learning). Our SurgIRL framework includes three major components. We first define an expandable knowledge set containing heterogeneous policies that can be helpful for surgical tasks. Then, we propose Knowledge Inclusive Attention Network with mAximum Coverage Exploration (KIAN-ACE), which improves learning efficiency by maximizing the coverage of the knowledge set during the exploration process. Finally, we develop incremental learning pipelines based on KIAN-ACE to accumulate and reuse learned knowledge and solve multiple surgical tasks sequentially. Our simulation experiments show that KIAN-ACE efficiently learns to automate ten surgical tasks separately or incrementally. We also evaluate our learned policies on the da Vinci Research Kit (dVRK) and demonstrate successful sim-to-real transfers.
Dynamic Cloth Manipulation Considering Variable Stiffness and Material Change Using Deep Predictive Model with Parametric Bias
Dynamic manipulation of flexible objects such as fabric, which is difficult to model, is one of the major challenges in robotics. With the development of deep learning, we are beginning to see results in simulations and in some actual robots, but there are still many problems that have not yet been tackled. Humans can move their arms at high speed using their flexible bodies skillfully, and even when the material to be manipulated changes, they can manipulate the material after moving it several times and understanding its characteristics. Therefore, in this research, we focus on the following two points: (1) body control using a variable stiffness mechanism for more dynamic manipulation, and (2) response to changes in the material of the manipulated object using parametric bias. By incorporating these two approaches into a deep predictive model, we show through simulation and actual robot experiments that Musashi-W, a musculoskeletal humanoid with a variable stiffness mechanism, can dynamically manipulate cloth while detecting changes in the physical properties of the manipulated object.
comment: Accepted at Frontiers in Neurorobotics
NavRL: Learning Safe Flight in Dynamic Environments
Safe flight in dynamic environments requires autonomous unmanned aerial vehicles (UAVs) to make effective decisions when navigating cluttered spaces with moving obstacles. Traditional approaches often decompose decision-making into hierarchical modules for prediction and planning. Although these handcrafted systems can perform well in specific settings, they might fail if environmental conditions change and often require careful parameter tuning. Additionally, their solutions could be suboptimal due to the use of inaccurate mathematical model assumptions and simplifications aimed at achieving computational efficiency. To overcome these limitations, this paper introduces the NavRL framework, a deep reinforcement learning-based navigation method built on the Proximal Policy Optimization (PPO) algorithm. NavRL utilizes our carefully designed state and action representations, allowing the learned policy to make safe decisions in the presence of both static and dynamic obstacles, with zero-shot transfer from simulation to real-world flight. Furthermore, the proposed method adopts a simple but effective safety shield for the trained policy, inspired by the concept of velocity obstacles, to mitigate potential failures associated with the black-box nature of neural networks. To accelerate the convergence, we implement the training pipeline using NVIDIA Isaac Sim, enabling parallel training with thousands of quadcopters. Simulation and physical experiments show that our method ensures safe navigation in dynamic environments and results in the fewest collisions compared to benchmarks in scenarios with dynamic obstacles.
comment: 8 pages, 9 figures, 3 tables. Experiment video: https://youtu.be/fhRxS--Rhkc
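The safety shield above is described as inspired by velocity obstacles. As background, a basic 2D velocity-obstacle membership test can be sketched as follows. This is the classical infinite-horizon collision-cone check for one disc-shaped obstacle, not NavRL's actual shield; the function name and the combined-radius parameter are assumptions.

```python
import math


def in_velocity_obstacle(p_robot, v_robot, p_obs, v_obs, radius):
    """Return True if keeping v_robot eventually leads to a collision.

    radius is the sum of robot and obstacle radii. The obstacle is
    assumed to keep a constant velocity v_obs.
    """
    px, py = p_obs[0] - p_robot[0], p_obs[1] - p_robot[1]
    vx, vy = v_robot[0] - v_obs[0], v_robot[1] - v_obs[1]
    dist = math.hypot(px, py)
    if dist <= radius:
        return True  # already overlapping
    speed = math.hypot(vx, vy)
    if speed == 0.0:
        return False  # no relative motion, distance never shrinks
    # Angle between the relative velocity and the line of sight.
    cos_angle = (px * vx + py * vy) / (dist * speed)
    cos_angle = max(-1.0, min(1.0, cos_angle))
    # Half-angle of the collision cone seen from the robot.
    return math.acos(cos_angle) < math.asin(radius / dist)


head_on = in_velocity_obstacle((0, 0), (1, 0), (5, 0), (0, 0), 1.0)
sideways = in_velocity_obstacle((0, 0), (0, 1), (5, 0), (0, 0), 1.0)
```

A shield built on such a test would typically replace an unsafe policy output with a nearby velocity outside every cone; how that projection is done is left out here.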
Intent Prediction-Driven Model Predictive Control for UAV Planning and Navigation in Dynamic Environments
The emergence of indoor aerial robots holds significant potential for enhancing construction site workers' productivity by autonomously performing inspection and mapping tasks. The key challenge to this application is ensuring navigation safety with human workers. While navigation in static environments has been extensively studied, navigating dynamic environments remains open due to challenges in perception and planning. Payload limitations of unmanned aerial vehicles limit them to using cameras with limited fields of view, resulting in unreliable perception and tracking during collision avoidance. Moreover, the unpredictable nature of the dynamic environments can quickly make the generated optimal trajectory outdated. To address these challenges, this paper presents a comprehensive navigation framework that incorporates both perception and planning, introducing the concept of dynamic obstacle intent prediction. Our perception module detects and tracks dynamic obstacles efficiently and handles tracking loss and occlusion during collision avoidance. The proposed intent prediction module employs a Markov Decision Process (MDP) to forecast potential actions of dynamic obstacles with the possible future trajectories. Finally, a novel intent-based planning algorithm, leveraging model predictive control (MPC), is applied to generate safe navigation trajectories. Simulation and physical experiments demonstrate that our method enables safe navigation in dynamic environments and achieves the fewest collisions compared to benchmarks.
comment: 8 pages, 8 figures, 3 tables, experiment video: https://youtu.be/UeBShELDzyM
Dynamic Game-Theoretical Decision-Making Framework for Vehicle-Pedestrian Interaction with Human Bounded Rationality
Human-involved interactive environments pose significant challenges for autonomous vehicle decision-making processes due to the complexity and uncertainty of human behavior. It is crucial to develop an explainable and trustworthy decision-making system for autonomous vehicles interacting with pedestrians. Previous studies often used traditional game theory to describe interactions for its interpretability. However, it assumes complete human rationality and unlimited reasoning abilities, which is unrealistic. To address this limitation and improve model accuracy, this paper proposes a novel framework that integrates the partially observable Markov decision process with behavioral game theory to dynamically model AV-pedestrian interactions at the unsignalized intersection. Both the AV and the pedestrian are modeled as dynamic-belief-induced quantal cognitive hierarchy (DB-QCH) models, considering human reasoning limitations and bounded rationality in the decision-making process. In addition, a dynamic belief updating mechanism allows the AV to update its understanding of the opponent's rationality degree in real-time based on observed behaviors and adapt its strategies accordingly. The analysis results indicate that our models effectively simulate vehicle-pedestrian interactions and our proposed AV decision-making approach performs well in safety, efficiency, and smoothness. It closely resembles real-world driving behavior and even achieves more comfortable driving navigation compared to our previous virtual reality experimental data.
ModCube: Modular, Self-Assembling Cubic Underwater Robot
This paper presents a low-cost, centralized modular underwater robot platform, ModCube, which can be used to study swarm coordination for a wide range of tasks in underwater environments. A ModCube structure consists of multiple ModCube robots. Each robot can move in six DoF with eight thrusters and can be rigidly connected to other ModCube robots with an electromagnet controlled by an onboard computer. In this paper, we present a novel method for characterizing and visualizing dynamic behavior, along with four benchmarks to evaluate the morphological performance of the robot. Analysis shows that our ModCube design is desirable for omnidirectional tasks, compared with the configurations widely used by commercial underwater robots. We run real robot experiments in two water tanks to demonstrate the robust control and self-assembly of the proposed system. We also open-source the design and code to facilitate future research.
comment: 8 pages, 8 figures, letter
GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization
Although various visual localization approaches exist, such as scene coordinate and pose regression, these methods often struggle with high memory consumption or extensive optimization requirements. To address these challenges, we utilize recent advancements in novel view synthesis, particularly 3D Gaussian Splatting (3DGS), to enhance localization. 3DGS allows for the compact encoding of both 3D geometry and scene appearance with its spatial features. Our method leverages the dense description maps produced by XFeat's lightweight keypoint detection and description model. We propose distilling these dense keypoint descriptors into 3DGS to improve the model's spatial understanding, leading to more accurate camera pose predictions through 2D-3D correspondences. After estimating an initial pose, we refine it using a photometric warping loss. Benchmarking on popular indoor and outdoor datasets shows that our approach surpasses state-of-the-art Neural Render Pose (NRP) methods, including NeRFMatch and PNeRFLoc.
comment: Project website at https://gsplatloc.github.io/
Clarke Transform -- A Fundamental Tool for Continuum Robotics
This article introduces the Clarke transform and Clarke coordinates, which present a solution to the disengagement of an arbitrary number of coupled displacement actuations of continuum and soft robots. The Clarke transform utilizes the generalized Clarke transformation and its inverse to reduce any number of joint values to a two-dimensional space without sacrificing any significant information. This space is the manifold of the joint space and is described by two orthogonal Clarke coordinates. Applications to kinematics, sampling, and control are presented. By deriving the solution to the previously unknown forward robot-dependent mapping for an arbitrary number of joints, the forward and inverse kinematics formulations are branchless, closed-form, and singularity-free. Sampling is used as a proxy for gauging the performance implications for various methods and frameworks, leading to a branchless, closed-form, and vectorizable sampling method with a 100 percent success rate and the possibility to shape desired distributions. Due to the utilization of the manifold, the fairly simple constraint-informed, two-dimensional, and linear controller always provides feasible control outputs. On top of that, the relations to improved representations in continuum and soft robotics are established, where the Clarke coordinates are their generalizations. The Clarke transform offers valuable geometric insights and paves the way for developing approaches directly on the two-dimensional manifold within the high-dimensional joint space, ensuring compliance with the constraint. While being an easy-to-construct linear map, the proposed Clarke transform is mathematically consistent, physically meaningful, as well as interpretable and contributes to the unification of frameworks across continuum and soft robots.
comment: 27 pages, 11 figures, 5 tables
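The reduction of n joint values to two coordinates can be illustrated with a minimal sketch. It assumes n >= 3 displacement-actuated joints spaced evenly around the backbone at angles psi_i = 2*pi*i/n, and it assumes the (2/n)-scaled cosine/sine matrix as the generalized Clarke transformation. The function names are hypothetical and the article's exact conventions may differ.

```python
import math

def clarke_forward(displacements):
    """Map n joint displacements to two Clarke coordinates (assumed convention)."""
    n = len(displacements)
    re = (2.0 / n) * sum(d * math.cos(2 * math.pi * i / n)
                         for i, d in enumerate(displacements))
    im = (2.0 / n) * sum(d * math.sin(2 * math.pi * i / n)
                         for i, d in enumerate(displacements))
    return re, im

def clarke_inverse(re, im, n):
    """Recover n joint displacements from the two Clarke coordinates."""
    return [re * math.cos(2 * math.pi * i / n) + im * math.sin(2 * math.pi * i / n)
            for i in range(n)]

# Usage: five joints generated from two coordinates sum to zero (the
# displacement constraint) and map back to the same two coordinates.
joints = clarke_inverse(0.3, -0.2, 5)
coords = clarke_forward(joints)
```

Under these assumptions the inverse map always yields displacements that sum to zero, which is one way to read the abstract's claim that working on the two-dimensional manifold keeps outputs feasible.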
BehAV: Behavioral Rule Guided Autonomy Using VLMs for Robot Navigation in Outdoor Scenes
We present BehAV, a novel approach for autonomous robot navigation in outdoor scenes guided by human instructions and leveraging Vision Language Models (VLMs). Our method interprets human commands using a Large Language Model (LLM) and categorizes the instructions into navigation and behavioral guidelines. Navigation guidelines consist of directional commands (e.g., "move forward until") and associated landmarks (e.g., "the building with blue windows"), while behavioral guidelines encompass regulatory actions (e.g., "stay on") and their corresponding objects (e.g., "pavements"). We use VLMs for their zero-shot scene understanding capabilities to estimate landmark locations from RGB images for robot navigation. Further, we introduce a novel scene representation that utilizes VLMs to ground behavioral rules into a behavioral cost map. This cost map encodes the presence of behavioral objects within the scene and assigns costs based on their regulatory actions. The behavioral cost map is integrated with a LiDAR-based occupancy map for navigation. To navigate outdoor scenes while adhering to the instructed behaviors, we present an unconstrained Model Predictive Control (MPC)-based planner that prioritizes both reaching landmarks and following behavioral guidelines. We evaluate the performance of BehAV on a quadruped robot across diverse real-world scenarios, demonstrating a 22.49% improvement in alignment with human-teleoperated actions, as measured by Frechet distance, and achieving a 40% higher navigation success rate compared to state-of-the-art methods.
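The abstract reports trajectory alignment measured by Frechet distance but does not say which variant is used. As an illustration only, the sketch below computes the discrete Frechet distance between two polylines with the standard dynamic program. The function name and the sample paths are hypothetical.

```python
import math

def discrete_frechet(p, q):
    """Discrete Frechet distance between two point sequences p and q."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Best of the three ways to reach (i, j), limited by the current gap.
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[-1][-1]

# Usage: a robot path compared against a parallel teleoperated path one unit away.
robot_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
human_path = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
gap = discrete_frechet(robot_path, human_path)
```

A smaller value means the two paths can be traversed together while staying closer to each other, which is why it is a natural alignment measure for comparing a planner with teleoperation.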
KinScene: Model-Based Mobile Manipulation of Articulated Scenes
Sequentially interacting with articulated objects is crucial for a mobile manipulator to operate effectively in everyday environments. To enable long-horizon tasks involving articulated objects, this study explores building scene-level articulation models for indoor scenes through autonomous exploration. While previous research has studied mobile manipulation with articulated objects by considering object kinematic constraints, it primarily focuses on individual-object scenarios and lacks extension to a scene-level context for task-level planning. To manipulate multiple object parts sequentially, the robot needs to reason about the resultant motion of each part and anticipate its impact on future actions. We introduce KinScene, a full-stack approach for long-horizon manipulation tasks with articulated objects. The robot maps the scene, detects and physically interacts with articulated objects, collects observations, and infers the articulation properties. For sequential tasks, the robot plans a feasible series of object interactions based on the inferred articulation model. We demonstrate that our approach repeatably constructs accurate scene-level kinematic and geometric models, enabling long-horizon mobile manipulation in a real-world scene. Code and additional results are available at https://chengchunhsu.github.io/KinScene/
Frequency-based View Selection in Gaussian Splatting Reconstruction
Three-dimensional reconstruction is a fundamental problem in robotics perception. We examine the problem of active view selection to perform 3D Gaussian Splatting reconstructions with as few input images as possible. Although 3D Gaussian Splatting has made significant progress in image rendering and 3D reconstruction, the quality of the reconstruction is strongly impacted by the selection of 2D images and the estimation of camera poses through Structure-from-Motion (SfM) algorithms. Current methods to select views that rely on uncertainties from occlusions, depth ambiguities, or neural network predictions directly are insufficient to handle the issue and struggle to generalize to new scenes. By ranking the potential views in the frequency domain, we are able to effectively estimate the potential information gain of new viewpoints without ground truth data. By overcoming current constraints on model architecture and efficacy, our method achieves state-of-the-art results in view selection, demonstrating its potential for efficient image-based 3D reconstruction.
comment: 8 pages, 4 figures
Learning Dynamics of a Ball with Differentiable Factor Graph and Roto-Translational Invariant Representations ICRA 2025
Robots in dynamic environments need fast, accurate models of how objects move in their environments to support agile planning. In sports such as ping pong, analytical models often struggle to accurately predict ball trajectories with spins due to complex aerodynamics, elastic behaviors, and the challenges of modeling sliding and rolling friction. On the other hand, despite the promise of data-driven methods, machine learning struggles to make accurate, consistent predictions without precise input. In this paper, we propose an end-to-end learning framework that can jointly train a dynamics model and a factor graph estimator. Our approach leverages a Gram-Schmidt (GS) process to extract roto-translational invariant representations to improve the model performance, which can further reduce the validation error compared to a data augmentation method. Additionally, we propose a network architecture that enhances nonlinearity by using self-multiplicative bypasses in the layer connections. By leveraging these novel methods, our proposed approach predicts the ball's position with an RMSE of 37.2 mm of the paddle radius at the apex after the first bounce, and 71.5 mm after the second bounce.
comment: ICRA 2025
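The idea of using a Gram-Schmidt process to obtain rotation-invariant inputs can be sketched as follows. This is an illustration under assumptions, not the paper's construction: it assumes two non-parallel 3D vectors (for example a velocity and a second reference direction) define a local orthonormal frame, and that other quantities are then expressed in that frame. All function names are hypothetical. Translation invariance would additionally require feeding only relative quantities such as velocities or position differences.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gram_schmidt_frame(v1, v2):
    """Build an orthonormal frame from two non-parallel vectors."""
    e1 = normalize(v1)
    proj = dot(v2, e1)
    # Remove the component of v2 along e1, then normalize the remainder.
    e2 = normalize(tuple(x - proj * y for x, y in zip(v2, e1)))
    e3 = cross(e1, e2)
    return e1, e2, e3

def to_frame(frame, v):
    """Coordinates of v in the given orthonormal frame."""
    return tuple(dot(e, v) for e in frame)

def rotate_z(v, angle):
    """Rotate v about the z axis (used only to demonstrate invariance)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

# Usage: the frame coordinates of w do not change when the whole scene is rotated.
v1, v2, w = (1.0, 2.0, 0.5), (0.0, 1.0, -1.0), (0.3, -0.7, 2.0)
original = to_frame(gram_schmidt_frame(v1, v2), w)
rotated = to_frame(gram_schmidt_frame(rotate_z(v1, 0.9), rotate_z(v2, 0.9)),
                   rotate_z(w, 0.9))
```

Because the frame rotates with the data, a model trained on frame coordinates sees the same input for every rotated copy of a trajectory, which is the property that data augmentation only approximates.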
Initialization of Monocular Visual Navigation for Autonomous Agents Using Modified Structure from Small Motion
We propose a standalone monocular visual Simultaneous Localization and Mapping (vSLAM) initialization pipeline for autonomous robots in space. Our method, a state-of-the-art factor graph optimization pipeline, enhances classical Structure from Small Motion (SfSM) to robustly initialize a monocular agent in weak-perspective projection scenes. Furthermore, it overcomes visual estimation challenges introduced by spacecraft inspection trajectories, such as: center-pointing motion, which exacerbates the bas-relief ambiguity, and the presence of a dominant plane in the scene, which causes motion estimation degeneracies in classical Structure from Motion (SfM). We validate our method on realistic, simulated satellite inspection images exhibiting weak-perspective projection, and we demonstrate its effectiveness and improved performance compared to other monocular initialization procedures.
comment: 6 pages, 1 page for references, 6 figures, 1 table, IEEEtran format. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
MBC: Multi-Brain Collaborative Control for Quadruped Robots
In quadruped robot locomotion, the Blind Policy and the Perceptive Policy each have their own advantages and limitations. The Blind Policy relies on preset sensor information and algorithms, suitable for known and structured environments, but it lacks adaptability in complex or unknown environments. The Perceptive Policy uses visual sensors to obtain detailed environmental information, allowing it to adapt to complex terrains, but its effectiveness is limited under occluded conditions, especially when perception fails. Unlike the Blind Policy, the Perceptive Policy is not as robust under these conditions. To address these challenges, we propose MBC, a Multi-Brain Collaborative system that incorporates the concepts of Multi-Agent Reinforcement Learning and introduces collaboration between the Blind Policy and the Perceptive Policy. By applying this multi-policy collaborative model to a quadruped robot, the robot can maintain stable locomotion even when the perceptual system is impaired or observational data is incomplete. Our simulations and real-world experiments demonstrate that this system significantly improves the robot's passability and robustness against perception failures in complex environments, validating the effectiveness of multi-policy collaboration in enhancing robotic motion performance.
comment: 18 pages, 9 figures, Website and Videos: https://quad-mbc.github.io/
MultiTalk: Introspective and Extrospective Dialogue for Human-Environment-LLM Alignment
LLMs have shown promising results in task planning due to their strong natural language understanding and reasoning capabilities. However, issues such as hallucinations, ambiguities in human instructions, environmental constraints, and limitations in the executing agent's capabilities often lead to flawed or incomplete plans. This paper proposes MultiTalk, an LLM-based task planning methodology that addresses these issues through a framework of introspective and extrospective dialogue loops. This approach helps ground generated plans in the context of the environment and the agent's capabilities, while also resolving uncertainties and ambiguities in the given task. These loops are enabled by specialized systems designed to extract and predict task-specific states, and flag mismatches or misalignments among the human user, the LLM agent, and the environment. Effective feedback pathways between these systems and the LLM planner foster meaningful dialogue. The efficacy of this methodology is demonstrated through its application to robotic manipulation tasks. Experiments and ablations highlight the robustness and reliability of our method, and comparisons with baselines further illustrate the superiority of MultiTalk in task planning for embodied agents.
comment: 7 pages, 3 figures
Hierarchical Hybrid Learning for Long-Horizon Contact-Rich Robotic Assembly
Generalizable long-horizon robotic assembly requires reasoning at multiple levels of abstraction. End-to-end imitation learning (IL) has been proven a promising approach, but it requires a large amount of demonstration data for training and often fails to meet the high-precision requirement of assembly tasks. Reinforcement Learning (RL) approaches have succeeded in high-precision assembly tasks, but suffer from sample inefficiency and hence, are less competent at long-horizon tasks. To address these challenges, we propose a hierarchical modular approach, named ARCH (Adaptive Robotic Composition Hierarchy), which enables long-horizon high-precision assembly in contact-rich settings. ARCH employs a hierarchical planning framework, including a low-level primitive library of continuously parameterized skills and a high-level policy. The low-level primitive library includes essential skills for assembly tasks, such as grasping and inserting. These primitives consist of both RL and model-based controllers. The high-level policy, learned via imitation learning from a handful of demonstrations, selects the appropriate primitive skills and instantiates them with continuous input parameters. We extensively evaluate our approach on a real robot manipulation platform. We show that while trained on a single task, ARCH generalizes well to unseen tasks and outperforms baseline methods in terms of success rate and data efficiency. Videos can be found at https://long-horizon-assembly.github.io.
Hand Gesture Classification Based on Forearm Ultrasound Video Snippets Using 3D Convolutional Neural Networks
Ultrasound based hand movement estimation is a crucial area of research with applications in human-machine interaction. Forearm ultrasound offers detailed information about muscle morphology changes during hand movement which can be used to estimate hand gestures. Previous work has focused on analyzing 2-Dimensional (2D) ultrasound image frames using techniques such as convolutional neural networks (CNNs). However, such 2D techniques do not capture temporal features from segments of ultrasound data corresponding to continuous hand movements. This study uses 3D CNN based techniques to capture spatio-temporal patterns within ultrasound video segments for gesture recognition. We compared the performance of a 2D convolution-based network with (2+1)D convolution-based, 3D convolution-based, and our proposed network. Our methodology enhanced the gesture classification accuracy to 98.8 +/- 0.9%, from 96.5 +/- 2.3% compared to a network trained with 2D convolution layers. These results demonstrate the advantages of using ultrasound video snippets for improving hand gesture classification performance.
comment: Accepted to IUS 2024
Improving Intersession Reproducibility for Forearm Ultrasound based Hand Gesture Classification through an Incremental Learning Approach
Ultrasound images of the forearm can be used to classify hand gestures towards developing human machine interfaces. In our previous work, we have demonstrated gesture classification using ultrasound on a single subject without removing the probe before evaluation. This has limitations in usage as once the probe is removed and replaced, the accuracy declines since the classifier performance is sensitive to the probe location on the arm. In this paper, we propose training a model on multiple data collection sessions to create a generalized model, utilizing incremental learning through fine tuning. Ultrasound data was acquired for 5 hand gestures within a session (without removing and putting the probe back on) and across sessions. A convolutional neural network (CNN) with 5 cascaded convolution layers was used for this study. A pre-trained CNN was fine tuned with the convolution blocks acting as a feature extractor, and the parameters of the remaining layers updated in an incremental fashion. Fine tuning was done using different session splits within a session and between multiple sessions. We found that incremental fine tuning can help enhance classification accuracy with more fine tuning sessions. After 2 fine tuning sessions for each experiment, we found an approximate 10% increase in classification accuracy. This work demonstrates that incremental learning through fine tuning on ultrasound based hand gesture classification can improve accuracy while saving storage, processing power, and time. It can be expanded to generalize between multiple subjects and towards developing personalized wearable devices.
comment: Accepted to IUS 2024
Vision-based Xylem Wetness Classification in Stem Water Potential Determination
Water is often overused in irrigation, making efficient management of it crucial. Precision Agriculture emphasizes tools like stem water potential (SWP) analysis for better plant status determination. However, such tools often require labor-intensive in-situ sampling. Automation and machine learning can streamline this process and enhance outcomes. This work focused on automating stem detection and xylem wetness classification using the Scholander Pressure Chamber, a widely used but demanding method for SWP measurement. The aim was to refine stem detection and develop computer-vision-based methods to better classify water emergence at the xylem. To this end, we collected and manually annotated video data, applying vision- and learning-based methods for detection and classification. Additionally, we explored data augmentation and fine-tuned parameters to identify the most effective models. The identified best-performing models for stem detection and xylem wetness classification were evaluated end-to-end over 20 SWP measurements. Learning-based stem detection via YOLOv8n combined with ResNet50-based classification achieved a Top-1 accuracy of 80.98%, making it the best-performing approach for xylem wetness classification.
Rao-Blackwellized POMDP Planning
Partially Observable Markov Decision Processes (POMDPs) provide a structured framework for decision-making under uncertainty, but their application requires efficient belief updates. Sequential Importance Resampling Particle Filters (SIRPF), also known as Bootstrap Particle Filters, are commonly used as belief updaters in large approximate POMDP solvers, but they face challenges such as particle deprivation and high computational costs as the system's state dimension grows. To address these issues, this study introduces Rao-Blackwellized POMDP (RB-POMDP) approximate solvers and outlines generic methods to apply Rao-Blackwellization in both belief updates and online planning. We compare the performance of SIRPF and Rao-Blackwellized Particle Filters (RBPF) in a simulated localization problem where an agent navigates toward a target in a GPS-denied environment using POMCPOW and RB-POMCPOW planners. Our results not only confirm that RBPFs maintain accurate belief approximations over time with fewer particles, but, more surprisingly, RBPFs combined with quadrature-based integration improve planning quality significantly compared to SIRPF-based planning under the same computational limits.
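For readers unfamiliar with the baseline belief updater named above, the following is a minimal sketch of one Sequential Importance Resampling (bootstrap) particle filter update. It assumes a hypothetical 1D random-walk state with Gaussian observation noise. The function name, the models, and the noise levels are illustrative and are not taken from the paper, and the Rao-Blackwellized variant is not reproduced here.

```python
import math
import random

def sir_update(particles, observation, rng, process_std=0.5, obs_std=1.0):
    """One bootstrap particle filter step: propagate, weight, resample."""
    # Propagate each particle through the (assumed) random-walk dynamics.
    predicted = [x + rng.gauss(0.0, process_std) for x in particles]
    # Weight by the (assumed) Gaussian observation likelihood.
    weights = [math.exp(-0.5 * ((observation - x) / obs_std) ** 2) for x in predicted]
    if sum(weights) == 0.0:
        # Particle deprivation: no particle explains the observation.
        return predicted
    # Resample with replacement in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))

# Usage: a broad prior collapses toward a repeated observation of 3.0.
rng = random.Random(0)
belief = [rng.uniform(-10.0, 10.0) for _ in range(2000)]
for _ in range(5):
    belief = sir_update(belief, 3.0, rng)
belief_mean = sum(belief) / len(belief)
```

The early-return branch marks the particle deprivation failure mode mentioned in the abstract. Rao-Blackwellization addresses it by handling part of the state analytically so that fewer particles are needed.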
Embedded IPC: Fast and Intersection-free Simulation in Reduced Subspace for Robot Manipulation
Physics-based simulation is essential for developing and evaluating robot manipulation policies, particularly in scenarios involving deformable objects and complex contact interactions. However, existing simulators often struggle to balance computational efficiency with numerical accuracy, especially when modeling deformable materials with frictional contact constraints. We introduce an efficient subspace representation for the Incremental Potential Contact (IPC) method, leveraging model reduction to decrease the number of degrees of freedom. Our approach decouples simulation complexity from the resolution of the input model by representing elasticity in a low-resolution subspace while maintaining collision constraints on an embedded high-resolution surface. Our barrier formulation ensures intersection-free trajectories and configurations regardless of material stiffness, time step size, or contact severity. We validate our simulator through quantitative experiments with a soft bubble gripper grasping and qualitative demonstrations of placing a plate on a dish rack. The results demonstrate our simulator's efficiency, physical accuracy, computational stability, and robust handling of frictional contact, making it well-suited for generating demonstration data and evaluating downstream robot training applications.
Hierarchical Large Scale Multirobot Path (Re)Planning IROS2024
We consider a large-scale multi-robot path planning problem in a cluttered environment. Our approach achieves real-time replanning by dividing the workspace into cells and utilizing a hierarchical planner. Specifically, we propose novel multi-commodity flow-based high-level planners that route robots through cells with reduced congestion, along with an anytime low-level planner that computes collision-free paths for robots within each cell in parallel. A highlight of our method is a significant improvement in computation time. Specifically, we show empirical results of a 500-times speedup in computation time compared to the baseline multi-agent pathfinding approach on the environments we study. We account for the robot's embodiment and support non-stop execution with continuous replanning. We demonstrate the real-time performance of our algorithm with up to 142 robots in simulation, and a representative 32 physical Crazyflie nano-quadrotor experiment.
comment: 8 pages, 7 figures, 1 table. Camera Ready for IROS2024
Data-Driven System Identification of Quadrotors Subject to Motor Delays IROS 2024
Recently non-linear control methods like Model Predictive Control (MPC) and Reinforcement Learning (RL) have attracted increased interest in the quadrotor control community. In contrast to classic control methods like cascaded PID controllers, MPC and RL heavily rely on an accurate model of the system dynamics. The process of quadrotor system identification is notoriously tedious and is often pursued with additional equipment like a thrust stand. Furthermore, low-level details like motor delays which are crucial for accurate end-to-end control are often neglected. In this work, we introduce a data-driven method to identify a quadrotor's inertia parameters, thrust curves, torque coefficients, and first-order motor delay purely based on proprioceptive data. The estimation of the motor delay is particularly challenging as usually, the RPMs can not be measured. We derive a Maximum A Posteriori (MAP)-based method to estimate the latent time constant. Our approach only requires about a minute of flying data that can be collected without any additional equipment and usually consists of three simple maneuvers. Experimental results demonstrate the ability of our method to accurately recover the parameters of multiple quadrotors. It also facilitates the deployment of RL-based, end-to-end quadrotor control of a large quadrotor under harsh, outdoor conditions.
comment: Accepted at IROS 2024
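The first-order motor delay mentioned in the abstract can be illustrated with a small simulation. The sketch assumes the common first-order lag model, in which the rotor speed approaches the commanded setpoint with a time constant T, discretized exactly over a fixed step dt. The function name and numbers are hypothetical, and the paper's MAP-based estimator of T is not reproduced.

```python
import math

def simulate_motor(setpoints, dt, time_constant, rpm0=0.0):
    """Simulate rotor speed under a first-order lag toward each setpoint."""
    # Exact discretization of d(rpm)/dt = (setpoint - rpm) / time_constant.
    alpha = 1.0 - math.exp(-dt / time_constant)
    rpm = rpm0
    history = []
    for setpoint in setpoints:
        rpm += alpha * (setpoint - rpm)
        history.append(rpm)
    return history

# Usage: unit step command, 1 ms steps, assumed 50 ms time constant.
response = simulate_motor([1.0] * 200, dt=0.001, time_constant=0.05)
```

After one time constant (50 steps here) the response has covered a fraction 1 - exp(-1) of the step, which is what makes T identifiable from how the vehicle reacts even when the RPM itself is unobserved.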
Towards Robust Automation of Surgical Systems via Digital Twin-based Scene Representations from Foundation Models
Large language model (LLM)-based agents are emerging as a powerful enabler of robust embodied intelligence due to their capability of planning complex action sequences. Sound planning ability is necessary for robust automation in many task domains, but especially in surgical automation. These agents rely on a highly detailed natural language representation of the scene. Thus, to leverage the emergent capabilities of LLM agents for surgical task planning, developing similarly powerful and robust perception algorithms is necessary to derive a detailed scene representation of the environment from visual input. Previous research has focused primarily on enabling LLM-based task planning while adopting simple yet severely limited perception solutions that meet the needs of bench-top experiments but lack the critical flexibility to scale to less constrained settings. In this work, we propose an alternate perception approach -- a digital twin-based machine perception approach that capitalizes on the convincing performance and out-of-the-box generalization of recent vision foundation models. Integrating our digital twin-based scene representation and LLM agent for planning with the dVRK platform, we develop an embodied intelligence system and evaluate its robustness in performing peg transfer and gauze retrieval tasks. Our approach shows strong task performance and generalizability to varied environment settings. Despite convincing performance, this work is merely a first step towards the integration of digital twin-based scene representations. Future studies are necessary for the realization of a comprehensive digital twin framework to improve the interpretability and generalizability of embodied intelligence in surgery.
GaRField++: Reinforced Gaussian Radiance Fields for Large-Scale 3D Scene Reconstruction
This paper proposes a novel framework for large-scale scene reconstruction based on 3D Gaussian splatting (3DGS) and aims to address the scalability and accuracy challenges faced by existing methods. To tackle the scalability issue, we split the large scene into multiple cells, and the candidate point-cloud and camera views of each cell are correlated through a visibility-based camera selection and a progressive point-cloud extension. To reinforce the rendering quality, three improvements are made over vanilla 3DGS: a ray-Gaussian intersection strategy and a novel Gaussian density control for learning efficiency, an appearance decoupling module based on a ConvKAN network to handle uneven lighting conditions in large-scale scenes, and a refined final loss combining the color loss, the depth distortion loss, and the normal consistency loss. Finally, a seamless stitching procedure is executed to merge the individual Gaussian radiance fields for novel view synthesis across different cells. Evaluation on the Mill19, Urban3D, and MatrixCity datasets shows that our method consistently generates higher-fidelity rendering results than state-of-the-art methods of large-scale scene reconstruction. We further validate the generalizability of the proposed approach by rendering on self-collected video clips recorded by a commercial drone.
A global approach for the redefinition of higher-order flexibility and rigidity
The famous example of the double-Watt mechanism given by Connelly and Servatius raises some problems concerning the classical definitions of higher-order flexibility and rigidity, respectively, as they attest the cusp configuration of the mechanism a third-order rigidity, which conflicts with its continuous flexion. Some attempts were made to resolve the dilemma but they could not settle the problem. As cusp mechanisms demonstrate the basic shortcoming of any local mobility analysis using higher-order constraints, we present a global approach inspired by Sabitov's finite algorithm for testing the bendability of a polyhedron, which allows us (a) to compute iteratively configurations with a higher-order flexion and (b) to come up with a proper redefinition of higher-order flexibility and rigidity. The presented approach is demonstrated on several examples (double-Watt mechanisms and Tarnai's Leonardo structure). Moreover, we determine all configurations of a given 3-RPR manipulator with a third-order flexion and present a corresponding joint-bar framework of flexion order 23.
comment: 26 pages, 12 figures, 9 examples
A Distributed Approach to Autonomous Intersection Management via Multi-Agent Reinforcement Learning
Autonomous intersection management (AIM) poses significant challenges due to the intricate nature of real-world traffic scenarios and the need for a highly expensive centralised server in charge of simultaneously controlling all the vehicles. This study addresses such issues by proposing a novel distributed approach to AIM utilizing multi-agent reinforcement learning (MARL). We show that by leveraging the 3D surround view technology for advanced assistance systems, autonomous vehicles can accurately navigate intersection scenarios without needing any centralised controller. The contributions of this paper thus include a MARL-based algorithm for the autonomous management of a 4-way intersection and also the introduction of a new strategy called prioritised scenario replay for improved training efficacy. We validate our approach as an innovative alternative to conventional centralised AIM techniques, ensuring the full reproducibility of our results. Specifically, experiments conducted in virtual environments using the SMARTS platform highlight its superiority over benchmarks across various metrics.
comment: 15 pages, 2 figures, submitted to Agents in Traffic and Transportation (ATT2024). Status: accepted
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Vision-Language-Action (VLA) models have shown remarkable potential in visuomotor control and instruction comprehension through end-to-end learning processes. However, current VLA models face significant challenges: they are slow during inference and require extensive pre-training on large amounts of robotic data, making real-world deployment difficult. In this paper, we introduce a new family of compact vision-language-action models, called TinyVLA, which offers two key advantages over existing VLA models: (1) faster inference speeds, and (2) improved data efficiency, eliminating the need for a pre-training stage. Our framework incorporates two essential components to build TinyVLA: (1) initializing the policy backbone with robust, high-speed multimodal models, and (2) integrating a diffusion policy decoder during fine-tuning to enable precise robot actions. We conducted extensive evaluations of TinyVLA both in simulation and on real robots, demonstrating that our approach significantly outperforms the state-of-the-art VLA model, OpenVLA, in terms of speed and data efficiency, while delivering comparable or superior performance. Additionally, TinyVLA exhibits strong generalization capabilities across various dimensions, including language instructions, novel objects, unseen positions, changes in object appearance, background variations, and environmental shifts, often matching or exceeding the performance of OpenVLA. We believe that TinyVLA offers an interesting perspective on utilizing pre-trained multimodal models for policy learning. Our project is at https://tiny-vla.github.io.
Toward Unified Practices in Trajectory Prediction Research on Drone Datasets
The availability of high-quality datasets is crucial for the development of behavior prediction algorithms in autonomous vehicles. This paper highlights the need to standardize the use of certain datasets for motion forecasting research to simplify comparative analysis and proposes a set of tools and practices to achieve this. Drawing on extensive experience and a comprehensive review of current literature, we summarize our proposals for preprocessing, visualization, and evaluation in the form of an open-sourced toolbox designed for researchers working on trajectory prediction problems. The clear specification of necessary preprocessing steps and evaluation metrics is intended to alleviate development efforts and facilitate the comparison of results across different studies. The toolbox is available at: https://github.com/westny/dronalize.
comment: https://github.com/westny/dronalize
USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions
Autonomous underwater vehicles (AUVs) are valuable for ocean exploration due to their flexibility and ability to carry communication and detection units. Nevertheless, AUVs alone often face challenges in harsh and extreme sea conditions. This study introduces an unmanned surface vehicle (USV)-AUV collaboration framework, which includes high-precision multi-AUV positioning using USV path planning via Fisher information matrix optimization and reinforcement learning for multi-AUV cooperative tasks. Applied to a multi-AUV underwater data collection task scenario, extensive simulations validate the framework's feasibility and superior performance, highlighting exceptional coordination and robustness under extreme sea conditions. To accelerate relevant research in this field, we have made the simulation code available as open-source.
Flow to Rare Events: An Application of Normalizing Flow in Temporal Importance Sampling for Automated Vehicle Validation
Automated Vehicle (AV) validation based on simulated testing requires unbiased evaluation and high efficiency. One effective solution is to increase the exposure to risky rare events while reweighting the probability measure. However, characterizing the distribution of risky events is particularly challenging due to the paucity of samples and the temporality of continuous scenario variables. To address this, we devise a method to represent, generate, and reweight the distribution of risky rare events. We decompose the temporal evolution of continuous variables into distribution components based on conditional probability. By introducing the Risk Indicator Function, the distribution of risky rare events is theoretically precipitated out of the naturalistic driving distribution. This targeted distribution is practically generated via Normalizing Flow, which achieves exact and tractable probability evaluation of intricate distributions. The rare event distribution is then demonstrated to be an advantageous Importance Sampling distribution. We also promote the technique of temporal Importance Sampling. The combined method, named TrimFlow, is used to estimate the collision rate of car-following scenarios as a tentative practice. The results showed that sampling background vehicle maneuvers from the rare event distribution could evolve testing scenarios to hazardous states. TrimFlow reduced the number of tests by 86.1% compared to generating testing scenarios according to their exposure in the naturalistic driving environment. In addition, the TrimFlow method is not limited to one specific type of functional scenario.
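The reweighting idea in this abstract can be illustrated with a minimal importance-sampling sketch. This is not TrimFlow: the normalizing flow is replaced by a plain Gaussian proposal, the "rare event" is simply a standard normal variable exceeding a threshold, and the function names are hypothetical. It only shows how sampling from a distribution concentrated on the rare region, then weighting each sample by the likelihood ratio, keeps the estimate unbiased.

```python
import math
import random


def tail_probability_is(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Sample from the proposal N(threshold, 1), which concentrates on the rare region.
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Log likelihood ratio p(x) / q(x) of the two unit-variance Gaussians.
            log_w = -0.5 * x * x + 0.5 * (x - threshold) ** 2
            total += math.exp(log_w)
    return total / n_samples


def tail_probability_exact(threshold):
    """Closed-form Gaussian tail, used only as a reference value."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


estimate = tail_probability_is(4.0, 20_000)
reference = tail_probability_exact(4.0)
print(f"importance sampling: {estimate:.3e}, exact: {reference:.3e}")
```

Naive Monte Carlo would need on the order of tens of thousands of draws to see a single exceedance of this threshold, whereas roughly half of the proposal samples land in the rare region here.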
Adaptive Motion Planning for Multi-fingered Functional Grasp via Force Feedback
Enabling multi-fingered robots to grasp and manipulate objects with human-like dexterity is especially challenging during the dynamic, continuous hand-object interactions. Closed-loop feedback control is essential for dexterous hands to dynamically finetune hand poses when performing precise functional grasps. This work proposes an adaptive motion planning method based on deep reinforcement learning to adjust grasping poses according to real-time feedback from joint torques from pre-grasp to goal grasp. We find the multi-joint torques of the dexterous hand can sense object positions through contacts and collisions, enabling real-time adjustment of grasps to generate varying grasping trajectories for objects in different positions. In our experiments, the performance gap with and without force feedback reveals the important role of force feedback in adaptive manipulation. Our approach utilizing force feedback preliminarily exhibits human-like flexibility, adaptability, and precision.
comment: 8 pages, 7 figures
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding ECCV 2024
3D vision-language grounding, which focuses on aligning language with the 3D physical environment, stands as a cornerstone in the development of embodied agents. In comparison to recent advancements in the 2D domain, grounding language in 3D scenes faces several significant challenges: (i) the inherent complexity of 3D scenes due to the diverse object configurations, their rich attributes, and intricate relationships; (ii) the scarcity of paired 3D vision-language data to support grounded learning; and (iii) the absence of a unified learning framework to distill knowledge from grounded 3D data. In this work, we aim to address these three major challenges in 3D vision-language by examining the potential of systematically upscaling 3D vision-language learning in indoor environments. We introduce the first million-scale 3D vision-language dataset, SceneVerse, encompassing about 68K 3D indoor scenes and comprising 2.5M vision-language pairs derived from both human annotations and our scalable scene-graph-based generation approach. We demonstrate that this scaling allows for a unified pre-training framework, Grounded Pre-training for Scenes (GPS), for 3D vision-language learning. Through extensive experiments, we showcase the effectiveness of GPS by achieving state-of-the-art performance on all existing 3D visual grounding benchmarks. The vast potential of SceneVerse and GPS is unveiled through zero-shot transfer experiments in the challenging 3D vision-language tasks. Project website: https://scene-verse.github.io.
comment: ECCV 2024
Will Large Language Models be a Panacea to Autonomous Driving?
Artificial intelligence (AI) plays a crucial role in autonomous driving (AD) research, propelling its development towards intelligence and efficiency. Currently, the development of AD technology follows two main technical paths: modularization and end-to-end. Modularization decomposes the driving task into modules such as perception, prediction, planning, and control, and trains them separately. Due to the inconsistency of training objectives between modules, the integrated effect suffers from bias. End-to-end attempts to address this issue by utilizing a single model that directly maps from sensor data to control signals. This path has limited learning capabilities in a comprehensive set of features and struggles to handle unpredictable long-tail events and complex urban traffic scenarios. In the face of challenges encountered in both paths, many researchers believe that large language models (LLMs) with powerful reasoning capabilities and extensive knowledge understanding may be the solution, expecting LLMs to provide AD systems with deeper levels of understanding and decision-making capabilities. To understand if LLMs could enhance AD, this paper conducts a thorough analysis of the potential applications of LLMs in AD systems, including exploring their optimization strategies in both modular and end-to-end approaches, with a particular focus on how LLMs can tackle the problems and challenges present in current solutions. Furthermore, we discuss an important question: Can LLM-based artificial general intelligence (AGI) be a key to achieving high-level AD? We further analyze the potential limitations and challenges that LLMs may encounter in promoting the development of AD technology.
GUARD: A Safe Reinforcement Learning Benchmark
Due to their trial-and-error nature, it is typically challenging to apply RL algorithms to safety-critical real-world applications, such as autonomous driving, human-robot interaction, robot manipulation, etc., where such errors are not tolerable. Recently, safe RL (i.e., constrained RL) has emerged rapidly in the literature, in which the agents explore the environment while satisfying constraints. Due to the diversity of algorithms and tasks, it remains difficult to compare existing safe RL algorithms. To fill that gap, we introduce GUARD, a Generalized Unified SAfe Reinforcement Learning Development Benchmark. GUARD has several advantages compared to existing benchmarks. First, GUARD is a generalized benchmark with a wide variety of RL agents, tasks, and safety constraint specifications. Second, GUARD comprehensively covers state-of-the-art safe RL algorithms with self-contained implementations. Third, GUARD is highly customizable in tasks and algorithms. We present a comparison of state-of-the-art safe RL algorithms in various task settings using GUARD and establish baselines that future work can build on.
comment: Published in Transactions on Machine Learning Research
Emergent Strategies for Shepherding a Flock
We investigate how a shepherd should move to effectively herd a flock towards a target. Using an agent-based (ABM) and a coarse-grained (ODE) model for the flock, we pose and solve for the optimal strategy of a shepherd that must keep the flock cohesive and coerce it towards a target. Three distinct strategies emerge naturally as a function of the scaled herd size and the scaled shepherd speed: (i) mustering, where the shepherd circles the herd to ensure compactness, (ii) droving, where the shepherd chases the herd in a desired direction while sweeping back and forth, and (iii) driving, where the flock surrounds a shepherd that drives it from within. A minimal dynamical model for the size, shape, and position of the herd captures the effective behavior of the ABM and further allows us to characterize the different herding strategies in terms of the behavior of the shepherd that librates (mustering), oscillates (droving), or moves steadily (driving).
comment: Now includes extended work on the effect of inertia on the system along with path-following dynamics
CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration
We introduce Confidence-Controlled Exploration (CCE), a novel exploration scheme designed to enhance the training sample efficiency of reinforcement learning (RL) algorithms for sparse reward settings such as robot navigation. Sparse rewards are common in RL and convenient to design and implement, but typically hard to deal with due to the challenges of exploration. Existing methods deploy regularization-based methods to deal with the exploration challenges. However, it is hard to characterize the balance between exploration and exploitation because regularization modifies the reward function itself, hence changing the objective we are optimizing for. In contrast to regularization-based approaches in the existing literature, our approach, CCE, is based on a novel relationship we provide between gradient estimation and policy entropy. CCE dynamically adjusts the number of samples of the gradient update used during training to control exploration. Interestingly, CCE can be applied to both existing on-policy and off-policy RL methods, which we demonstrate by empirically validating its efficacy on three popular RL methods: REINFORCE, Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC) for goal-reaching robotic navigation tasks. We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization when constraining the sample budget. For a fixed sample budget, CCE achieves an 18% increase in navigation success rate, a 20-38% reduction in navigation path length, and a 9.32% decrease in elevation costs. Furthermore, we showcase the versatility of CCE by integrating it with the Clearpath Husky robot, illustrating its applicability in complex outdoor environments.
comment: 11 pages, 9 figures, 2 tables
Long-Tailed 3D Detection via Multi-Modal Fusion
Contemporary autonomous vehicle (AV) benchmarks have advanced techniques for training 3D detectors, particularly on large-scale multi-modal (LiDAR + RGB) data. Surprisingly, although semantic class labels naturally follow a long-tailed distribution, existing benchmarks only focus on a few common classes (e.g., pedestrian and car) and neglect many rare but crucial classes (e.g., emergency vehicle and stroller). However, AVs must reliably detect both common and rare classes for safe operation in the open world. We address this challenge by formally studying the problem of Long-Tailed 3D Detection (LT3D), which evaluates all annotated classes, including those in-the-tail. We address LT3D with hierarchical losses that promote feature sharing across classes, and introduce diagnostic metrics that award partial credit to "reasonable" mistakes with respect to the semantic hierarchy (e.g., mistaking a child for an adult). Further, we point out that rare-class accuracy is particularly improved via multi-modal late fusion (MMLF) of independently trained uni-modal LiDAR and RGB detectors. Importantly, such an MMLF framework allows us to leverage large-scale uni-modal datasets (with more examples for rare classes) to train better uni-modal detectors, unlike prevailing end-to-end trained multi-modal detectors that require paired multi-modal data. Finally, we examine three critical components of our simple MMLF approach from first principles and investigate whether to train 2D or 3D RGB detectors for fusion, whether to match RGB and LiDAR detections in 3D or the projected 2D image plane, and how to fuse matched detections. Our proposed MMLF approach significantly improves LT3D performance over prior work, particularly improving rare class performance from 12.8 to 20.0 mAP!
comment: The first two authors contributed equally. Project page: https://mayechi.github.io/lt3d-lf-io/
The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation
Accurate depth estimation under out-of-distribution (OoD) scenarios, such as adverse weather conditions, sensor failure, and noise contamination, is desirable for safety-critical applications. Existing depth estimation systems, however, suffer inevitably from real-world corruptions and perturbations and struggle to provide reliable depth predictions in such cases. In this paper, we summarize the winning solutions from the RoboDepth Challenge -- an academic competition designed to facilitate and advance robust OoD depth estimation. This challenge was developed based on the newly established KITTI-C and NYUDepth2-C benchmarks. We hosted two stand-alone tracks, with an emphasis on robust self-supervised and robust fully-supervised depth estimation, respectively. Out of more than two hundred participants, nine unique and top-performing solutions have emerged, with novel designs spanning the following aspects: spatial- and frequency-domain augmentations, masked image modeling, image restoration and super-resolution, adversarial training, diffusion-based noise suppression, vision-language pre-training, learned model ensembling, and hierarchical feature enhancement. Extensive experimental analyses along with insightful observations are drawn to better understand the rationale behind each design. We hope this challenge could lay a solid foundation for future research on robust and reliable depth estimation and beyond. The datasets, competition toolkit, workshop recordings, and source code from the winning teams are publicly available on the challenge website.
comment: Technical Report; 65 pages, 34 figures, 24 tables; Code at https://github.com/ldkong1205/RoboDepth
Active Shadowing (ASD): Manipulating Visual Perception of Robotics Behaviors via Implicit Communication
Explicit communication is often valued for its directness during interaction. Implicit communication, on the other hand, is indirect in that its communicative content must be inferred. Implicit communication is considered more desirable in teaming situations that require reduced interruptions for improved fluency. In this paper, we investigate another unique advantage of implicit communication: its ability to manipulate the perception of an object or behavior of interest. When communication causes the perception of an object or behavior to deviate from other information (about the object or behavior) available via observation, it introduces a discrepancy between perception and observation. We show that such a discrepancy in visual perception can benefit human-robot interaction in a controlled manner and introduce an approach referred to as active shadowing (ASD). Through user studies, we demonstrate the effectiveness of active shadowing in creating a misaligned perception of the robot's behavior and its execution in the real world, resulting in more efficient task completion without sacrificing its understandability. We also analyze conditions under which such visual manipulation is effective.
Knowledge-based Neural Ordinary Differential Equations for Cosserat Rod-based Soft Robots
Soft robots have many advantages over rigid robots thanks to their compliant and passive nature. However, it is generally challenging to model the dynamics of soft robots due to their high spatial dimensionality, making it difficult to use model-based methods to accurately control soft robots. It often requires direct numerical simulation of partial differential equations to simulate soft robots. This not only requires an accurate numerical model, but also makes soft robot modeling slow and expensive. Deep learning algorithms have shown promise in data-driven modeling of soft robots. However, these algorithms usually require a large amount of data, which is difficult to obtain in either simulation or real-world experiments of soft robots. In this work, we propose KNODE-Cosserat, a framework that combines first-principle physics models and neural ordinary differential equations. We leverage the best from both worlds -- the generalization ability of physics-based models and the fast speed of deep learning methods. We validate our framework in both simulation and real-world experiments. In both cases, we show that the robot model significantly improves over the baseline models under different metrics.
comment: 8 pages, 11 figures, 4 tables
Can I Pet Your Robot? Incorporating Capacitive Touch Sensing into a Soft Socially Assistive Robot Platform
This work presents a method of incorporating low-cost capacitive tactile sensors on a soft socially assistive robot platform. By embedding conductive thread into the robot's crocheted exterior, we formed a set of low-cost, flexible capacitive tactile sensors that do not disrupt the robot's soft, zoomorphic embodiment. We evaluated the sensors' performance through a user study (N=20) and found that the sensors reliably detected user touch events and localized touch inputs to one of three regions on the robot's exterior.
comment: Accepted as a Work-In-Progress submission at the 2024 IEEE Haptics Symposium
Learning Object Compliance via Young's Modulus from Single Grasps with Camera-Based Tactile Sensors
Compliance is a useful parametrization of tactile information that humans often utilize in manipulation tasks. It can be used to inform low-level contact-rich actions or characterize objects at a high level. In robotic manipulation, existing approaches to estimate compliance have struggled to generalize across object shape and material. Using camera-based tactile sensors, we present a novel approach to parametrize compliance through Young's modulus E. We evaluate our method over a novel dataset of 285 common objects, including a wide array of shapes and materials with Young's moduli ranging from 5.0 kPa to 250 GPa. Data is collected over automated parallel grasps of each object. Combining analytical and data-driven approaches, we develop a hybrid system using a multi-tower neural network to analyze a sequence of tactile images from grasping. This system is shown to estimate the Young's modulus of unseen objects within an order of magnitude at 74.2% accuracy across our dataset. This is a drastic improvement over a purely analytical baseline, which exhibits only 28.9% accuracy. Importantly, this estimation system performs irrespective of object geometry and demonstrates robustness across object materials. Thus, it could be applied in a general robotic manipulation setting to characterize unknown objects and inform decision-making, for instance to sort produce by ripeness.
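One plausible reading of "within an order of magnitude" is that the predicted and true moduli differ by at most a factor of 10, i.e. |log10(E_pred / E_true)| <= 1. The abstract does not state the exact criterion, so the sketch below is an assumption, the function names are hypothetical, and the numbers are made-up illustrative values rather than results from the paper.

```python
import math


def within_order_of_magnitude(predicted, actual):
    """True if two positive moduli (same unit) differ by at most a factor of 10."""
    return abs(math.log10(predicted / actual)) <= 1.0


def order_of_magnitude_accuracy(predictions, targets):
    """Fraction of predictions that land within a factor of 10 of their target."""
    hits = sum(within_order_of_magnitude(p, t) for p, t in zip(predictions, targets))
    return hits / len(targets)


# Illustrative values in pascals (not data from the paper).
targets = [5.0e3, 1.0e6, 2.5e11]
predictions = [2.0e4, 5.0e7, 1.0e11]
print(order_of_magnitude_accuracy(predictions, targets))
```

A log-scale criterion of this kind is a natural fit when the targets span many decades, as they do between 5.0 kPa and 250 GPa.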
Systems and Control (CS)
Age of Gossip in Networks with Multiple Views of a Source
We consider the version age of information (AoI) in a network where a subset of nodes act as sensing nodes, sampling a source that in general can follow a continuous distribution. Any sample of the source constitutes a new version of the information and the version age of the information is defined with respect to the most recent version of the information available for the whole network. We derive a recursive expression for the average version AoI between different subsets of the nodes which can be used to evaluate the average version AoI for any subset of the nodes including any single node. We derive the asymptotic behavior of the average AoI on any single node of the network for various topologies including line, ring, and fully connected networks. The prior result on the version age of a network by Yates [ISIT'21] can be interpreted in our derivation as a network with a single view of the source, e.g., through a Poisson process with rate $\lambda_{00}$. Our result indicates that there is no loss in the average version AoI performance by replacing a single view of the source with distributed sensing across multiple nodes by splitting the same rate $\lambda_{00}$. Particularly, we show that asymptotically, the average AoI scales with $O(\log(n))$ and $O(\sqrt{n})$ for fully connected and ring networks, respectively. More interestingly, we show that for the ring network the same $O(\sqrt{n})$ asymptotic performance on average AoI is still achieved with distributed sensing if the number of sensing nodes only scales with $O(\sqrt{n})$, instead of the previously known result, which requires $O(n)$. Our results indicate that the sensing nodes can be arbitrarily chosen as long as the maximum number of consecutive non-sensing nodes also scales as $O(\sqrt{n})$.
A Critical Review of Safe Reinforcement Learning Techniques in Smart Grid Applications
The high penetration of distributed energy resources (DERs) in modern smart power systems introduces unforeseen uncertainties for the electricity sector, leading to increased complexity and difficulty in the operation and control of power systems. As a cutting-edge machine learning technology, deep reinforcement learning (DRL) has been widely implemented in recent years to handle the uncertainty in power systems. However, in critical infrastructures such as power systems, safety issues always receive top priority, while DRL may not always meet the safety requirements of power system operators. The concept of safe reinforcement learning (safe RL) is emerging as a potential solution to overcome the shortcomings of conventional DRL in the operation and control of power systems. This study provides a rigorous review of the latest research efforts focused on safe RL to derive power system control policies while accounting for the unique safety requirements of power grids. Furthermore, this study highlights various safe RL algorithms applied in diverse applications within the power system sector, from single grid-connected power converters, residential smart homes, and buildings to large power distribution networks. For all methods outlined, a discussion on their bottlenecks, research challenges, and potential opportunities in the operation and control of power system applications is also presented. This review aims to support research in the area of safe RL algorithms, embracing smart power system operation with safety constraints amid high uncertainty from DERs.
comment: 16 pages, 7 figures, 9 tables
TE-PINN: Quaternion-Based Orientation Estimation using Transformer-Enhanced Physics-Informed Neural Networks
This paper introduces a Transformer-Enhanced Physics-Informed Neural Network (TE-PINN) designed for accurate quaternion-based orientation estimation in high-dynamic environments, particularly within the field of robotics. By integrating transformer networks with physics-informed learning, our approach innovatively captures temporal dependencies in sensor data while enforcing the fundamental physical laws governing rotational motion. TE-PINN leverages a multi-head attention mechanism to handle sequential data from inertial sensors, such as accelerometers and gyroscopes, ensuring temporal consistency. Simultaneously, the model embeds quaternion kinematics and rigid body dynamics into the learning process, aligning the network's predictions with mechanical principles like Euler's laws of motion. The physics-informed loss function incorporates the dynamics of angular velocity and external forces, enhancing the network's ability to generalize in complex scenarios. Our experimental evaluation demonstrates that TE-PINN consistently outperforms traditional methods such as Extended Kalman Filters (EKF) and LSTM-based estimators, particularly in scenarios characterized by high angular velocities and noisy sensor data. The results show a significant reduction in mean quaternion error and improved gyroscope bias estimation compared to the state-of-the-art. An ablation study further isolates the contributions of both the transformer architecture and the physics-informed constraints, highlighting the synergistic effect of both components in improving model performance. The proposed model achieves real-time performance on embedded systems typical of mobile robots, offering a scalable and efficient solution for orientation estimation in autonomous systems.
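The quaternion kinematics mentioned in this abstract are conventionally written as q_dot = 0.5 * q ⊗ (0, ω), with ω the body-frame angular rate measured by the gyroscope. The sketch below is not the TE-PINN model: it only integrates that equation for a noise-free, piecewise-constant rate, using the closed-form rotation increment per step. The function names and the (w, x, y, z) ordering are assumptions made for illustration.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def integrate_gyro(q, omega, dt):
    """Advance orientation q by a body-frame rate omega (rad/s) held constant for dt seconds."""
    rate = math.sqrt(sum(w * w for w in omega))
    if rate == 0.0:
        return q
    half_angle = 0.5 * rate * dt
    s = math.sin(half_angle) / rate
    # Unit quaternion for a rotation of rate * dt about the axis omega / rate.
    delta = (math.cos(half_angle), omega[0] * s, omega[1] * s, omega[2] * s)
    return quat_multiply(q, delta)


# Quarter turn about the z axis: pi/2 rad/s applied for 1 s in 100 steps.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = integrate_gyro(q, (0.0, 0.0, math.pi / 2.0), 0.01)
print(q)
```

Because each increment is itself a unit quaternion, the orientation stays on the unit sphere up to rounding, which is the constraint a physics-informed loss on quaternion outputs would typically need to respect.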
System-Level Performance Metrics Sensitivity of an Electrified Heavy-Duty Mobile Manipulator
The shift to electric and hybrid powertrains in vehicular systems has propelled advancements in mobile robotics and autonomous vehicles. This paper examines the sensitivity of key performance metrics in an electrified heavy-duty mobile manipulator (HDMM) driven by electromechanical linear actuators (EMLAs) powered by permanent magnet synchronous motors (PMSMs). The study evaluates power delivery, force dynamics, energy consumption, and overall efficiency of the actuation mechanisms. By computing partial derivatives (PD) with respect to the payload mass at the tool center point (TCP), it provides insights into these factors under various loading conditions. This research aids in the appropriate choice or design of EMLAs for HDMM electrification, addressing the challenge of selecting actuation mechanisms for vehicular systems with mounted manipulators, and determines the necessary battery capacity requirements.
comment: This work is submitted to IEEE VTC 2024
Mean Age of Information in Partial Offloading Mobile Edge Computing Networks
Age of information (AoI) performance analysis is essential for evaluating information freshness in large-scale mobile edge computing (MEC) networks. This work presents the first analysis of the mean AoI (MAoI) performance of large-scale partial offloading MEC networks. Firstly, we derive and validate the closed-form expressions of MAoI by using queueing theory and stochastic geometry. Based on these expressions, we analyse the effects of computing offloading ratio (COR) and task generation rate (TGR) on the MAoI performance and compare the MAoI performance under the local computing, remote computing, and partial offloading schemes. The results show that by jointly optimising the COR and TGR, the partial offloading scheme outperforms the local and remote computing schemes in terms of the MAoI, which can be improved by up to 51% and 61%, respectively. This encourages the MEC networks to adopt the partial offloading scheme to improve the MAoI performance.
Wind lulls and slews; consequences for the stability of future UK electricity systems
As the United Kingdom wind fleet increases in size, wind lulls and slews will increasingly challenge the stability of its electricity system. The paper describes the use of models based on real time records and including solar slews, to investigate the most extreme wind variations likely to be encountered in future, enabling strategies to be devised to mitigate them. Wind lulls are surprisingly frequent, occasionally lasting a week or more, and are always likely to be beyond the capabilities of stored or imported electrical energy to mitigate them. The models indicate that there will be a continuing need for gas powered generation to mitigate wind lulls. Currently, Combined Cycle Gas Turbines (CCGTs) provide most of the dispatchable generation. However, CCGTs are not sufficiently fast acting to cope with the wind and solar slews anticipated in future. The paper suggests that a range of already proven fast-acting sources of dispatchable generation, including Open Cycle Gas Turbines (OCGTs), Internal Combustion Gas-Fired Reciprocating engines (ICGRs) and stored electrical energy systems, should be capable of coping with the largest wind and solar slews likely to be encountered up to the year 2035. Examples are given of the recent introduction of these fast-acting sources of generation which, it is suggested, will progressively replace CCGTs as the wind and solar fleets increase in size. Moreover, we see the pattern of recent investments, summarised in the paper, as a good indication of likely future investments, with OCGT investments mainly serving the 440 kV grid, and ICGRs and stored electrical energy more local networks.
comment: 13 pages, 8 figures, 3 tables
Assessing strategies to manage distributed photovoltaics in Swiss low-voltage networks: An analysis of curtailment, export tariffs, and resource sharing
The integration of photovoltaic systems poses several challenges for the distribution grid, mainly due to the infrastructure not being designed to handle the upstream flow and being dimensioned for consumption only, potentially leading to reliability and stability issues. This study investigates the use of capacity-based tariffs, export tariffs, and curtailment policies to reduce negative grid impacts without hampering PV deployment. We analyze the effect of such export tariffs on three typical Swiss low-voltage networks (rural, semi-urban, and urban), using power flow analysis to evaluate the power exchanges at the transformer station, as well as line overloading and voltage violations. Finally, a simple case of mutualization of resources is analyzed to assess its potential contribution to relieving network constraints and the economic costs of managing LV networks. We found that the tariff with capacity-based components on the export (CT export daily) severely penalizes PV penetration. This applies to other tariffs as well (e.g. IRR monthly, Curtailment 30, and DT variable) but to a lesser extent. However, the inclusion of curtailment at 50% and 70%, as well as mixed tariffs with capacity-based components at import and curtailment, allows for a high degree of PV installations in the three zones studied and helps to mitigate the impact of PV on the distribution network.
comment: Preprint version. 25 pages, 6 figures
Whole-body end-effector pose tracking
Combining manipulation with the mobility of legged robots is essential for a wide range of robotic applications. However, integrating an arm with a mobile base significantly increases the system's complexity, making precise end-effector control challenging. Existing model-based approaches are often constrained by their modeling assumptions, leading to limited robustness. Meanwhile, recent Reinforcement Learning (RL) implementations restrict the arm's workspace to be in front of the robot or track only the position to obtain decent tracking accuracy. In this work, we address these limitations by introducing a whole-body RL formulation for end-effector pose tracking in a large workspace on rough, unstructured terrains. Our proposed method involves a terrain-aware sampling strategy for the robot's initial configuration and end-effector pose commands, as well as a game-based curriculum to extend the robot's operating range. We validate our approach on the ANYmal quadrupedal robot with a six DoF robotic arm. Through our experiments, we show that the learned controller achieves precise command tracking over a large workspace and adapts across varying terrains such as stairs and slopes. On deployment, it achieves a pose-tracking error of 2.64 cm and 3.64 degrees, outperforming existing competitive baselines.
Safe Output Feedback Improvement with Baselines
In data-driven control design, an important problem is to deal with uncertainty due to limited and noisy data. One way to do this is to use a min-max approach, which aims to minimize some design criteria for the worst-case scenario. However, a strategy based on this approach can lead to overly conservative controllers. To overcome this issue, we apply the idea of baseline regret, and it is seen that minimizing the baseline regret under model uncertainty can guarantee safe controller improvement with less conservatism and variance in the resulting controllers. To exemplify the use of baseline controllers, we focus on the output feedback setting and propose a two-step control design method; first, an uncertainty set is constructed by a data-driven system identification approach based on finite impulse response models; then a control design criterion based on model reference control is used. To solve the baseline regret optimization problem efficiently, we use a convex approximation of the criterion and apply the scenario approach in optimization. The numerical examples show that the inclusion of baseline regret indeed improves the performance and reduces the variance of the resulting controller.
comment: Accepted by The 63rd IEEE Conference on Decision and Control
Robust Neural IDA-PBC: passivity-based stabilization under approximations
In this paper, we restructure the Neural Interconnection and Damping Assignment - Passivity Based Control (Neural IDA-PBC) design methodology, and we formally analyze its closed-loop properties. Neural IDA-PBC redefines the IDA-PBC design approach as an optimization problem by building on the framework of Physics Informed Neural Networks (PINNs). However, the closed-loop stability and robustness properties under Neural IDA-PBC remain unexplored. To address the issue, we study the behavior of classical IDA-PBC under approximations. Our theoretical analysis allows deriving conditions for practical and asymptotic stability of the desired equilibrium point. Moreover, it extends the Neural IDA-PBC applicability to port-Hamiltonian systems where the matching conditions cannot be solved exactly. Our renewed optimization-based design introduces three significant aspects: i) it involves a novel optimization objective including stability and robustness constraints issued from our theoretical analysis; ii) it employs separate Neural Networks (NNs), which can be structured to reduce the search space to relevant functions; iii) it does not require knowledge about the port-Hamiltonian formulation of the system's model. Our methodology is validated with simulations on three standard benchmarks: a double pendulum, a nonlinear mass-spring-damper and a cartpole. Notably, classical IDA-PBC designs cannot be analytically derived for the latter.
comment: Preprint
Identification For Control Based on Neural Networks: Approximately Linearizable Models
This work presents a control-oriented identification scheme for efficient control design and stability analysis of nonlinear systems. Neural networks are used to identify a discrete-time nonlinear state-space model to approximate time-domain input-output behavior of a nonlinear system. The network is constructed such that the identified model is approximately linearizable by feedback, ensuring that the control law trivially follows from the learning stage. After the identification and quasi-linearization procedures, linear control theory comes at hand to design robust controllers and study stability of the closed-loop system. The effectiveness and interest of the methodology are illustrated throughout the paper on popular benchmarks for system identification.
comment: 15 pages, 3 figures, 6 tables, accepted as a poster in SysDO 2024, Stuttgart, Germany
Diffusion Models for Intelligent Transportation Systems: A Survey
Intelligent Transportation Systems (ITS) are vital in modern traffic management and optimization, significantly enhancing traffic efficiency and safety. Recently, diffusion models have emerged as transformative tools for addressing complex challenges within ITS. In this paper, we present a comprehensive survey of diffusion models for ITS, covering both theoretical and practical aspects. First, we introduce the theoretical foundations of diffusion models and their key variants, including conditional diffusion models and latent diffusion models, highlighting their suitability for modeling complex, multi-modal traffic data and enabling controllable generation. Second, we outline the primary challenges in ITS and the corresponding advantages of diffusion models, providing readers with a deeper understanding of the intersection between ITS and diffusion models. Third, we offer a multi-perspective investigation of current applications of diffusion models in ITS domains, including autonomous driving, traffic simulation, trajectory prediction, and traffic safety. Finally, we discuss state-of-the-art diffusion model techniques and highlight key ITS research directions that warrant further investigation. Through this structured overview, we aim to provide researchers with a comprehensive understanding of diffusion models for ITS, thereby advancing their future applications in the transportation domain.
comment: 7 figures
A Multi-Level Approach for Class Imbalance Problem in Federated Learning for Remote Industry 4.0 Applications
Deep neural network (DNN) models are effective solutions for Industry 4.0 applications (e.g., oil spill detection, fire detection, anomaly detection). However, training a DNN model needs a considerable amount of data collected from various sources and transferred to the central cloud server, which can be expensive and sensitive to privacy. For instance, in a remote offshore oil field where network connectivity is vulnerable, a federated fog environment can be a potential computing platform. Hence it is feasible to perform computation within the federation. However, performing DNN model training using fog systems poses a security issue that the federated learning (FL) technique can resolve. In this case, the new challenge is the class imbalance problem that can be inherited in local data sets and can degrade the performance of the global model. Therefore, FL training needs to be performed considering the class imbalance problem locally. In addition, an efficient technique to select the relevant worker model needs to be adopted at the global level to increase the robustness of the global model. Accordingly, we utilize one of the suitable loss functions addressing the class imbalance in workers at the local level. In addition, we employ a dynamic threshold mechanism with user-defined worker's weight to efficiently select workers for aggregation that improve the global model's robustness. Finally, we perform an extensive empirical evaluation to explore the benefits of our solution and find up to a 3-5% performance improvement over baseline federated learning methods.
Regional stability conditions for recurrent neural network-based control systems
In this paper, we propose novel global and regional stability analysis conditions based on linear matrix inequalities for a general class of recurrent neural networks. These conditions can also be used for state-feedback control design, and a suitable optimization problem enforcing H2 norm minimization properties is defined. The theoretical results are corroborated by numerical simulations, showing the advantages and limitations of the methods presented herein.
Reinforcement Learning for Infinite-Dimensional Systems
Interest in reinforcement learning (RL) for massive-scale systems consisting of large populations of intelligent agents interacting with heterogeneous environments has witnessed a significant surge in recent years across diverse scientific domains. However, due to the large-scale nature of the system, the majority of state-of-the-art RL techniques either encounter high computational cost or exhibit compromised performance. To mitigate these challenges, we propose a novel RL architecture along with the derivation of effective algorithms to learn optimal policies for any arbitrarily large system of agents. Specifically, we model such a system as a parameterized control system defined on an infinite-dimensional function space. We then develop a moment kernel transform to map the parameterized system and the value function of an RL problem into a reproducing kernel Hilbert space. This transformation subsequently generates a finite-dimensional moment representation for this RL problem. Leveraging this representation, we develop a hierarchical algorithm for learning optimal policies for the infinite-dimensional parameterized system. We further enhance efficiency of the algorithm by exploiting early stopping at each hierarchy, by which we show the fast convergence property of the algorithm through constructing a convergent spectral sequence. The performance and efficiency of the proposed algorithm are validated using practical examples.
Optimization of partially isolated quantum harmonic oscillator memory systems by mean square decoherence time criteria
This paper is concerned with open quantum harmonic oscillators with position-momentum system variables, whose internal dynamics and interaction with the environment are governed by linear quantum stochastic differential equations. A recently proposed approach to such systems as Heisenberg picture quantum memories exploits their ability to approximately retain initial conditions over a decoherence horizon. Using the quantum memory decoherence time defined previously in terms of a fidelity threshold on a weighted mean-square deviation of the system variables from their initial values, we apply this approach to a partially isolated subsystem of the oscillator, which is not directly affected by the external fields. The partial isolation leads to an appropriate system decomposition and a qualitatively different short-horizon asymptotic behaviour of the deviation, which yields a longer decoherence time in the high-fidelity limit. The resulting approximate decoherence time maximization over the energy parameters for improving the quantum memory performance is discussed for a coherent feedback interconnection of such systems.
comment: 9 pages, 3 figures, submitted to ANZCC 2025
Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC ICRA
This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior works that combine high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by integrating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we train an RL agent to solve the pose reaching task in simulation, then incorporate the trained neural network critic as both the stage and terminal cost of an MPC. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real wheel loader, where we successfully navigate to various goal poses. In contrast, the RL actor risked damaging the machine and was unsuitable for real-world use.
comment: Submitted to International Conference on Robotics and Automation (ICRA) 2025
Autotuning Bipedal Locomotion MPC with GRFM-Net for Efficient Sim-to-Real Transfer
Bipedal locomotion control is essential for humanoid robots to navigate complex, human-centric environments. While optimization-based control designs are popular for integrating sophisticated models of humanoid robots, they often require labor-intensive manual tuning. In this work, we address the challenges of parameter selection in bipedal locomotion control using DiffTune, a model-based autotuning method that leverages differential programming for efficient parameter learning. A major difficulty lies in balancing model fidelity with differentiability. We address this difficulty using a low-fidelity model for differentiability, enhanced by a Ground Reaction Force-and-Moment Network (GRFM-Net) to capture discrepancies between MPC commands and actual control effects. We validate the parameters learned by DiffTune with GRFM-Net in hardware experiments, which demonstrates the parameters' optimality in a multi-objective setting compared with baseline parameters, reducing the total loss by up to 40.5% compared with the expert-tuned parameters. The results confirm the GRFM-Net's effectiveness in mitigating the sim-to-real gap, improving the transferability of simulation-learned parameters to real hardware.
Open-/Closed-loop Active Learning for Data-driven Predictive Control
An important question in data-driven control is how to obtain an informative dataset. In this work, we consider the problem of effective data acquisition of an unknown linear system with bounded disturbance for both open-loop and closed-loop stages. The learning objective is to minimize the volume of the set of admissible systems. First, a performance measure based on historical data and the input sequence is introduced to characterize the upper bound of the volume of the set of admissible systems. On the basis of this performance measure, an open-loop active learning strategy is proposed to minimize the volume by actively designing inputs during the open-loop stage. For the closed-loop stage, a closed-loop active learning strategy is designed to select and learn from informative closed-loop data. The efficiency of the proposed closed-loop active learning strategy is proved by showing that the unselected data cannot benefit the learning performance. Furthermore, an adaptive predictive controller is designed in accordance with the proposed data acquisition approach. The recursive feasibility and the stability of the controller are proved by analyzing the effect of the closed-loop active learning strategy. Finally, numerical examples and comparisons illustrate the effectiveness of the proposed data acquisition strategy.
Agent-state based policies in POMDPs: Beyond belief-state MDPs
The traditional approach to POMDPs is to convert them into fully observed MDPs by considering a belief state as an information state. However, a belief-state based approach requires perfect knowledge of the system dynamics and is therefore not applicable in the learning setting where the system model is unknown. Various approaches to circumvent this limitation have been proposed in the literature. We present a unified treatment of some of these approaches by viewing them as models where the agent maintains a local recursively updateable agent state and chooses actions based on the agent state. We highlight the different classes of agent-state based policies and the various approaches that have been proposed in the literature to find good policies within each class. These include the designer's approach to find optimal non-stationary agent-state based policies, policy search approaches to find locally optimal stationary agent-state based policies, and the approximate information state to find approximately optimal stationary agent-state based policies. We then present how ideas from the approximate information state approach have been used to improve Q-learning and actor-critic algorithms for learning in POMDPs.
Learning Linear Dynamics from Bilinear Observations
We consider the problem of learning a realization of a partially observed dynamical system with linear state transitions and bilinear observations. Under very mild assumptions on the process and measurement noises, we provide a finite time analysis for learning the unknown dynamics matrices (up to a similarity transform). Our analysis involves a regression problem with heavy-tailed and dependent data. Moreover, each row of our design matrix contains a Kronecker product of current input with a history of inputs, making it difficult to guarantee persistence of excitation. We overcome these challenges, first providing a data-dependent high probability error bound for arbitrary but fixed inputs. Then, we derive a data-independent error bound for inputs chosen according to a simple random design. Our main results provide an upper bound on the statistical error rates and sample complexity of learning the unknown dynamics matrices from a single finite trajectory of bilinear observations.
comment: 35 pages, 3 figures
Interaction Techniques for User-friendly Interfaces for Gate-based Quantum Computing
Quantum computers offer promising approaches to various fields. To use current noisy quantum computers, developers need to examine the compilation of a logical circuit, the status of available hardware, and noise in results. As those tasks are less common in classical computing, quantum developers may not be familiar with performing them. Therefore, easier and more intuitive interfaces are necessary to make quantum computers more approachable. While existing notebook-based toolkits like Qiskit offer application programming interfaces and visualization techniques, it is still difficult to navigate the vast space of quantum program design and hardware status. Inspired by human-computer interaction (HCI) work in data science and visualization, our work introduces four user interaction techniques that can augment existing notebook-based toolkits for gate-based quantum computing: (1) a circuit writer that lets users provide high-level information about a circuit and generates a code snippet to build it; (2) a machine explorer that provides detailed properties and configurations of the hardware, along with code to load selected information; (3) a circuit viewer that allows for comparing the logical circuit, compiled circuit, and hardware configurations; and (4) a visualization for adjusting measurement outcomes with hardware error rates.
comment: A poster accepted to IEEE QCE 2024
MBC: Multi-Brain Collaborative Control for Quadruped Robots
In quadruped robot locomotion tasks, the Blind Policy and the Perceptive Policy each have their own advantages and limitations. The Blind Policy relies on preset sensor information and algorithms, suitable for known and structured environments, but it lacks adaptability in complex or unknown environments. The Perceptive Policy uses visual sensors to obtain detailed environmental information, allowing it to adapt to complex terrains, but its effectiveness is limited under occluded conditions, especially when perception fails. Unlike the Blind Policy, the Perceptive Policy is not as robust under these conditions. To address these challenges, we propose MBC, a Multi-Brain Collaborative system that incorporates the concepts of Multi-Agent Reinforcement Learning and introduces collaboration between the Blind Policy and the Perceptive Policy. By applying this multi-policy collaborative model to a quadruped robot, the robot can maintain stable locomotion even when the perceptual system is impaired or observational data is incomplete. Our simulations and real-world experiments demonstrate that this system significantly improves the robot's passability and robustness against perception failures in complex environments, validating the effectiveness of multi-policy collaboration in enhancing robotic motion performance.
comment: 18 pages, 9 figures, Website and Videos: https://quad-mbc.github.io/
Active Perception with Initial-State Uncertainty: A Policy Gradient Method
This paper studies the synthesis of an active perception policy that maximizes the information leakage of the initial state in a stochastic system modeled as a hidden Markov model (HMM). Specifically, the emission function of the HMM is controllable with a set of perception or sensor query actions. Given the goal is to infer the initial state from partial observations in the HMM, we use Shannon conditional entropy as the planning objective and develop a novel policy gradient method with convergence guarantees. By leveraging a variant of observable operators in HMMs, we prove several important properties of the gradient of the conditional entropy with respect to the policy parameters, which allow efficient computation of the policy gradient and stable and fast convergence. We demonstrate the effectiveness of our solution by applying it to an inference problem in a stochastic grid world environment.
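As a rough illustration of the planning objective named in the abstract above, the sketch below computes the Shannon conditional entropy H(X | Y) from a joint distribution. It is not the paper's policy gradient method: the function name `conditional_entropy`, the dictionary representation, and the two toy distributions are assumptions made for this example only.

```python
import math

def conditional_entropy(joint):
    """Return H(X | Y) in bits for a joint distribution {(x, y): probability}."""
    # Marginal p(y), obtained by summing the joint over x.
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # H(X | Y) = -sum_{x,y} p(x, y) * log2(p(x, y) / p(y)); zero-probability terms contribute 0.
    h = 0.0
    for (_, y), p in joint.items():
        if p > 0.0:
            h -= p * math.log2(p / p_y[y])
    return h

# Hypothetical two-state examples: x is the initial state, y is the observation.
uninformative = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
informative = {(0, 0): 0.5, (1, 1): 0.5}

print(conditional_entropy(uninformative))  # observation independent of the state
print(conditional_entropy(informative))    # observation reveals the state
```

A lower value means the observations leave less uncertainty about the initial state, which is the sense in which a perception policy that reduces this quantity makes the initial state easier to infer.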
Willems' Fundamental Lemma for Nonlinear Systems with Koopman Linear Embedding
Koopman operator theory and Willems' fundamental lemma both can provide (approximated) data-driven linear representation for nonlinear systems. However, choosing lifting functions for the Koopman operator is challenging, and the quality of the data-driven model from Willems' fundamental lemma has no guarantee for general nonlinear systems. In this paper, we extend Willems' fundamental lemma for a class of nonlinear systems that admit a Koopman linear embedding. We first characterize the relationship between the trajectory space of a nonlinear system and that of its Koopman linear embedding. We then prove that the trajectory space of Koopman linear embedding can be formed by a linear combination of rich-enough trajectories from the nonlinear system. Combining these two results leads to a data-driven representation of the nonlinear system, which bypasses the need for the lifting functions and thus eliminates the associated bias errors. Our results illustrate that both the width (more trajectories) and depth (longer trajectories) of the trajectory library are important to ensure the accuracy of the data-driven model.
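For readers unfamiliar with the data structure behind Willems' fundamental lemma, the following sketch builds a Hankel matrix from a single recorded trajectory, which is the standard building block in the linear setting. It does not implement the paper's extension to Koopman linear embeddings; the helper name `hankel` and the scalar example signal are hypothetical.

```python
def hankel(signal, depth):
    """Stack every length-`depth` window of `signal` as a column of a Hankel matrix."""
    if not 1 <= depth <= len(signal):
        raise ValueError("depth must be between 1 and len(signal)")
    width = len(signal) - depth + 1
    # Entry (i, j) equals signal[i + j], so column j is the window starting at time j.
    return [[signal[i + j] for j in range(width)] for i in range(depth)]

# Hypothetical scalar trajectory of length 5, windows of length 3.
H = hankel([1, 2, 3, 4, 5], depth=3)
for row in H:
    print(row)
```

In this picture, `depth` corresponds to the length of the trajectories being represented and the number of columns to how many windows are available, loosely mirroring the depth and width of the trajectory library discussed in the abstract.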
Transformer based time series prediction of the maximum power point for solar photovoltaic cells
This paper proposes an improved deep learning based maximum power point tracking (MPPT) in solar photovoltaic cells considering various time series based environmental inputs. Generally, artificial neural network based MPPT algorithms use basic neural network architectures and inputs which do not represent the ambient conditions in a comprehensive manner. In this article, the ambient conditions of a location are represented through a comprehensive set of environmental features. Furthermore, the inclusion of time based features in the input data is considered to model cyclic patterns temporally within the atmospheric conditions leading to robust modeling of the MPPT algorithm. A transformer based deep learning architecture is trained as a time series prediction model using multidimensional time series input features. The model is trained on a dataset containing typical meteorological year data points of ambient weather conditions from 50 locations. The attention mechanism in the transformer modules allows the model to learn temporal patterns in the data efficiently. The proposed model achieves a 0.47% mean average percentage error of prediction on non zero operating voltage points in a test dataset consisting of data collected over a period of 200 consecutive hours resulting in the average power efficiency of 99.54% and peak power efficiency of 99.98%. The proposed model is validated through real time simulations. The proposed model performs power point tracking in a robust, dynamic, and nonlatent manner, over a wide range of atmospheric conditions.
comment: Published June 2022, in Energy Science and Engineering, Volume 10, Issue 9, Pages 3397-3410
Data-Driven System Identification of Quadrotors Subject to Motor Delays IROS 2024
Recently, non-linear control methods like Model Predictive Control (MPC) and Reinforcement Learning (RL) have attracted increased interest in the quadrotor control community. In contrast to classic control methods like cascaded PID controllers, MPC and RL heavily rely on an accurate model of the system dynamics. The process of quadrotor system identification is notoriously tedious and is often pursued with additional equipment like a thrust stand. Furthermore, low-level details like motor delays, which are crucial for accurate end-to-end control, are often neglected. In this work, we introduce a data-driven method to identify a quadrotor's inertia parameters, thrust curves, torque coefficients, and first-order motor delay purely based on proprioceptive data. The estimation of the motor delay is particularly challenging because the RPMs usually cannot be measured. We derive a Maximum A Posteriori (MAP)-based method to estimate the latent time constant. Our approach only requires about a minute of flying data that can be collected without any additional equipment and usually consists of three simple maneuvers. Experimental results demonstrate the ability of our method to accurately recover the parameters of multiple quadrotors. It also facilitates the deployment of RL-based, end-to-end quadrotor control of a large quadrotor under harsh, outdoor conditions.
comment: Accepted at IROS 2024
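To make the notion of a first-order motor delay concrete, here is a minimal simulation sketch. It assumes the common lag model in which the motor speed relaxes toward the commanded speed with a single time constant; the abstract does not spell out the model, so the function name `simulate_first_order_delay`, the parameter values, and the exact discretization are illustrative assumptions, and the MAP estimator itself is not reproduced.

```python
import math

def simulate_first_order_delay(commands, time_constant, dt, initial=0.0):
    """Simulate d(rpm)/dt = (command - rpm) / time_constant on a fixed time step."""
    # Exact discretization for a command held constant over each step.
    alpha = 1.0 - math.exp(-dt / time_constant)
    state = initial
    response = []
    for command in commands:
        state += alpha * (command - state)
        response.append(state)
    return response

# Hypothetical unit step command: 50 ms time constant, 1 ms sampling.
step_response = simulate_first_order_delay([1.0] * 200, time_constant=0.05, dt=0.001)
# After one time constant (50 steps) the response has covered 1 - 1/e of the step.
print(step_response[49])
```

Because the speed itself is usually not measured, an estimator has to infer `time_constant` indirectly from its effect on other signals, which is what makes it a latent parameter in the abstract's terms.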
A Fairness-Oriented Reinforcement Learning Approach for the Operation and Control of Shared Micromobility Services
As Machine Learning grows in popularity across various fields, equity has become a key focus for the AI community. However, fairness-oriented approaches are still underexplored in smart mobility. Addressing this gap, our study investigates the balance between performance optimization and algorithmic fairness in shared micromobility services, providing a novel framework based on Reinforcement Learning. Exploiting Q-Learning, the proposed methodology achieves equitable outcomes in terms of the Gini index across different areas characterized by their distance from central hubs. Through vehicle rebalancing, the provided scheme maximizes operator performance while ensuring fairness principles for users, reducing inequity by up to 80% while only increasing costs by 30% (w.r.t. applying no equity adjustment). A case study with synthetic data validates our insights and highlights the importance of fairness in urban micromobility.
comment: 6 pages, 2 figures, jointly submitted to IEEE L-CSS and ACC 2025
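The abstract above measures fairness with the Gini index. As a minimal sketch, assuming the textbook definition over non-negative per-area values (the paper's exact formulation and data are not given here), the index can be computed as follows; `gini_index` and the sample service levels are hypothetical.

```python
def gini_index(values):
    """Gini index of non-negative values: 0 means perfectly equal, (n - 1) / n is the maximum."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum_i(i * x_i) / (n * sum_i(x_i)) - (n + 1) / n, with x sorted ascending and i = 1..n.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Hypothetical service levels in four areas.
print(gini_index([5, 5, 5, 5]))   # evenly served areas
print(gini_index([0, 0, 0, 10]))  # all service concentrated in one area
```

A rebalancing policy that trades some operator cost for a lower value of this index is, in this sense, spreading service more evenly across areas.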
USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions
Autonomous underwater vehicles (AUVs) are valuable for ocean exploration due to their flexibility and ability to carry communication and detection units. Nevertheless, AUVs alone often face challenges in harsh and extreme sea conditions. This study introduces an unmanned surface vehicle (USV)-AUV collaboration framework, which includes high-precision multi-AUV positioning using USV path planning via Fisher information matrix optimization and reinforcement learning for multi-AUV cooperative tasks. Applied to a multi-AUV underwater data collection task scenario, extensive simulations validate the framework's feasibility and superior performance, highlighting exceptional coordination and robustness under extreme sea conditions. To accelerate relevant research in this field, we have made the simulation code available as open-source.
First Field Trial of LLM-Powered AI Agent for Lifecycle Management of Autonomous Driving Optical Networks
We design and demonstrate the first field trial of LLM-powered AI Agent for ADON. Three operation modes of the Agent are proposed for network lifecycle management. The Agent efficiently processes wavelength add/drop and soft/hard failures, and achieves comparable performance to human-designed algorithms for power optimization.
comment: Version submitted to ECOC PDP 2024 on September 6th
Will Large Language Models be a Panacea to Autonomous Driving?
Artificial intelligence (AI) plays a crucial role in autonomous driving (AD) research, propelling its development towards intelligence and efficiency. Currently, the development of AD technology follows two main technical paths: modularization and end-to-end. Modularization decomposes the driving task into modules such as perception, prediction, planning, and control, and trains them separately. Due to the inconsistency of training objectives between modules, the integrated effect suffers from bias. End-to-end attempts to address this issue by utilizing a single model that directly maps from sensor data to control signals. This path has limited learning capabilities in a comprehensive set of features and struggles to handle unpredictable long-tail events and complex urban traffic scenarios. In the face of challenges encountered in both paths, many researchers believe that large language models (LLMs) with powerful reasoning capabilities and extensive knowledge understanding may be the solution, expecting LLMs to provide AD systems with deeper levels of understanding and decision-making capabilities. To understand if LLMs could enhance AD, this paper conducts a thorough analysis of the potential applications of LLMs in AD systems, including exploring their optimization strategies in both modular and end-to-end approaches, with a particular focus on how LLMs can tackle the problems and challenges present in current solutions. Furthermore, we discuss an important question: Can LLM-based artificial general intelligence (AGI) be a key to achieving high-level AD? We further analyze the potential limitations and challenges that LLMs may encounter in promoting the development of AD technology.
Systems and Control (EESS)
Age of Gossip in Networks with Multiple Views of a Source
We consider the version age of information (AoI) in a network where a subset of nodes act as sensing nodes, sampling a source that in general can follow a continuous distribution. Any sample of the source constitutes a new version of the information, and the version age of the information is defined with respect to the most recent version of the information available for the whole network. We derive a recursive expression for the average version AoI between different subsets of the nodes, which can be used to evaluate the average version AoI for any subset of the nodes including any single node. We derive the asymptotic behavior of the average AoI on any single node of the network for various topologies including line, ring, and fully connected networks. The prior art result on version age of a network by Yates [ISIT'21] can be interpreted in our derivation as a network with a single view of the source, e.g., through a Poisson process with rate $\lambda_{00}$. Our result indicates that there is no loss in the average version AoI performance by replacing a single view of the source with distributed sensing across multiple nodes by splitting the same rate $\lambda_{00}$. Particularly, we show that asymptotically, the average AoI scales with $O(\log(n))$ and $O(\sqrt{n})$ for fully connected and ring networks, respectively. More interestingly, we show that for the ring network the same $O(\sqrt{n})$ asymptotic performance on average AoI is still achieved with distributed sensing if the number of sensing nodes only scales with $O(\sqrt{n})$, instead of the previously known result, which requires $O(n)$. Our results indicate that the sensing nodes can be arbitrarily chosen as long as the maximum number of consecutive non-sensing nodes also scales as $O(\sqrt{n})$.
A Critical Review of Safe Reinforcement Learning Techniques in Smart Grid Applications
The high penetration of distributed energy resources (DERs) in modern smart power systems introduces unforeseen uncertainties for the electricity sector, leading to increased complexity and difficulty in the operation and control of power systems. As a cutting-edge machine learning technology, deep reinforcement learning (DRL) has been widely implemented in recent years to handle the uncertainty in power systems. However, in critical infrastructures such as power systems, safety issues always receive top priority, while DRL may not always meet the safety requirements of power system operators. The concept of safe reinforcement learning (safe RL) is emerging as a potential solution to overcome the shortcomings of conventional DRL in the operation and control of power systems. This study provides a rigorous review of the latest research efforts focused on safe RL to derive power system control policies while accounting for the unique safety requirements of power grids. Furthermore, this study highlights various safe RL algorithms applied in diverse applications within the power system sector, from single grid-connected power converters, residential smart homes, and buildings to large power distribution networks. For all methods outlined, a discussion on their bottlenecks, research challenges, and potential opportunities in the operation and control of power system applications is also presented. This review aims to support research in the area of safe RL algorithms, embracing smart power system operation with safety constraints amid high uncertainty from DERs.
comment: 16 pages, 7 figures, 9 tables
TE-PINN: Quaternion-Based Orientation Estimation using Transformer-Enhanced Physics-Informed Neural Networks
This paper introduces a Transformer-Enhanced Physics-Informed Neural Network (TE-PINN) designed for accurate quaternion-based orientation estimation in high-dynamic environments, particularly within the field of robotics. By integrating transformer networks with physics-informed learning, our approach innovatively captures temporal dependencies in sensor data while enforcing the fundamental physical laws governing rotational motion. TE-PINN leverages a multi-head attention mechanism to handle sequential data from inertial sensors, such as accelerometers and gyroscopes, ensuring temporal consistency. Simultaneously, the model embeds quaternion kinematics and rigid body dynamics into the learning process, aligning the network's predictions with mechanical principles like Euler's laws of motion. The physics-informed loss function incorporates the dynamics of angular velocity and external forces, enhancing the network's ability to generalize in complex scenarios. Our experimental evaluation demonstrates that TE-PINN consistently outperforms traditional methods such as Extended Kalman Filters (EKF) and LSTM-based estimators, particularly in scenarios characterized by high angular velocities and noisy sensor data. The results show a significant reduction in mean quaternion error and improved gyroscope bias estimation compared to the state-of-the-art. An ablation study further isolates the contributions of both the transformer architecture and the physics-informed constraints, highlighting the synergistic effect of both components in improving model performance. The proposed model achieves real-time performance on embedded systems typical of mobile robots, offering a scalable and efficient solution for orientation estimation in autonomous systems.
System-Level Performance Metrics Sensitivity of an Electrified Heavy-Duty Mobile Manipulator
The shift to electric and hybrid powertrains in vehicular systems has propelled advancements in mobile robotics and autonomous vehicles. This paper examines the sensitivity of key performance metrics in an electrified heavy-duty mobile manipulator (HDMM) driven by electromechanical linear actuators (EMLAs) powered by permanent magnet synchronous motors (PMSMs). The study evaluates power delivery, force dynamics, energy consumption, and overall efficiency of the actuation mechanisms. By computing partial derivatives (PD) with respect to the payload mass at the tool center point (TCP), it provides insights into these factors under various loading conditions. This research aids in the appropriate choice or design of EMLAs for HDMM electrification, addresses the challenge of selecting actuation mechanisms for vehicular systems with a mounted manipulator, and determines the necessary battery capacity.
comment: This work is submitted to IEEE VTC 2024
Mean Age of Information in Partial Offloading Mobile Edge Computing Networks
The age of information (AoI) performance analysis is essential for evaluating the information freshness in large-scale mobile edge computing (MEC) networks. This work presents the first analysis of the mean AoI (MAoI) performance of large-scale partial offloading MEC networks. Firstly, we derive and validate the closed-form expressions of MAoI by using queueing theory and stochastic geometry. Based on these expressions, we analyse the effects of computing offloading ratio (COR) and task generation rate (TGR) on the MAoI performance and compare the MAoI performance under the local computing, remote computing, and partial offloading schemes. The results show that by jointly optimising the COR and TGR, the partial offloading scheme outperforms the local and remote computing schemes in terms of the MAoI, which can be improved by up to 51% and 61%, respectively. This encourages the MEC networks to adopt the partial offloading scheme to improve the MAoI performance.
Wind lulls and slews; consequences for the stability of future UK electricity systems
As the United Kingdom wind fleet increases in size, wind lulls and slews will increasingly challenge the stability of its electricity system. The paper describes the use of models based on real time records and including solar slews, to investigate the most extreme wind variations likely to be encountered in future, enabling strategies to be devised to mitigate them. Wind lulls are surprisingly frequent, occasionally lasting a week or more, and are always likely to be beyond the capabilities of stored or imported electrical energy to mitigate them. The models indicate that there will be a continuing need for gas powered generation to mitigate wind lulls. Currently, Combined Cycle Gas Turbines (CCGTs) provide most of the dispatchable generation. However, CCGTs are not sufficiently fast acting to cope with the wind and solar slews anticipated in future. The paper suggests that a range of already proven fast-acting sources of dispatchable generation, including Open Cycle Gas Turbines (OCGTs), Internal Combustion Gas-Fired Reciprocating engines (ICGRs) and stored electrical energy systems, should be capable of coping with the largest wind and solar slews likely to be encountered up to the year 2035. Examples are given of the recent introduction of these fast-acting sources of generation which, it is suggested, will progressively replace CCGTs as the wind and solar fleets increase in size. Moreover, we see the pattern of recent investments, summarised in the paper, as a good indication of likely future investments, with OCGT investments mainly serving the 440 kV grid, and ICGRs and stored electrical energy more local networks.
comment: 13 pages, 8 figures, 3 tables
Assessing strategies to manage distributed photovoltaics in Swiss low-voltage networks: An analysis of curtailment, export tariffs, and resource sharing
The integration of photovoltaic systems poses several challenges for the distribution grid, mainly due to the infrastructure not being designed to handle the upstream flow and being dimensioned for consumption only, potentially leading to reliability and stability issues. This study investigates the use of capacity-based tariffs, export tariffs, and curtailment policies to reduce negative grid impacts without hampering PV deployment. We analyze the effect of such export tariffs on three typical Swiss low-voltage networks (rural, semi-urban, and urban), using power flow analysis to evaluate the power exchanges at the transformer station, as well as line overloading and voltage violations. Finally, a simple case of mutualization of resources is analyzed to assess its potential contribution to relieving network constraints and the economic costs of managing LV networks. We found that the tariff with capacity-based components on the export (CT export daily) severely penalizes PV penetration. This applies to other tariffs as well (e.g. IRR monthly, Curtailment 30, and DT variable) but to a lesser extent. However, the inclusion of curtailment at 50% and 70%, as well as mixed tariffs with capacity-based components at import and curtailment, allows for a high degree of PV installations in the three zones studied and helps to mitigate the impact of PV on the distribution network.
comment: Preprint version. 25 pages, 6 figures
Whole-body end-effector pose tracking
Combining manipulation with the mobility of legged robots is essential for a wide range of robotic applications. However, integrating an arm with a mobile base significantly increases the system's complexity, making precise end-effector control challenging. Existing model-based approaches are often constrained by their modeling assumptions, leading to limited robustness. Meanwhile, recent Reinforcement Learning (RL) implementations restrict the arm's workspace to be in front of the robot or track only the position to obtain decent tracking accuracy. In this work, we address these limitations by introducing a whole-body RL formulation for end-effector pose tracking in a large workspace on rough, unstructured terrains. Our proposed method involves a terrain-aware sampling strategy for the robot's initial configuration and end-effector pose commands, as well as a game-based curriculum to extend the robot's operating range. We validate our approach on the ANYmal quadrupedal robot with a six DoF robotic arm. Through our experiments, we show that the learned controller achieves precise command tracking over a large workspace and adapts across varying terrains such as stairs and slopes. On deployment, it achieves a pose-tracking error of 2.64 cm and 3.64 degrees, outperforming existing competitive baselines.
Safe Output Feedback Improvement with Baselines
In data-driven control design, an important problem is to deal with uncertainty due to limited and noisy data. One way to do this is to use a min-max approach, which aims to minimize some design criteria for the worst-case scenario. However, a strategy based on this approach can lead to overly conservative controllers. To overcome this issue, we apply the idea of baseline regret, and it is seen that minimizing the baseline regret under model uncertainty can guarantee safe controller improvement with less conservatism and variance in the resulting controllers. To exemplify the use of baseline controllers, we focus on the output feedback setting and propose a two-step control design method; first, an uncertainty set is constructed by a data-driven system identification approach based on finite impulse response models; then a control design criterion based on model reference control is used. To solve the baseline regret optimization problem efficiently, we use a convex approximation of the criterion and apply the scenario approach in optimization. The numerical examples show that the inclusion of baseline regret indeed improves the performance and reduces the variance of the resulting controller.
comment: Accepted by The 63rd IEEE Conference on Decision and Control
Robust Neural IDA-PBC: passivity-based stabilization under approximations
In this paper, we restructure the Neural Interconnection and Damping Assignment - Passivity Based Control (Neural IDA-PBC) design methodology, and we formally analyze its closed-loop properties. Neural IDA-PBC redefines the IDA-PBC design approach as an optimization problem by building on the framework of Physics Informed Neural Networks (PINNs). However, the closed-loop stability and robustness properties under Neural IDA-PBC remain unexplored. To address the issue, we study the behavior of classical IDA-PBC under approximations. Our theoretical analysis allows deriving conditions for practical and asymptotic stability of the desired equilibrium point. Moreover, it extends the Neural IDA-PBC applicability to port-Hamiltonian systems where the matching conditions cannot be solved exactly. Our renewed optimization-based design introduces three significant aspects: i) it involves a novel optimization objective including stability and robustness constraints issued from our theoretical analysis; ii) it employs separate Neural Networks (NNs), which can be structured to reduce the search space to relevant functions; iii) it does not require knowledge about the port-Hamiltonian formulation of the system's model. Our methodology is validated with simulations on three standard benchmarks: a double pendulum, a nonlinear mass-spring-damper and a cartpole. Notably, classical IDA-PBC designs cannot be analytically derived for the latter.
comment: Preprint
Identification For Control Based on Neural Networks: Approximately Linearizable Models
This work presents a control-oriented identification scheme for efficient control design and stability analysis of nonlinear systems. Neural networks are used to identify a discrete-time nonlinear state-space model to approximate time-domain input-output behavior of a nonlinear system. The network is constructed such that the identified model is approximately linearizable by feedback, ensuring that the control law trivially follows from the learning stage. After the identification and quasi-linearization procedures, linear control theory can be used to design robust controllers and study stability of the closed-loop system. The effectiveness and interest of the methodology are illustrated throughout the paper on popular benchmarks for system identification.
comment: 15 pages, 3 figures, 6 tables, accepted as a poster in SysDO 2024, Stuttgart, Germany
Diffusion Models for Intelligent Transportation Systems: A Survey
Intelligent Transportation Systems (ITS) are vital in modern traffic management and optimization, significantly enhancing traffic efficiency and safety. Recently, diffusion models have emerged as transformative tools for addressing complex challenges within ITS. In this paper, we present a comprehensive survey of diffusion models for ITS, covering both theoretical and practical aspects. First, we introduce the theoretical foundations of diffusion models and their key variants, including conditional diffusion models and latent diffusion models, highlighting their suitability for modeling complex, multi-modal traffic data and enabling controllable generation. Second, we outline the primary challenges in ITS and the corresponding advantages of diffusion models, providing readers with a deeper understanding of the intersection between ITS and diffusion models. Third, we offer a multi-perspective investigation of current applications of diffusion models in ITS domains, including autonomous driving, traffic simulation, trajectory prediction, and traffic safety. Finally, we discuss state-of-the-art diffusion model techniques and highlight key ITS research directions that warrant further investigation. Through this structured overview, we aim to provide researchers with a comprehensive understanding of diffusion models for ITS, thereby advancing their future applications in the transportation domain.
comment: 7 figures
A Multi-Level Approach for Class Imbalance Problem in Federated Learning for Remote Industry 4.0 Applications
Deep neural network (DNN) models are effective solutions for Industry 4.0 applications (e.g., oil spill detection, fire detection, anomaly detection). However, training a DNN model needs a considerable amount of data collected from various sources and transferred to a central cloud server, which can be expensive and sensitive to privacy. For instance, in a remote offshore oil field where network connectivity is vulnerable, a federated fog environment can be a potential computing platform. Hence it is feasible to perform computation within the federation. On the other hand, training a DNN model using fog systems poses a security issue that the federated learning (FL) technique can resolve. In this case, the new challenge is the class imbalance problem that can be inherent in local data sets and can degrade the performance of the global model. Therefore, FL training needs to be performed considering the class imbalance problem locally. In addition, an efficient technique to select the relevant worker model needs to be adopted at the global level to increase the robustness of the global model. Accordingly, we utilize one of the suitable loss functions addressing the class imbalance in workers at the local level. In addition, we employ a dynamic threshold mechanism with user-defined worker's weight to efficiently select workers for aggregation that improve the global model's robustness. Finally, we perform an extensive empirical evaluation to explore the benefits of our solution and find a 3-5% performance improvement over baseline federated learning methods.
Regional stability conditions for recurrent neural network-based control systems
In this paper, we propose novel global and regional stability analysis conditions based on linear matrix inequalities for a general class of recurrent neural networks. These conditions can also be used for state-feedback control design, and a suitable optimization problem enforcing H2 norm minimization properties is defined. The theoretical results are corroborated by numerical simulations, showing the advantages and limitations of the methods presented herein.
Reinforcement Learning for Infinite-Dimensional Systems
Interest in reinforcement learning (RL) for massive-scale systems consisting of large populations of intelligent agents interacting with heterogeneous environments has witnessed a significant surge in recent years across diverse scientific domains. However, due to the large-scale nature of the system, the majority of state-of-the-art RL techniques either encounter high computational cost or exhibit compromised performance. To mitigate these challenges, we propose a novel RL architecture along with the derivation of effective algorithms to learn optimal policies for any arbitrarily large system of agents. Specifically, we model such a system as a parameterized control system defined on an infinite-dimensional function space. We then develop a moment kernel transform to map the parameterized system and the value function of an RL problem into a reproducing kernel Hilbert space. This transformation subsequently generates a finite-dimensional moment representation for this RL problem. Leveraging this representation, we develop a hierarchical algorithm for learning optimal policies for the infinite-dimensional parameterized system. We further enhance efficiency of the algorithm by exploiting early stopping at each hierarchy, by which we show the fast convergence property of the algorithm through constructing a convergent spectral sequence. The performance and efficiency of the proposed algorithm are validated using practical examples.
Optimization of partially isolated quantum harmonic oscillator memory systems by mean square decoherence time criteria
This paper is concerned with open quantum harmonic oscillators with position-momentum system variables, whose internal dynamics and interaction with the environment are governed by linear quantum stochastic differential equations. A recently proposed approach to such systems as Heisenberg picture quantum memories exploits their ability to approximately retain initial conditions over a decoherence horizon. Using the quantum memory decoherence time defined previously in terms of a fidelity threshold on a weighted mean-square deviation of the system variables from their initial values, we apply this approach to a partially isolated subsystem of the oscillator, which is not directly affected by the external fields. The partial isolation leads to an appropriate system decomposition and a qualitatively different short-horizon asymptotic behaviour of the deviation, which yields a longer decoherence time in the high-fidelity limit. The resulting approximate decoherence time maximization over the energy parameters for improving the quantum memory performance is discussed for a coherent feedback interconnection of such systems.
comment: 9 pages, 3 figures, submitted to ANZCC 2025
Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC ICRA
This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior works that combine high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by integrating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we train an RL agent to solve the pose reaching task in simulation, then incorporate the trained neural network critic as both the stage and terminal cost of an MPC. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real wheel loader, where we successfully navigate to various goal poses. In contrast, the RL actor risked damaging the machine and was unsuitable for real-world use.
comment: Submitted to International Conference on Robotics and Automation (ICRA) 2025
Autotuning Bipedal Locomotion MPC with GRFM-Net for Efficient Sim-to-Real Transfer
Bipedal locomotion control is essential for humanoid robots to navigate complex, human-centric environments. While optimization-based control designs are popular for integrating sophisticated models of humanoid robots, they often require labor-intensive manual tuning. In this work, we address the challenges of parameter selection in bipedal locomotion control using DiffTune, a model-based autotuning method that leverages differential programming for efficient parameter learning. A major difficulty lies in balancing model fidelity with differentiability. We address this difficulty using a low-fidelity model for differentiability, enhanced by a Ground Reaction Force-and-Moment Network (GRFM-Net) to capture discrepancies between MPC commands and actual control effects. We validate the parameters learned by DiffTune with GRFM-Net in hardware experiments, which demonstrate the parameters' optimality in a multi-objective setting compared with baseline parameters, reducing the total loss by up to 40.5% compared with the expert-tuned parameters. The results confirm the GRFM-Net's effectiveness in mitigating the sim-to-real gap, improving the transferability of simulation-learned parameters to real hardware.
Open-/Closed-loop Active Learning for Data-driven Predictive Control
An important question in data-driven control is how to obtain an informative dataset. In this work, we consider the problem of effective data acquisition of an unknown linear system with bounded disturbance for both open-loop and closed-loop stages. The learning objective is to minimize the volume of the set of admissible systems. First, a performance measure based on historical data and the input sequence is introduced to characterize the upper bound of the volume of the set of admissible systems. On the basis of this performance measure, an open-loop active learning strategy is proposed to minimize the volume by actively designing inputs during the open-loop stage. For the closed-loop stage, a closed-loop active learning strategy is designed to select and learn from informative closed-loop data. The efficiency of the proposed closed-loop active learning strategy is proved by showing that the unselected data cannot benefit the learning performance. Furthermore, an adaptive predictive controller is designed in accordance with the proposed data acquisition approach. The recursive feasibility and the stability of the controller are proved by analyzing the effect of the closed-loop active learning strategy. Finally, numerical examples and comparisons illustrate the effectiveness of the proposed data acquisition strategy.
Agent-state based policies in POMDPs: Beyond belief-state MDPs
The traditional approach to POMDPs is to convert them into fully observed MDPs by considering a belief state as an information state. However, a belief-state based approach requires perfect knowledge of the system dynamics and is therefore not applicable in the learning setting where the system model is unknown. Various approaches to circumvent this limitation have been proposed in the literature. We present a unified treatment of some of these approaches by viewing them as models where the agent maintains a local recursively updateable agent state and chooses actions based on the agent state. We highlight the different classes of agent-state based policies and the various approaches that have been proposed in the literature to find good policies within each class. These include the designer's approach to find optimal non-stationary agent-state based policies, policy search approaches to find locally optimal stationary agent-state based policies, and the approximate information state approach to find approximately optimal stationary agent-state based policies. We then present how ideas from the approximate information state approach have been used to improve Q-learning and actor-critic algorithms for learning in POMDPs.
Learning Linear Dynamics from Bilinear Observations
We consider the problem of learning a realization of a partially observed dynamical system with linear state transitions and bilinear observations. Under very mild assumptions on the process and measurement noises, we provide a finite time analysis for learning the unknown dynamics matrices (up to a similarity transform). Our analysis involves a regression problem with heavy-tailed and dependent data. Moreover, each row of our design matrix contains a Kronecker product of current input with a history of inputs, making it difficult to guarantee persistence of excitation. We overcome these challenges, first providing a data-dependent high probability error bound for arbitrary but fixed inputs. Then, we derive a data-independent error bound for inputs chosen according to a simple random design. Our main results provide an upper bound on the statistical error rates and sample complexity of learning the unknown dynamics matrices from a single finite trajectory of bilinear observations.
comment: 35 pages, 3 figures
Interaction Techniques for User-friendly Interfaces for Gate-based Quantum Computing
Quantum computers offer promising approaches to various fields. To use current noisy quantum computers, developers need to examine the compilation of a logical circuit, the status of available hardware, and noise in the results. As those tasks are less common in classical computing, quantum developers may not be familiar with performing them. Therefore, easier and more intuitive interfaces are necessary to make quantum computers more approachable. While existing notebook-based toolkits like Qiskit offer application programming interfaces and visualization techniques, it is still difficult to navigate the vast space of quantum program design and hardware status. Inspired by human-computer interaction (HCI) work in data science and visualization, our work introduces four user interaction techniques that can augment existing notebook-based toolkits for gate-based quantum computing: (1) a circuit writer that lets users provide high-level information about a circuit and generates a code snippet to build it; (2) a machine explorer that provides detailed properties and configurations of a hardware device, along with code to load selected information; (3) a circuit viewer that allows for comparing the logical circuit, the compiled circuit, and hardware configurations; and (4) a visualization for adjusting measurement outcomes with hardware error rates.
comment: A poster accepted to IEEE QCE 2024
MBC: Multi-Brain Collaborative Control for Quadruped Robots
In quadruped robot locomotion, the Blind Policy and the Perceptive Policy each have their own advantages and limitations. The Blind Policy relies on preset sensor information and algorithms, suitable for known and structured environments, but it lacks adaptability in complex or unknown environments. The Perceptive Policy uses visual sensors to obtain detailed environmental information, allowing it to adapt to complex terrains, but its effectiveness is limited under occluded conditions, especially when perception fails. Unlike the Blind Policy, the Perceptive Policy is not as robust under these conditions. To address these challenges, we propose MBC, a Multi-Brain Collaborative system that incorporates the concepts of Multi-Agent Reinforcement Learning and introduces collaboration between the Blind Policy and the Perceptive Policy. By applying this multi-policy collaborative model to a quadruped robot, the robot can maintain stable locomotion even when the perceptual system is impaired or observational data is incomplete. Our simulations and real-world experiments demonstrate that this system significantly improves the robot's passability and robustness against perception failures in complex environments, validating the effectiveness of multi-policy collaboration in enhancing robotic motion performance.
comment: 18 pages, 9 figures, Website and Videos: https://quad-mbc.github.io/
Active Perception with Initial-State Uncertainty: A Policy Gradient Method
This paper studies the synthesis of an active perception policy that maximizes the information leakage of the initial state in a stochastic system modeled as a hidden Markov model (HMM). Specifically, the emission function of the HMM is controllable with a set of perception or sensor query actions. Given the goal is to infer the initial state from partial observations in the HMM, we use Shannon conditional entropy as the planning objective and develop a novel policy gradient method with convergence guarantees. By leveraging a variant of observable operators in HMMs, we prove several important properties of the gradient of the conditional entropy with respect to the policy parameters, which allow efficient computation of the policy gradient and stable and fast convergence. We demonstrate the effectiveness of our solution by applying it to an inference problem in a stochastic grid world environment.
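As a small illustration of the planning objective, the sketch below computes the Shannon conditional entropy H(S0 | Y) of an initial state given a single observation. The one-shot observation model and the function name are simplifying assumptions; the paper considers observation sequences generated under a perception policy, which this sketch does not model.

```python
import math

def conditional_entropy(prior, likelihood):
    """H(S0 | Y) in bits for a single observation.

    prior[s] = P(S0 = s); likelihood[s][y] = P(Y = y | S0 = s).
    Both inputs are hypothetical toy distributions.
    """
    num_states = len(prior)
    num_obs = len(likelihood[0])
    entropy = 0.0
    for y in range(num_obs):
        # Marginal probability of observing y.
        p_y = sum(prior[s] * likelihood[s][y] for s in range(num_states))
        for s in range(num_states):
            p_sy = prior[s] * likelihood[s][y]
            if p_sy > 0.0:
                # Accumulate -p(s, y) * log2 p(s | y).
                entropy -= p_sy * math.log2(p_sy / p_y)
    return entropy
```

With a uniform prior over two states, an uninformative sensor leaves one full bit of uncertainty and a perfect sensor leaves none, so a sensing action with lower conditional entropy reveals more about the initial state.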
Willems' Fundamental Lemma for Nonlinear Systems with Koopman Linear Embedding
Koopman operator theory and Willems' fundamental lemma both can provide (approximated) data-driven linear representation for nonlinear systems. However, choosing lifting functions for the Koopman operator is challenging, and the quality of the data-driven model from Willems' fundamental lemma has no guarantee for general nonlinear systems. In this paper, we extend Willems' fundamental lemma for a class of nonlinear systems that admit a Koopman linear embedding. We first characterize the relationship between the trajectory space of a nonlinear system and that of its Koopman linear embedding. We then prove that the trajectory space of Koopman linear embedding can be formed by a linear combination of rich-enough trajectories from the nonlinear system. Combining these two results leads to a data-driven representation of the nonlinear system, which bypasses the need for the lifting functions and thus eliminates the associated bias errors. Our results illustrate that both the width (more trajectories) and depth (longer trajectories) of the trajectory library are important to ensure the accuracy of the data-driven model.
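For context, the classical (linear) form of Willems' fundamental lemma relies on inputs that are persistently exciting, which is checked through the rank of a Hankel matrix built from the data. The sketch below shows that check for a scalar input sequence. It illustrates the linear setting only and is not the paper's Koopman-based extension; the function names and tolerance are assumptions.

```python
def hankel(signal, depth):
    """Hankel matrix with `depth` rows built from a scalar sequence."""
    cols = len(signal) - depth + 1
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]

def rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(r, rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            for j in range(c, cols):
                m[i][j] -= factor * m[r][j]
        r += 1
    return r

def is_persistently_exciting(signal, order):
    """True if the depth-`order` Hankel matrix has full row rank."""
    return rank(hankel(signal, order)) == order
```

A constant input fails the check for any order above one, which matches the intuition that a trajectory library needs enough variety (and enough length) before it can represent the system's behavior.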
Transformer based time series prediction of the maximum power point for solar photovoltaic cells
This paper proposes an improved deep learning based maximum power point tracking (MPPT) in solar photovoltaic cells considering various time series based environmental inputs. Generally, artificial neural network based MPPT algorithms use basic neural network architectures and inputs which do not represent the ambient conditions in a comprehensive manner. In this article, the ambient conditions of a location are represented through a comprehensive set of environmental features. Furthermore, the inclusion of time based features in the input data is considered to model cyclic patterns temporally within the atmospheric conditions leading to robust modeling of the MPPT algorithm. A transformer based deep learning architecture is trained as a time series prediction model using multidimensional time series input features. The model is trained on a dataset containing typical meteorological year data points of ambient weather conditions from 50 locations. The attention mechanism in the transformer modules allows the model to learn temporal patterns in the data efficiently. The proposed model achieves a 0.47% mean average percentage error of prediction on non zero operating voltage points in a test dataset consisting of data collected over a period of 200 consecutive hours resulting in the average power efficiency of 99.54% and peak power efficiency of 99.98%. The proposed model is validated through real time simulations. The proposed model performs power point tracking in a robust, dynamic, and nonlatent manner, over a wide range of atmospheric conditions.
comment: Published June 2022, in Energy Science and Engineering, Volume 10, Issue 9, Pages 3397-3410
Data-Driven System Identification of Quadrotors Subject to Motor Delays IROS 2024
Recently, non-linear control methods like Model Predictive Control (MPC) and Reinforcement Learning (RL) have attracted increased interest in the quadrotor control community. In contrast to classic control methods like cascaded PID controllers, MPC and RL heavily rely on an accurate model of the system dynamics. The process of quadrotor system identification is notoriously tedious and is often pursued with additional equipment like a thrust stand. Furthermore, low-level details like motor delays, which are crucial for accurate end-to-end control, are often neglected. In this work, we introduce a data-driven method to identify a quadrotor's inertia parameters, thrust curves, torque coefficients, and first-order motor delay purely based on proprioceptive data. The estimation of the motor delay is particularly challenging as the RPMs usually cannot be measured. We derive a Maximum A Posteriori (MAP)-based method to estimate the latent time constant. Our approach only requires about a minute of flying data that can be collected without any additional equipment and usually consists of three simple maneuvers. Experimental results demonstrate the ability of our method to accurately recover the parameters of multiple quadrotors. It also facilitates the deployment of RL-based, end-to-end quadrotor control of a large quadrotor under harsh, outdoor conditions.
comment: Accepted at IROS 2024
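A first-order motor delay of the kind named in the abstract can be sketched as a low-pass response of the motor speed to its setpoint, governed by a single time constant. The discretization below assumes the setpoint is held constant over each step; the function name, units, and parameter values are hypothetical and not taken from the paper.

```python
import math

def simulate_motor(setpoints, time_constant, dt, rpm0=0.0):
    """Simulate tau * d(rpm)/dt = setpoint - rpm with a zero-order-hold setpoint.

    setpoints: commanded RPM at each step (hypothetical data).
    time_constant: motor time constant tau in seconds.
    dt: step length in seconds.
    """
    # Exact per-step blend factor for a first-order lag.
    alpha = 1.0 - math.exp(-dt / time_constant)
    rpm = rpm0
    history = []
    for command in setpoints:
        rpm += alpha * (command - rpm)
        history.append(rpm)
    return history
```

After one time constant, the response to a step command reaches about 63% of the commanded value, which is why an unmodeled delay of this kind matters for end-to-end control.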
A Fairness-Oriented Reinforcement Learning Approach for the Operation and Control of Shared Micromobility Services
As Machine Learning grows in popularity across various fields, equity has become a key focus for the AI community. However, fairness-oriented approaches are still underexplored in smart mobility. Addressing this gap, our study investigates the balance between performance optimization and algorithmic fairness in shared micromobility services, providing a novel framework based on Reinforcement Learning. Exploiting Q-Learning, the proposed methodology achieves equitable outcomes in terms of the Gini index across different areas characterized by their distance from central hubs. Through vehicle rebalancing, the provided scheme maximizes operator performance while ensuring fairness principles for users, reducing inequity by up to 80% while only increasing costs by 30% (w.r.t. applying no equity adjustment). A case study with synthetic data validates our insights and highlights the importance of fairness in urban micromobility.
comment: 6 pages, 2 figures, jointly submitted to IEEE L-CSS and ACC 2025
USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions
Autonomous underwater vehicles (AUVs) are valuable for ocean exploration due to their flexibility and ability to carry communication and detection units. Nevertheless, AUVs alone often face challenges in harsh and extreme sea conditions. This study introduces an unmanned surface vehicle (USV)-AUV collaboration framework, which includes high-precision multi-AUV positioning using USV path planning via Fisher information matrix optimization and reinforcement learning for multi-AUV cooperative tasks. Applied to a multi-AUV underwater data collection task scenario, extensive simulations validate the framework's feasibility and superior performance, highlighting exceptional coordination and robustness under extreme sea conditions. To accelerate relevant research in this field, we have made the simulation code available as open-source.
First Field Trial of LLM-Powered AI Agent for Lifecycle Management of Autonomous Driving Optical Networks
We design and demonstrate the first field trial of an LLM-powered AI Agent for autonomous driving optical networks (ADON). Three operation modes of the Agent are proposed for network lifecycle management. The Agent efficiently processes wavelength add/drop and soft/hard failures, and achieves comparable performance to human-designed algorithms for power optimization.
comment: Version submitted to ECOC PDP 2024 on September 6th
Will Large Language Models be a Panacea to Autonomous Driving?
Artificial intelligence (AI) plays a crucial role in autonomous driving (AD) research, propelling its development towards intelligence and efficiency. Currently, the development of AD technology follows two main technical paths: modularization and end-to-end. Modularization decomposes the driving task into modules such as perception, prediction, planning, and control, and trains them separately. Due to the inconsistency of training objectives between modules, the integrated effect suffers from bias. End-to-end attempts to address this issue by utilizing a single model that directly maps from sensor data to control signals. This path has limited capability to learn a comprehensive set of features and struggles to handle unpredictable long-tail events and complex urban traffic scenarios. In the face of challenges encountered in both paths, many researchers believe that large language models (LLMs) with powerful reasoning capabilities and extensive knowledge understanding may be the solution, expecting LLMs to provide AD systems with deeper levels of understanding and decision-making capabilities. To understand if LLMs could enhance AD, this paper conducts a thorough analysis of the potential applications of LLMs in AD systems, including exploring their optimization strategies in both modular and end-to-end approaches, with a particular focus on how LLMs can tackle the problems and challenges present in current solutions. Furthermore, we discuss an important question: Can LLM-based artificial general intelligence (AGI) be a key to achieving high-level AD? We further analyze the potential limitations and challenges that LLMs may encounter in promoting the development of AD technology.